Systematically sweep one or more reference books to find recipes, ingredients, or compounds not yet in the Flavors vault, then write quality-gated notes for each new find. Trigger phrases: scrape all books, scan books for new recipes, find new ingredients in books, sweep library for missing content, what recipes aren't in the vault.
Systematic extraction of new content from the reference library into the Flavors vault. Covers the full sweep cycle: find → dedup → resolve → quality-gate → write → log.
Use this when the goal is coverage — finding things in the books that are missing from the vault.
For looking up a specific subject, use book-lookup instead.
cd /home/mango/workspace/Obsidian/Flavors
# 1. Verify book_search is live
python3.11 -c "from book_search.store import BookStore; s=BookStore.load(); print([b for b in s.books])"
# Expected: ['cocktail_codex', 'death_co', 'liquid_intelligence', 'nose_dive', 'smugglers_cove']
# 2. Check ledger — confirm what is NOT yet done
# Read: Flavors/_ingestion/book-extraction-ledger.md
# Only target rows marked NOT STARTED or PARTIAL. Never re-run COMPLETE rows.
Read Flavors/_ingestion/book-extraction-ledger.md before choosing a target. This is the anti-duplication gate at the session level.
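The ledger check can also be done mechanically. The sketch below assumes, hypothetically, that the ledger is a Markdown table with a `Status` column; the real column names and layout are not shown in this document, and `open_ledger_rows` is an illustrative helper, not part of the repository.

```python
from typing import Dict, List

# Only these statuses are valid sweep targets; COMPLETE rows are never re-run.
OPEN_STATUSES = {"NOT STARTED", "PARTIAL"}


def open_ledger_rows(ledger_text: str) -> List[Dict[str, str]]:
    """Return table rows whose Status cell is NOT STARTED or PARTIAL.

    Assumption: a Markdown table whose header row includes a 'Status'
    column. The real ledger layout may differ.
    """
    header: List[str] = []
    rows: List[Dict[str, str]] = []
    for line in ledger_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            header = []  # a non-table line ends the current table
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if not header:
            header = cells  # first table line is the header row
            continue
        if all(set(c) <= set("-: ") for c in cells):
            continue  # separator row such as |---|---|
        row = dict(zip(header, cells))
        if row.get("Status", "").upper() in OPEN_STATUSES:
            rows.append(row)
    return rows


sample = """\
| Book | Type | Status |
|---|---|---|
| Cocktail Codex | recipe | NOT STARTED |
| Death & Co | recipe | COMPLETE |
| Smugglers Cove | ingredient | PARTIAL |
"""
for row in open_ledger_rows(sample):
    print(row["Book"], row["Type"], row["Status"])
```

Reading the file with an MCP tool and passing its text to a helper like this keeps the target choice auditable, but reading the ledger by eye is equally valid.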
A book has a structured JSON recipe extract (json_recipes in books.yaml) and the ledger shows it as NOT STARTED or PARTIAL for type recipe.
Currently applies to: Cocktail Codex (54 recipes in JSON, not yet in vault).
The script pipeline produces skeleton notes, but raw ingredient strings from the book must be mapped to canonical vault names. This is the quality gate that separates script output from agent-quality output.
# See which CC ingredients are unresolved:
python3.11 -c "
import json
from flavor_graph.generate_recipe_notes import MANUAL_RESOLUTION, EXACT_RESOLUTION

with open('Reference/Documents/scraped_recipes_cocktail_codex.json') as fh:
    cc = json.load(fh)

# Collect every distinct raw ingredient string across all recipes
all_ings = {ing['raw_name'] for r in cc for ing in r['ingredients']}

# Lowercase the known keys once so the comparison is case-insensitive
known = {k.lower() for k in {**MANUAL_RESOLUTION, **EXACT_RESOLUTION}}

unresolved = [n for n in sorted(all_ings) if n.lower() not in known]
print(f'Unresolved: {len(unresolved)}')
for u in unresolved:
    # repr() instead of an f-string !r conversion avoids shell history expansion on '!'
    print('  ' + repr(u))
"
For each unresolved ingredient, determine the canonical vault note name:
- Run obsidian_list_directory on the Flavors/Ingredients/ subfolders to check for existing notes.
- "(this page)" suffixes: these are house sub-recipes defined elsewhere in the book; map to the closest generic: "Cinnamon Syrup (this page)" → "Cinnamon syrup", "Demerara Gum Syrup (this page)" → "Demerara syrup".
- Brand-name spirits: map to the generic category, e.g. "Elijah Craig Small Batch bourbon" → "Bourbon" (or keep the brand if that specific product has a vault note).
- Entries beginning with a bare count such as "1 ": these are pure garnishes; strip the quantity and capitalize the first word: "1 lemon twist" → "Lemon twist".

Add all mappings to MANUAL_RESOLUTION in flavor_graph/generate_recipe_notes.py before running generation. This is a code edit, not a runtime step.
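The two purely mechanical cases (the "(this page)" suffix and the leading count on garnishes) could be pre-processed before hand-editing MANUAL_RESOLUTION. The following is a minimal sketch, not part of `flavor_graph`: `suggest_canonical` is a hypothetical helper, and its output is only a starting suggestion that still needs checking against the vault.

```python
import re

THIS_PAGE_SUFFIX = re.compile(r"\s*\(this page\)\s*$", re.IGNORECASE)
LEADING_COUNT = re.compile(r"^\d+\s+")


def suggest_canonical(raw_name: str) -> str:
    """Suggest a candidate MANUAL_RESOLUTION value for a raw ingredient string.

    Handles only the mechanical cases: a trailing '(this page)' and a
    leading bare count. Brand-to-generic mapping, and judgment calls such
    as 'Demerara Gum Syrup' -> 'Demerara syrup', are left to a reviewer.
    """
    name = THIS_PAGE_SUFFIX.sub("", raw_name.strip())
    name = LEADING_COUNT.sub("", name)
    # Sentence case to match the examples in this document. Note that this
    # also lowercases proper nouns, so brand names need manual handling.
    return name[:1].upper() + name[1:].lower()


for raw in ["Cinnamon Syrup (this page)", "1 lemon twist"]:
    print(repr(raw), "->", repr(suggest_canonical(raw)))
```

Suggestions produced this way would be pasted into MANUAL_RESOLUTION only after confirming that the target note name exists in the vault.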
# Normalize (if not already done)
python3.11 flavor_graph/scraper_normalize.py --source cocktail_codex
# Dry-run: preview what will be created vs. skipped (collision = already in vault)
python3.11 flavor_graph/generate_recipe_notes.py --source cocktail_codex --dry-run
# Generate
python3.11 flavor_graph/generate_recipe_notes.py --source cocktail_codex
The script handles dedup automatically: existing vault filenames get a (CC) suffix rather than being overwritten.
The generated notes have two unfilled fields:
- `## Flavor description:` contains the placeholder `<Cocktail flavor description for embedding — to be filled by LLM.>`
- `## Flavor profile` contains `[[Flavor placeholder]]`

These require an agent, not the script. Spawn a simple-executor subagent per batch of 15 recipes with this instruction:
You are filling flavor descriptions for cocktail recipe notes in the Flavors vault.
Vault root: /home/mango/workspace/Obsidian
Flavors root: /home/mango/workspace/Obsidian/Flavors/
YOUR BATCH: [paste list of recipe note paths here]
For each recipe note:
1. Read the note with obsidian_read_note to get the ingredients list and existing Notes section.
2. Check the Notes section — if it contains the book's own tasting notes (Cocktail Codex
notes field), use that as primary source material.
3. Use book_search MCP to get the book's perspective on the recipe and its key ingredients:
- book_recipe_search(query="<recipe name>", limit=3) — find the structured recipe record
- book_search(query="<recipe name> flavor taste", limit=4) — find any page prose about it
- book_get_page(book_id, page, context_pages=1) for any strong hits
4. For each key ingredient in the recipe, optionally call:
- book_search(query="<ingredient> flavor taste aroma", limit=3) to understand its contribution
5. Write a 2-3 sentence flavor description covering: dominant character, balance between
components, and finish. Be specific and sensory — not generic. Draw from book language first.
6. Replace [[Flavor placeholder]] with 2-4 real [[Flavor Name]] wikilinks from
Flavors/Flavors/ that match the recipe's character. Check obsidian_list_directory
on Flavors/Flavors/ to confirm the flavor notes exist before linking. Prefer canonical bare Obsidian links like `[[Bitter Flavor]]`; only use folder-qualified targets when title collisions require disambiguation.
7. Change status: wip → status: complete if both fields are filled.
8. Use obsidian_patch_note for surgical edits — do not rewrite the whole note.
9. Validate any newly created ingredient or flavor note with `PYTHONPATH=. python3.11 run_checks.py --note-standards --paths "<note path>"` before finishing the batch.
INVARIANTS:
- Never rename notes or delete files
- Flavor wikilinks must resolve to existing *Flavor.md notes in Flavors/Flavors/
- Be specific: "dry sherry and Cognac backbone with orange curaçao sweetness and bitters structure"
beats "complex and well-balanced"
- Append one log entry to Flavors/_system/llm-activity-log.md covering the whole batch
Batch size: 15 recipes per subagent. Run batches in parallel.
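Splitting the generated notes into batches of 15 is simple list slicing. A minimal sketch follows; the note paths are made-up placeholders, and how each batch is handed to a subagent depends on the orchestration tooling, which is not shown here.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 15  # batch size stated in this workflow


def batches(paths: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` note paths."""
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


# Placeholder paths standing in for the 54 generated Cocktail Codex notes
note_paths = [f"Flavors/Recipes/recipe {i}.md" for i in range(54)]
for n, batch in enumerate(batches(note_paths), start=1):
    print(f"batch {n}: {len(batch)} notes")
```

Each yielded list would be pasted into the YOUR BATCH slot of the prompt above.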
Mark the row COMPLETE in _ingestion/book-extraction-ledger.md and add the chunk_id range.
A book has recipe content in book_cache/ but no structured JSON (json_recipes absent from books.yaml). The sweep must find and parse recipe pages from raw text.
Use book_search MCP with measurement-pattern queries, restricted to the target book:
book_search(query="ounce lime juice rum", book_id="target_book", limit=20)
book_search(query="dash bitters stir strain", book_id="target_book", limit=20)
Filter results to pages that contain recipe-style measurements (e.g. 1½ ounces, ¾ oz, 2 dashes).
For each candidate page, call book_get_page(book_id, page, context_pages=1) to retrieve the full recipe text, including any overflow onto the next page.
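The measurement filter can be approximated with a regular expression. This is a sketch under assumptions: the unit list and the threshold of three hits are illustrative choices, not values taken from the pipeline, and `looks_like_recipe` is a hypothetical helper.

```python
import re

# A quantity (digits with an optional vulgar fraction, or a fraction alone)
# followed by a bar-measure unit. The unit list is an assumption.
MEASURE = re.compile(
    r"(?:\d+\s*[¼½¾]?|[¼½¾])\s*"
    r"(?:ounces?|oz\.?|dash(?:es)?|teaspoons?|tsp\.?|drops?)\b",
    re.IGNORECASE,
)


def looks_like_recipe(page_text: str, min_hits: int = 3) -> bool:
    """True if the page has at least `min_hits` quantity+unit matches."""
    return len(MEASURE.findall(page_text)) >= min_hits


recipe_page = "1½ ounces aged rum\n¾ oz lime juice\n2 dashes Angostura bitters"
history_page = "Rum was shipped from the island in large quantities."
print(looks_like_recipe(recipe_page), looks_like_recipe(history_page))
```

A threshold above one helps reject prose pages that mention a single measure in passing; the right value likely varies by book.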
python3.11 -c "
import os
vault = set(f.lower().replace('.md','') for f in os.listdir('Recipes/'))
# For each candidate recipe name:
name = 'my recipe name'
key = (name + ' cocktail recipe').lower()
print('In vault:', key in vault)
"
Only proceed with candidates where the normalized name is not already in the vault.
For each new recipe page, extract:
- name: first non-blank line of the recipe block
- ingredients: lines matching quantity+unit+name patterns
- garnish: line starting with "Garnish:"
- method: the instruction paragraph after the ingredient list

Pass to generate_recipe_notes.create_recipe_note() directly:
from flavor_graph.generate_recipe_notes import create_recipe_note
from flavor_graph.scraper_normalize import parse_ingredient_string
from pathlib import Path
recipe = {
    "name": "Recipe Name",
    "source": "Book Title",
    "page": 45,
    "specs": {},
    "method": "Stir and strain...",
    "notes": "Tasting notes or description from book.",
    "ingredients": [parse_ingredient_string(s) for s in ["1½ oz Rum", "¾ oz Lime juice"]],
}
create_recipe_note(recipe, Path("Recipes/"))
Then run the flavor description fill pass (same as Workflow A, Step 3).
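The field extraction described in this workflow (name, ingredients, garnish, method) could be sketched as below. This is an illustration only: `split_recipe_block` is a hypothetical helper, the unit list is an assumption, and real pages will need per-book adjustments. Its `ingredients` output is a list of raw strings of the kind passed to `parse_ingredient_string` above.

```python
import re
from typing import Dict, List

# An ingredient line starts with a quantity and a unit (assumed unit list).
INGREDIENT_LINE = re.compile(
    r"^\s*(?:\d+\s*[¼½¾]?|[¼½¾])\s*"
    r"(?:ounces?|oz\.?|dash(?:es)?|teaspoons?|tsp\.?|drops?)\b",
    re.IGNORECASE,
)


def split_recipe_block(text: str) -> Dict[str, object]:
    """Split raw recipe text into name, ingredients, garnish and method."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    name = lines[0] if lines else ""  # first non-blank line
    ingredients: List[str] = []
    garnish = ""
    method_lines: List[str] = []
    for ln in lines[1:]:
        if ln.lower().startswith("garnish:"):
            garnish = ln.split(":", 1)[1].strip()
        elif INGREDIENT_LINE.match(ln) and not method_lines:
            ingredients.append(ln)
        elif ingredients:
            # Anything after the ingredient list is treated as method text
            method_lines.append(ln)
    return {
        "name": name,
        "ingredients": ingredients,
        "garnish": garnish,
        "method": " ".join(method_lines),
    }


block = """Daiquiri
2 ounces white rum
1 ounce lime juice
¾ ounce simple syrup
Garnish: 1 lime wedge
Shake all the ingredients with ice, then strain into a chilled coupe.
"""
print(split_recipe_block(block))
```

Lines that appear before the first ingredient (for example a headnote) are dropped by this sketch; they may belong in the recipe's `notes` field instead.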
Scanning for ingredient-specific flavor prose in non-recipe chapters of a book. The goal is to enrich existing thin ingredient notes, or discover ingredients not yet in the vault at all.
Use book_search_topic to find which pages contain ingredient-relevant prose for the target:
book_search_topic(topic="rum", limit=15) # → for Smugglers Cove rum chapters
book_search_topic(topic="citrus", limit=15) # → acid/zest flavor chemistry
book_search_topic(topic="bitters", limit=15) # → bitters ingredient descriptions
Filter to pages with sensory prose (taste, aroma, smell, finish, palate, mouthfeel). Discard pages that are purely historical, geographic, or production-technical with no sensory payoff.
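The sensory filter can be approximated by counting distinct sensory terms on a page. A rough sketch, assuming that two distinct terms is a reasonable bar (that threshold is an illustrative choice, and `has_sensory_prose` is a hypothetical helper):

```python
import re
from typing import Set

# Term list taken from the filter description above
SENSORY_TERMS = ("taste", "aroma", "smell", "finish", "palate", "mouthfeel")
SENSORY_RE = re.compile(r"\b(" + "|".join(SENSORY_TERMS) + r")\w*", re.IGNORECASE)


def sensory_terms_found(text: str) -> Set[str]:
    """Return the distinct base sensory terms present (plurals count once)."""
    return {m.group(1).lower() for m in SENSORY_RE.finditer(text)}


def has_sensory_prose(text: str, min_terms: int = 2) -> bool:
    """True if at least `min_terms` distinct sensory terms appear."""
    return len(sensory_terms_found(text)) >= min_terms


tasting = "The aroma is grassy, with a long, dry finish on the palate."
history = "The distillery was founded in 1862 on the north coast."
print(has_sensory_prose(tasting), has_sensory_prose(history))
```

Keyword counting is only a first pass: a word like "finish" also appears in production text ("finished in sherry casks"), so borderline pages still need to be read.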
For each ingredient found in a book page:
- obsidian_list_directory on the expected Ingredients/ subfolder: exact filename check
- obsidian_read_note: check if it already has a filled ## Flavor description: section. If yes, skip unless the book content adds something materially new.

Only extract a page's ingredient description if it contains:
If all three conditions pass, the page content is a valid source. If not, skip it — don't pad thin notes with non-sensory prose.
New ingredient note: Use the flavor-bootstrap fill agent prompt template. Feed the book passage as the primary source, supplemented with web research if needed. Follow the ingredient fill prompt exactly (Flavors/Reference/Ingredient fillout prompt.md). Validate the created note with PYTHONPATH=. python3.11 run_checks.py --note-standards --paths "<note path>" before marking it complete.
Enriching an existing note: Use obsidian_patch_note to add or replace the ## Flavor description: section only. Do not rewrite other sections. Cite the book source inline.
Add covered page ranges (chunk_ids) to the "Processed Page Ranges" table in the ledger.
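If the covered pages are tracked as plain integers, they can be collapsed into compact ranges before being written to the ledger. This sketch assumes integer page numbers; the actual chunk_id format used by the ledger is not specified in this document, and both helpers are hypothetical.

```python
from typing import Iterable, List, Tuple


def to_ranges(pages: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse page numbers into sorted, inclusive (start, end) ranges."""
    ranges: List[Tuple[int, int]] = []
    for page in sorted(set(pages)):
        if ranges and page == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], page)  # extend the current run
        else:
            ranges.append((page, page))  # start a new run
    return ranges


def format_ranges(ranges: List[Tuple[int, int]]) -> str:
    """Render ranges as text, using a single number for one-page runs."""
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


covered = [45, 46, 47, 52, 60, 61, 46]
print(format_ranges(to_ranges(covered)))
```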
| Check | Method |
|---|---|
| Recipe already in vault | ls Recipes/ normalized filename match |
| Recipe variant (collision) | Script appends (CC) suffix automatically |
| Ingredient already exists | obsidian_list_directory on expected subfolder |
| Ingredient under different name | flavor_search(query=name) — if score > 0.85, alias rather than new note |
| Ingredient in different subfolder | Check all Ingredients/*/ not just the expected one |
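The last three ingredient checks in the table can be combined into one decision. The sketch below is hypothetical glue code: it assumes ingredient notes live one level down as `Ingredients/<subfolder>/<name>.md`, and it takes the best `flavor_search` score as a plain number, because the exact return shape of that tool is not shown here.

```python
from pathlib import Path
from typing import List, Optional

ALIAS_THRESHOLD = 0.85  # threshold from the table above


def find_in_all_subfolders(ingredients_root: Path, name: str) -> List[Path]:
    """Case-insensitive filename match across every Ingredients/*/ subfolder."""
    target = name.lower()
    return sorted(
        p for p in ingredients_root.glob("*/*.md") if p.stem.lower() == target
    )


def dedup_decision(exact_matches: List[Path], best_score: Optional[float]) -> str:
    """Return 'exists', 'alias' or 'new' following the table's rules."""
    if exact_matches:
        return "exists"  # a note with this name is already in some subfolder
    if best_score is not None and best_score > ALIAS_THRESHOLD:
        return "alias"  # near-duplicate under another name: add an alias
    return "new"


print(dedup_decision([], 0.91), dedup_decision([], 0.40))
```

Checking every subfolder first avoids creating a duplicate when an ingredient was filed under an unexpected category.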
- Do not load book_cache/*.json files directly into agent context. Use MCP tools only.
- Each book_search call returns snippets of ≤300 chars each, which is safe.
- Each book_get_page call returns ≤3 pages of text, which is safe.
- Update _ingestion/book-extraction-ledger.md: mark rows complete, add chunk_id ranges.
- Append a log entry to _system/llm-activity-log.md.

PYTHONPATH=. python3.11 run_checks.py --bidirectional --fix
PYTHONPATH=. python3.11 run_checks.py --name-drift --fix
PYTHONPATH=. python3.11 run_checks.py --note-quality --tier-report
| Book | Type | Action | Priority |
|---|---|---|---|
| Cocktail Codex | recipe (json→vault) | Extend MANUAL_RESOLUTION, run generate script, fill flavor descriptions | High |
| Smugglers Cove | ingredient (rum styles) | Sweep rum chapters via book_search_topic("rum", book_id="smugglers_cove") | Medium |
| Liquid Intelligence | compound / technique | Sweep chemistry chapters for flavor-relevant compounds | Low |
| All books | recipe flavor descriptions | 139 existing wip notes need flavor description fill | Medium |