Process all unprocessed content in the notes directory: parse PDFs, process Instagram reel URLs found in notes, classify everything, and summarize. Use when asked to process all notes or prepare all content.
source .venv/bin/activate 2>/dev/null || { python3 -m venv .venv && source .venv/bin/activate; }
pip install -q -r requirements.txt
ls -la notes/ 2>/dev/null || echo "notes/ directory not found"
ls data/raw/*/parsed.json 2>/dev/null | wc -l | tr -d ' '
ls data/raw/*/classified.json 2>/dev/null | wc -l | tr -d ' '
ls data/raw/pdfs/*.json 2>/dev/null | wc -l | tr -d ' '
Process all unprocessed content in the notes/ directory:
Find and parse any PDF files that haven't been processed yet:
source .venv/bin/activate && python scripts/parse_pdf.py notes/pdfs-from-friend/*.pdf
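One way to tell which PDFs still need parsing is to compare the PDFs under notes/ against the JSON files in data/raw/pdfs/. The sketch below is illustrative only: the function name `unparsed_pdfs` is hypothetical, and it assumes that scripts/parse_pdf.py writes its output as data/raw/pdfs/<pdf-name-without-extension>.json, which the source does not confirm.

```shell
# List PDFs under a notes directory that have no matching JSON output yet.
# ASSUMPTION: parse_pdf.py writes <out>/<pdf-name-without-extension>.json.
# Adjust the check if the script names its output differently.
unparsed_pdfs() {
  local src="${1:-notes}" out="${2:-data/raw/pdfs}" pdf stem
  find "$src" -type f -name '*.pdf' 2>/dev/null | sort | while IFS= read -r pdf; do
    stem=${pdf##*/}        # strip the directory part
    stem=${stem%.pdf}      # strip the .pdf extension
    [ -e "$out/$stem.json" ] || printf '%s\n' "$pdf"
  done
}
```

Each path printed by `unparsed_pdfs` could then be passed to scripts/parse_pdf.py in place of the fixed glob.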
Search through all notes files for Instagram reel/post URLs:
grep -rEoh 'https://www\.instagram\.com/(reels?|p)/[A-Za-z0-9_-]+/' notes/ | sort -u
For each Instagram URL found that hasn't been processed, run the full pipeline:
source .venv/bin/activate && python scripts/process_reel.py <urls>
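To skip URLs that were already handled, the URL list can be filtered against the existing output directories. This is a minimal sketch, not part of the pipeline itself: `unprocessed_urls` is a hypothetical helper, and it assumes that the shortcode is the last path segment of the URL and that a processed reel has a data/raw/<shortcode>/parsed.json file.

```shell
# Read URLs on stdin (one per line) and print only those with no parsed output.
# ASSUMPTION: a processed reel is marked by <root>/<shortcode>/parsed.json,
# where <shortcode> is the last path segment of the URL.
unprocessed_urls() {
  local root="${1:-data/raw}" url shortcode
  while IFS= read -r url; do
    [ -n "$url" ] || continue          # skip blank lines
    shortcode=${url%/}                 # drop the trailing slash, if any
    shortcode=${shortcode##*/}         # keep the last path segment
    [ -e "$root/$shortcode/parsed.json" ] || printf '%s\n' "$url"
  done
}
```

Piping the output of the URL search through `unprocessed_urls` yields the list to pass to scripts/process_reel.py.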
For each reel that has a parsed.json but no classified.json:
source .venv/bin/activate && python scripts/classify_reel.py data/raw/<shortcode>/parsed.json
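The parsed-but-unclassified check can be sketched as a small shell function. The name `pending_classification` is hypothetical; the only assumption is the data/raw/<shortcode>/ layout already used by the commands in this file.

```shell
# Print every parsed.json that has no classified.json in the same directory.
pending_classification() {
  local root="${1:-data/raw}" parsed dir
  for parsed in "$root"/*/parsed.json; do
    [ -e "$parsed" ] || continue       # the glob matched nothing
    dir=${parsed%/*}                   # directory holding this parsed.json
    [ -e "$dir/classified.json" ] || printf '%s\n' "$parsed"
  done
}
```

Each printed path can be handed to scripts/classify_reel.py in turn, for example with a `while IFS= read -r parsed` loop over the function's output.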
Read any plain text or markdown files in notes/ and extract useful content.
Produce a comprehensive summary: