Run the Discovery Engine pipeline to find new consumer companies. Use when the user asks to "find deals", "source companies", "run the pipeline", "discover startups", or "search for prospects" in CPG, health tech, travel, or marketplaces.
This skill turns the CLI-based Discovery Engine pipeline into a guided, conversational workflow for sourcing consumer companies.
This skill activates when you want to:
Trigger phrases:
Simplest invocation:
User: "Find me some new deals"
The skill guides you through six steps: Configuration → Collection → Processing → Review → Push → Health Check.
Ask which collectors to run and which sectors to focus on.
Collector Presets:
| Preset | Collectors | Duration | Best For |
|---|---|---|---|
| Fast | github, sec_edgar, companies_house | ~2 min | Quick daily scan |
| All | 16 collectors | ~10 min | Comprehensive weekly search |
| Custom | User selects | Varies | Specific signal types |
Validation Gate: User confirms configuration before proceeding.
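The preset table above could be resolved into a concrete collector list along the following lines. This is a minimal sketch, not the pipeline's own code: `resolve_collectors` and `FAST_PRESET` are hypothetical names, and because the full set of 16 collectors is listed only in the collector guide, the "All" preset is represented here by `None` (meaning "let the pipeline decide").

```python
from typing import Iterable, Optional

# Collector names for the Fast preset, taken from the preset table.
FAST_PRESET = ("github", "sec_edgar", "companies_house")


def resolve_collectors(
    preset: str, custom: Optional[Iterable[str]] = None
) -> Optional[list[str]]:
    """Map a preset name to collector names (hypothetical helper).

    Returns None for the "all" preset, because the full list of 16
    collectors is not reproduced in this sketch.
    """
    key = preset.strip().lower()
    if key == "fast":
        return list(FAST_PRESET)
    if key == "all":
        return None
    if key == "custom":
        names = list(dict.fromkeys(custom or []))  # de-duplicate, keep order
        if not names:
            raise ValueError("the custom preset needs at least one collector")
        return names
    raise ValueError(f"unknown preset: {preset!r}")
```

Returning the resolved list to the user before running anything is one way to implement the validation gate: the user sees exactly which collectors will run and confirms or edits the selection.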
Execute collectors and report results.
python run_pipeline.py collect --collectors <preset>
Output: Signals collected, duplicates found, collector status (✓/✗)
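When the collection step is driven from a script instead of typed by hand, the command above can be assembled as an argument list. The sketch below assumes that `--collectors` accepts a preset name, as the `<preset>` placeholder suggests; `collect_command` and `run_collect` are hypothetical helper names.

```python
import subprocess


def collect_command(preset: str, script: str = "run_pipeline.py") -> list[str]:
    """Build the argv for the collection step (hypothetical helper)."""
    if not preset or preset.startswith("-"):
        raise ValueError("preset must be a non-empty name, not a flag")
    return ["python", script, "collect", "--collectors", preset]


def run_collect(preset: str) -> subprocess.CompletedProcess:
    """Run the collection step and capture its output for reporting."""
    # check=False: a failing collector is reported to the user, not raised.
    return subprocess.run(
        collect_command(preset), capture_output=True, text=True, check=False
    )
```

Passing an argument list rather than a shell string avoids quoting problems if a preset name ever contains unusual characters.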
Decision Point:
Run the verification gate and the thesis filter on the collected signals.
python run_pipeline.py process
Output:
Decision Point:
Display top qualified signals with confidence scores.
python run_pipeline.py pipeline qualified --limit 20
Output: Table with company name, canonical key, confidence, signal types, why now.
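As an illustration of the review table, the sketch below sorts signals by confidence and renders the five columns named above. The `QualifiedSignal` class and its field names are assumptions made for this example; the pipeline's real record type may differ, and the confidence value is assumed to be a float.

```python
from dataclasses import dataclass


@dataclass
class QualifiedSignal:
    """Hypothetical record mirroring the columns of the review table."""
    company_name: str
    canonical_key: str
    confidence: float
    signal_types: list[str]
    why_now: str


def top_signals(signals: list[QualifiedSignal], limit: int = 20) -> list[QualifiedSignal]:
    """Return the highest-confidence signals, at most `limit` of them."""
    return sorted(signals, key=lambda s: s.confidence, reverse=True)[:limit]


def render_table(signals: list[QualifiedSignal]) -> str:
    """Render signals as a markdown table in the style used in this document."""
    lines = [
        "| Company | Canonical Key | Confidence | Signal Types | Why Now |",
        "|---|---|---|---|---|",
    ]
    for s in signals:
        types = ", ".join(s.signal_types)
        lines.append(
            f"| {s.company_name} | {s.canonical_key} | {s.confidence:.2f} | {types} | {s.why_now} |"
        )
    return "\n".join(lines)
```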
Decision Point:
Sync qualified signals to Notion CRM.
# Dry run preview
python run_pipeline.py pipeline push --dry-run
# Actual push (after confirmation)
python run_pipeline.py pipeline push --confirm
Output:
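The dry-run-then-confirm sequence of the push step can be enforced in code so that the real push never runs without a preview and an explicit yes. The following is a sketch under stated assumptions: `guarded_push` is a hypothetical helper, `run` is any callable that executes a command and returns its exit code, and `confirm` is any callable that asks the user.

```python
from typing import Callable, Sequence

# The two commands shown in the push step.
DRY_RUN = ["python", "run_pipeline.py", "pipeline", "push", "--dry-run"]
PUSH = ["python", "run_pipeline.py", "pipeline", "push", "--confirm"]


def guarded_push(
    run: Callable[[Sequence[str]], int],
    confirm: Callable[[], bool],
) -> str:
    """Run the dry-run preview, then push only after confirmation.

    Returns one of: "dry-run-failed", "cancelled", "pushed", "push-failed".
    """
    if run(DRY_RUN) != 0:
        return "dry-run-failed"  # never push if the preview itself failed
    if not confirm():
        return "cancelled"
    return "pushed" if run(PUSH) == 0 else "push-failed"
```

In practice `run` might be `lambda argv: subprocess.run(argv).returncode`, and `confirm` a prompt that accepts only an explicit "y". Injecting both keeps the gate easy to test without touching Notion.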
Run post-execution diagnostics and recommend next steps.
python run_pipeline.py health --json
Output:
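Because the health command is run with `--json`, its output can be parsed before it is summarised for the user. The sketch below makes only one assumption about that output, namely that the top-level JSON value is an object; the actual fields are not specified here, so none are read. `parse_health` is a hypothetical helper name.

```python
import json


def parse_health(stdout: str) -> dict:
    """Parse the JSON printed by the health command (hypothetical helper).

    Raises ValueError if the text is not valid JSON or is not an object.
    """
    report = json.loads(stdout)  # json.JSONDecodeError is a ValueError
    if not isinstance(report, dict):
        raise ValueError("expected a JSON object from the health command")
    return report
```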
See references/collector-guide.md for the complete list of 16 collectors, their API requirements, and signal strengths.
Fast Preset Collectors:
If a collector fails due to missing credentials:
⚠ Missing API Key: GITHUB_TOKEN
Setup:
1. Generate token at: https://github.com/settings/tokens
2. Add to .env file: GITHUB_TOKEN=ghp_xxx
3. Restart skill
Alternative: Run other collectors without GitHub
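A missing-credential failure like the one above can be detected before any collector runs. This sketch assumes a mapping from collector name to required environment variables; only the `github` entry is taken from this document, and `REQUIRED_ENV` and `missing_credentials` are hypothetical names.

```python
import os
from typing import Iterable, Mapping, Optional

# Only the GitHub requirement is stated in this document; other collectors'
# requirements are listed in the collector guide and are omitted here.
REQUIRED_ENV = {"github": ("GITHUB_TOKEN",)}


def missing_credentials(
    collectors: Iterable[str],
    env: Optional[Mapping[str, str]] = None,
) -> dict[str, list[str]]:
    """Return {collector: [missing variable names]} for the given collectors."""
    env = os.environ if env is None else env
    missing: dict[str, list[str]] = {}
    for name in collectors:
        absent = [var for var in REQUIRED_ENV.get(name, ()) if not env.get(var)]
        if absent:
            missing[name] = absent
    return missing
```

An empty result means every selected collector has what it needs; a non-empty one can be turned into the setup message shown above, along with the option to continue without the affected collectors.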
If rate limited:
⚠ GitHub Rate Limit Exceeded
Resets at: 2026-01-31 14:30 UTC (in 42 minutes)
Options:
A) Wait 42 minutes, then retry
B) Run other collectors
C) Use authenticated token (increases limit)
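The "in 42 minutes" figure in the message above is simple arithmetic on the reset time. A minimal sketch, assuming the reset time is available as a timezone-aware `datetime` (both function names are hypothetical):

```python
import math
from datetime import datetime, timezone
from typing import Optional


def minutes_until_reset(reset_at: datetime, now: Optional[datetime] = None) -> int:
    """Whole minutes until the limit resets, rounded up, never negative."""
    now = now or datetime.now(timezone.utc)
    remaining = (reset_at - now).total_seconds()
    return max(0, math.ceil(remaining / 60))


def format_reset(reset_at: datetime, now: Optional[datetime] = None) -> str:
    """Format the reset line in the style of the message above."""
    minutes = minutes_until_reset(reset_at, now)
    utc = reset_at.astimezone(timezone.utc)
    return f"Resets at: {utc:%Y-%m-%d %H:%M} UTC (in {minutes} minutes)"
```

Rounding up rather than down means the user is never told to retry a few seconds before the limit has actually reset.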
If no signals pass verification:
ℹ No Qualified Signals
Results: 0 qualified, 12 held, 3 rejected
Next Steps:
A) Review held signals
B) Adjust thesis filters
C) Run different collectors
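The results line and the follow-up options in the message above can be derived from a list of per-signal outcomes. This sketch assumes each processed signal carries one of the status strings `"qualified"`, `"held"`, or `"rejected"`, which are the labels used in the message; the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable


def summarize(statuses: Iterable[str]) -> str:
    """Build the results line from per-signal status strings."""
    counts = Counter(statuses)
    return (
        f"Results: {counts['qualified']} qualified, "
        f"{counts['held']} held, {counts['rejected']} rejected"
    )


def next_steps(statuses: Iterable[str]) -> list[str]:
    """Suggest follow-ups only when nothing qualified."""
    counts = Counter(statuses)
    if counts["qualified"] > 0:
        return []
    steps = []
    if counts["held"] > 0:
        steps.append("Review held signals")  # only useful if something was held
    steps += ["Adjust thesis filters", "Run different collectors"]
    return steps
```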
For the complete troubleshooting guide, see references/troubleshooting.md.
The pipeline pushes to Notion with these fields:
See references/notion-schema.md for the complete schema and routing logic.
User: "Find me some new deals" → Guided through Fast preset (2 min) → 12 qualified signals pushed to Notion
User: "Source consumer CPG companies" → Collectors filtered by CPG keywords → Thesis filter emphasizes food/beverage/beauty
| Scenario | Command Flow |
|---|---|
| Daily quick scan | Fast preset → Process → Push |
| Weekly deep dive | All preset → Review held → Push selected |
| Sector focus | Custom collectors → Filter by keywords |
| Debug low signals | Metrics → Health → Adjust collectors |
For pipeline architecture, stage-by-stage flow, and PipelineStats schema, see references/pipeline-architecture.md.
You'll know this skill is working when:
Metrics: