Structured data research: search sources, extract structured data, archive raw sources, deduplicate, maintain canonical tracker pages, and backlink entities. Parameterized via YAML recipes for investor updates, donations, company updates, or any other email-to-structured-data pipeline.
One skill for any email-to-structured-data pipeline. The only differences between tracking investor updates, expenses, and company metrics are the search queries, extraction schemas, and tracker page format. All three use the same 7-phase pipeline with parameterized recipes.
Ask the user what they want to track. Either:
Recipes are YAML files at `~/.gbrain/recipes/{name}.yaml`. Use `gbrain research init` to scaffold a new one.
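A recipe might look something like this. This is a hypothetical sketch: the field names (`search`, `extraction`, `classification`, `tracker_page`) are illustrative guesses, not the actual schema shipped with GBrain.

```yaml
# Hypothetical recipe sketch -- field names are illustrative, not the real schema.
name: investor-updates
search:
  queries:
    - 'subject:"investor update"'
extraction:
  patterns:                     # deterministic regex pass, tried first
    mrr: 'MRR[:\s]+\$?([\d,.]+[KM]?)'
    arr: 'ARR[:\s]+\$?([\d,.]+[KM]?)'
classification:
  skip:                         # noise to drop before extraction
    - newsletter
    - marketing
tracker_page: trackers/investor-updates.md
```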
Check the brain first (we may already have this data). Then:
Deterministic extraction first (regex patterns from the recipe), with LLM fallback. Log every LLM fallback for future regex improvement (the fail-improve loop). Skip marketing, newsletters, and other noise based on the recipe's classification rules.
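The deterministic-first flow with fallback logging can be sketched as follows. This is a minimal sketch: `PATTERNS`, `fallback_log`, and the `llm_extract` callback are hypothetical stand-ins, not GBrain APIs; real patterns come from the recipe YAML.

```python
import re

# Hypothetical patterns; a real run loads these from the recipe YAML.
PATTERNS = {
    "mrr": re.compile(r"MRR[:\s]+\$?([\d,.]+[KM]?)"),
    "arr": re.compile(r"ARR[:\s]+\$?([\d,.]+[KM]?)"),
}

fallback_log = []  # every LLM fallback is recorded for future regex improvement


def extract(body: str, llm_extract=None) -> dict:
    """Deterministic regex pass first; LLM only for fields the regex missed."""
    result = {}
    for field, pattern in PATTERNS.items():
        m = pattern.search(body)
        if m:
            result[field] = m.group(1)
        elif llm_extract is not None:
            # Fail-improve loop: log the miss so the pattern can be
            # extended and the LLM call eliminated next time.
            fallback_log.append({"field": field, "snippet": body[:80]})
            result[field] = llm_extract(field, body)
    return result
```

When a field shows up repeatedly in the fallback log, that is the signal to promote it to a regex in the recipe.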
EXTRACTION INTEGRITY RULE: when writing values to a tracker, re-read them from the saved raw source file rather than from working memory. This prevents a known hallucination bug where 13 of 13 batch-processed amounts were wrong when recalled from LLM working memory, even though the saved files were correct.
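One way to enforce the rule, as a sketch. The function name and the JSON file format are assumptions for illustration; the point is only that the value written to the tracker comes from the archived file, never from batch working memory.

```python
import json
from pathlib import Path


def tracker_value(raw_path: str, field: str):
    """Re-read a value from the archived raw file just before writing
    it to the tracker; never trust a number recalled from memory."""
    data = json.loads(Path(raw_path).read_text())
    return data[field]
```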
Archive raw sources according to type:

- `put_raw_data` for email bodies, API responses
- `file_upload` for PDF attachments, documents
- `redirect.yaml` pointers for large files in storage

Before adding to a tracker, deduplicate against existing entries.
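The dedup check before appending can be sketched like this. The key choice (date plus company, matched against the tracker's markdown rows) is an assumption for illustration, not the actual rule.

```python
def is_duplicate(tracker_markdown: str, date: str, company: str) -> bool:
    """Return True if a row with the same date and company already
    exists in the tracker table (assumed dedup key: date + company)."""
    needle = f"| {date} | {company} |"
    return needle in tracker_markdown
```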
Three example recipes ship with GBrain (see `~/.gbrain/recipes/`).
The canonical tracker is a brain page at the recipe's `tracker_page` path, maintained as markdown tables:
### 2026
| Date | Company | MRR | ARR | Growth | Status |
|------|---------|-----|-----|--------|--------|
| 2026-04-01 | Example Co | $188K | $2.3M | +14.7% MoM | [Source](link) |
Each entry links to its raw source. Running totals at the bottom of each section.
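Appending an entry to the tracker table can be sketched as follows, assuming the column layout shown above; the function and its parameters are hypothetical helpers, not part of GBrain.

```python
def append_row(table_lines: list, date: str, company: str, mrr: str,
               arr: str, growth: str, source_url: str) -> list:
    """Append one markdown row to the tracker table; each row links
    back to its raw source, per the format above."""
    row = (f"| {date} | {company} | {mrr} | {arr} | {growth} "
           f"| [Source]({source_url}) |")
    return table_lines + [row]
```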
References `skills/conventions/quality.md` for citation and back-linking rules.