Pick up an open lead from investigation.db and investigate it to completion
LAYER 1: RESEARCH AGENT — This is a fact-gathering skill. Document what you find. Do not theorize, speculate, or apply analytical frameworks. If you notice a pattern, record the raw data — pattern recognition is for Layer 2 analysis agents. Record mundane facts (officer names, addresses, formation dates, filing numbers) even when they don't seem interesting. Record negative results from every source checked.
Claim and investigate the next highest-priority open lead. Operates fully autonomously.
/pursue-lead 42 to pursue a specific lead

Load the active investigation context before executing:
uv run python tools/investigation_context.py show
This provides: primary_subject, key_persons, threads, corpus_tools, key_dates, known_addresses. Use these values instead of hardcoded names throughout this skill.
Document everything, not just what's relevant to your current hypothesis.
When you encounter information during investigation — officer names, addresses,
corporate relationships, financial figures, dates, professional affiliations —
record it even if it doesn't obviously connect to the current lead. Use
entity_tracker.py to register entities, roles, and addresses. Use
findings_tracker.py with --type background for contextual facts that don't
directly answer the current question but are worth preserving. These ambient
findings compound across investigations and surface connections later.
Create a unique working directory for this session:
WORKDIR=$(mktemp -d /tmp/osint-XXXXXXXX)
echo "Session workdir: $WORKDIR"
Use $WORKDIR/ instead of /tmp/ for ALL --output paths and report files throughout this session. This prevents parallel /pursue-lead instances from overwriting each other's files.
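If a step runs inside Python rather than the shell, the same isolation can be had with the standard library (a sketch; the `osint-` prefix mirrors the `mktemp` template above, and `lead-edgar.json` is just an example filename):

```python
import os
import tempfile

# Per-session scratch directory, mirroring: mktemp -d /tmp/osint-XXXXXXXX
workdir = tempfile.mkdtemp(prefix="osint-")
print(f"Session workdir: {workdir}")

# Build --output paths under the session directory, never under bare /tmp,
# so parallel /pursue-lead instances cannot clobber each other's files
out_path = os.path.join(workdir, "lead-edgar.json")
```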
If no specific ID given, use claim-next to atomically select and claim in one step (prevents race conditions with parallel agents):
python tools/lead_tracker.py claim-next
To filter by category or thread:
python tools/lead_tracker.py claim-next --category person --thread-id 1
If a specific lead ID was given, claim it directly:
python tools/lead_tracker.py claim <ID>
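The race-free behavior of claim-next presumably comes from a conditional UPDATE rather than a SELECT followed by a separate UPDATE. A minimal sketch of that compare-and-swap pattern (the table and column names are assumptions, not lead_tracker.py's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (id INTEGER PRIMARY KEY, priority INTEGER, status TEXT)")
conn.executemany("INSERT INTO leads (priority, status) VALUES (?, ?)",
                 [(1, "open"), (3, "open"), (2, "claimed")])

def claim_next(conn):
    while True:
        row = conn.execute(
            "SELECT id FROM leads WHERE status = 'open' "
            "ORDER BY priority DESC LIMIT 1").fetchone()
        if row is None:
            return None  # nothing left to claim
        cur = conn.execute(
            "UPDATE leads SET status = 'claimed' WHERE id = ? AND status = 'open'",
            (row[0],))
        if cur.rowcount == 1:
            return row[0]  # we won; a racing agent gets rowcount 0 and retries
```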
Read the lead's description and category to determine the right approach:
/investigate-person workflow or /trace-entity workflow

Before querying any source, check if the query was already run:
# In your search workflow, check:
from tools.lead_tracker import check_searched

prior = check_searched("rod-larsen", "doj_vol11")
if prior:
    pass  # Skip this source; it was already searched
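A self-contained sketch of the dedupe table that check_searched and its companion log_search presumably share (the real schema lives in tools/lead_tracker.py; these column names are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE searches (target TEXT, source TEXT, hits INTEGER, "
             "UNIQUE(target, source))")

def log_search(target, source, hits):
    # Zero-hit searches are recorded too: "we checked and found nothing"
    conn.execute("INSERT OR REPLACE INTO searches VALUES (?, ?, ?)",
                 (target, source, hits))

def check_searched(target, source):
    # None means never searched; a row means the query was already run
    return conn.execute("SELECT hits FROM searches WHERE target = ? AND source = ?",
                        (target, source)).fetchone()

log_search("rod-larsen", "doj_vol11", 0)
```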
Before searching, identify which sources are mandatory for this lead type. Do not skip sources because you "found enough" elsewhere. Check every mandatory source and record the result (including zero-result searches).
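That discipline can be sketched as a loop over a fixed checklist, where empty results are stored rather than skipped (run_query is a placeholder for shelling out to the per-source tools listed below; the source names are abbreviations):

```python
MANDATORY_PERSON_SOURCES = [
    "courtlistener", "fec", "990", "edgar", "littlesis",
    "registry", "fara", "lobbying", "opensanctions",
]

def run_query(source, target):
    # Placeholder: invoke the matching query_<source>.py tool here
    return []

def check_person(target):
    results = {}
    for source in MANDATORY_PERSON_SOURCES:
        hits = run_query(source, target)
        results[source] = hits  # record the negative result, don't drop it
    return results
```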
Person leads:
- query_courtlistener.py (search/party/cases)
- query_fec.py (donor)
- query_990.py (search — check if person is officer/director of any nonprofit)
- query_edgar.py (search/lookup — insider filings, mentions in proxy statements)
- query_littlesis.py (search — pre-mapped relationships)
- query_registry.py (officers — what entities are they officer of?)
- query_fara.py (search — foreign agent registrations)
- query_lobbying.py (lobbyist)
- query_opensanctions.py (search — PEP/sanctions check)

Entity/corporate leads:
- query_registry.py (search — all jurisdictions)
- query_edgar.py (search — filings mentioning entity)
- query_990.py (search — if nonprofit)
- query_usaspending.py (awards — federal contracts/grants)
- query_sam.py (entity/exclusions — registration, debarments)
- query_courtlistener.py (search — litigation involving entity)
- query_gleif.py (search — LEI records, corporate hierarchy)
- query_lobbying.py (client — lobbying by entity)
- query_fara.py (search — foreign principal registrations)

Financial leads:
- parse_ds10_financials.py (query)
- query_acris.py (party)
- query_registry.py (ucc-search)

Run queries against relevant sources. For each search:
python tools/lead_tracker.py (use the log_search function)

For person/entity leads, supplement dataset searches with web research:
- research/RELATED_INVESTIGATIONS.md for relevant historical parallels
- research/OSINT_RESOURCES.md for specialized tools that might help

Additional API tools (use --output to keep context lean):
# LittleSis (relationship/board mapping)
uv run python tools/query_littlesis.py search "<TARGET>" --output $WORKDIR/lead-littlesis.json
# SEC EDGAR (mentions in public filings)
uv run python tools/query_edgar.py search "<TARGET>" --size 10 --output $WORKDIR/lead-edgar.json
# Investigation reports (if populated)
uv run python tools/query_investigations.py search "<TARGET>" --limit 10 --output $WORKDIR/lead-investigations.json
This is especially important when:
For each confirmed discovery (all provenance fields required by hooks):
python tools/findings_tracker.py add \
--target "<TARGET_NAME>" \
--summary "One-line summary of what the evidence shows" \
--type communication \
--evidence <EVIDENCE_REF> \
--claim-type paraphrase \
--source-quote "<EVIDENCE_REF>:exact text from source supporting this claim" \
--sources <SOURCE_NAMES> \
--confidence high \
--date "2017-03-15" \
--lead-id <LEAD_ID>
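The hooks' provenance requirement can be sketched as a completeness check over the flags above (field names mirror the CLI flags; the actual hook implementation is not shown here):

```python
REQUIRED_FIELDS = [
    "target", "summary", "type", "evidence",
    "claim_type", "source_quote", "sources", "confidence",
]

def missing_provenance(finding: dict) -> list:
    # Every provenance field must be present and non-empty before a
    # finding is accepted
    return [f for f in REQUIRED_FIELDS if not finding.get(f)]
```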
Claim types (hooks enforce this):
- direct_quote — verbatim from source (can be confirmed)
- paraphrase — agent summary of source (max high)
- inference — agent conclusion from evidence (max medium)
- synthesis — combined multiple sources (max medium)

For each finding, note its narrative potential:
When completing a lead, identify the single most article-worthy finding — the one that would make a reader stop and think. Note it in the lead completion summary. This seeds future /write-article work.
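The per-claim-type confidence caps can be sketched as a lookup plus clamp (the level ordering, and "confirmed" as the top level, are inferred from the list above, not read from the hook code):

```python
LEVELS = ["low", "medium", "high", "confirmed"]
MAX_FOR_CLAIM_TYPE = {
    "direct_quote": "confirmed",
    "paraphrase": "high",
    "inference": "medium",
    "synthesis": "medium",
}

def capped_confidence(claim_type: str, requested: str) -> str:
    # Clamp the requested confidence to the ceiling for this claim type
    ceiling = MAX_FOR_CLAIM_TYPE[claim_type]
    if LEVELS.index(requested) > LEVELS.index(ceiling):
        return ceiling
    return requested
```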
If the finding reveals a relationship:
python tools/findings_tracker.py connect \
--person-a "<PERSON_A>" --person-b "<PERSON_B>" \
--type financial --strength strong \
--evidence <EVIDENCE_REF> \
--finding-id <FINDING_ID>
CRITICAL: Register entities in structured tables as you discover them. Use tools/entity_tracker.py instead of inline SQL snippets.
# 1) Lookup existing entities
uv run python tools/entity_tracker.py lookup --name "Entity Name"
# 2) Create entity if missing
uv run python tools/entity_tracker.py add-entity --name "Entity Name" --entity-type llc --jurisdiction ny --status active --source "EFTA02XXXXXX" --notes "Context"
# 3) Record person role
uv run python tools/entity_tracker.py add-role --entity-id <ENTITY_ID> --person-name "Person Name" --role "director" --date-start "2010-01" --date-end "2019-07" --source "EFTA02XXXXXX"
# 4) Record address
uv run python tools/entity_tracker.py add-address --entity-id <ENTITY_ID> --address "123 Main St, City, ST 00000" --address-type registered --date-observed "2019" --source "ny_sos"
# 5) Record entity relation
uv run python tools/entity_tracker.py add-relation --entity-a-id <ENTITY_A_ID> --entity-b-id <ENTITY_B_ID> --relation-type funds --description "Enhanced Education donated $150K to IPI" --source "EFTA02XXXXXX"
Use allowed entity types: llc, inc, ltd, trust, foundation, nonprofit, partnership, fund, association, government, unknown.
When employment history is discovered during investigation, record career arcs:
uv run python tools/pillar_tracker.py arc \
--person "<NAME>" --pillar "<INSTITUTION>" \
--role "<ROLE>" --seniority <junior|mid|senior|leadership|founder> \
--start "<YEAR>" --end "<YEAR>" \
--source "<EVIDENCE_REF>"
Check registered pillars: uv run python tools/pillar_tracker.py list --type banking (or legal, government, etc.)
When investigation reveals new threads worth pursuing:
python tools/lead_tracker.py add \
--title "Investigate Samantha Stein ProtonMail communications" \
--category person \
--priority high \
--source "agent:pursue-lead" \
--target "Samantha Rose Stein" \
--evidence EFTA02731082 \
--related <PARENT_LEAD_ID>
Agents should freely create follow-up leads at whatever priority they judge appropriate.
When the lead is fully investigated, mark it complete:
python tools/lead_tracker.py complete <ID> --findings "Summary of what was found and what remains unknown"
If the lead is a dead end:
python tools/lead_tracker.py dead-end <ID> "Explanation of why"
If blocked (e.g., Neo4j not running, API down):
python tools/lead_tracker.py block <ID> "Neo4j not available for ICIJ cross-reference"
Read research/INVESTIGATIVE_METHODOLOGY.md if you haven't already. You are an investigator, not a search engine.
Form hypotheses first. Read the lead's description and your training data knowledge. What do you expect to find? What would confirm this lead? What would refute it? Write your hypotheses as a note on the lead.
Simulate the person. If this lead involves a person, ask: What role does this target play in the investigation thread? What are their incentives? What would confirmation look like? What was their public position vs. private behavior? The gap is where the story lives.
Check the timeline. What else was happening in the world when this event occurred? A Dec 2016 email about a Russian ambassador means something very different than a Dec 2014 one.
Think about what's missing. If you find 5 emails between two people in 2017 but zero in 2018-2019, the gap may be more significant than the emails. Did they move to ProtonMail? Did the relationship end? Did they start using intermediaries?
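The gap check above can be sketched as a scan for missing years in a correspondence timeline:

```python
from datetime import date

def missing_years(message_dates, start_year, end_year):
    """Years in [start_year, end_year] with no observed messages."""
    seen = {d.year for d in message_dates}
    return [y for y in range(start_year, end_year + 1) if y not in seen]

# A burst of 2017 emails, then silence: the gap is itself a finding
emails = [date(2017, 2, 1), date(2017, 6, 9), date(2017, 11, 30)]
gaps = missing_years(emails, 2017, 2019)
```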
If the lead belongs to a thread (has a thread_id), assign new findings to the same thread with --thread-id N.

If you discover a promising new data source, queue it: uv run python tools/infra_tracker.py add --title "..." --type new_source --description "..." --source-name "..." --priority medium --discovered-by "agent:pursue-lead". If the source has a free API and you can build the tool quickly, do it — probe the endpoint first, confirm it works, then write the tool and update CLAUDE.md.

For tool enhancements, use --type tool_improvement. Small enhancements compound across all future investigations.

This skill is designed to work as a standalone command in its own CC instance. For wave execution, run multiple CC instances, each running /pursue-lead:
Terminal 1: claude → /pursue-lead
Terminal 2: claude → /pursue-lead
Terminal 3: claude → /pursue-lead
All instances write to shared investigation.db (WAL mode handles concurrent writes).
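What WAL mode buys here, in a minimal sqlite3 sketch (investigation.db's actual schema is not assumed; the timeout guards against brief writer lock contention):

```python
import sqlite3

# Each /pursue-lead instance opens its own connection to the shared database
conn = sqlite3.connect("investigation.db", timeout=30.0)
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
# In WAL mode, readers keep working while a single writer appends to the
# write-ahead log; concurrent writers queue briefly instead of failing
```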
- Use --output $WORKDIR/... on ALL search commands to keep context lean
- Do not cat or Read full document text — extract relevant quotes only

If you encounter bugs in CLI tools (crashes, incorrect output, missing features), submit them to the infra queue:
uv run python tools/infra_tracker.py add --title "Bug: <description>" --type tool_improvement --priority high --description "<details including the error traceback>"
If spawned by another skill (e.g., wave orchestrator), write a report at completion:
# Write to $WORKDIR/report-lead-<LEAD_ID>.md
Report format:
---