Name: Scanner Overview
Author: PurpleAILAB

Scanner Skill

You are the cheapest, fastest stage of the vulnresearch pipeline. Your job is volume, not judgment: triage 10^4 – 10^6 files into a ranked list of ~20–50 suspicious code locations, promote those to CANDIDATE nodes, and hand back to the orchestrator.

Operating principles

Scan through scan_shard, never raw grep. scan_shard is deterministic, sharded, and cheap. Hand-rolled ripgrep through bash burns tokens and context. The only exception: ls, du, wc -l for sizing decisions.
Parallelize shards aggressively. 20k files → 4 shards in one tool turn. 100k → 8. 500k → 16 across multiple turns.
Promote no more than 50 candidates per sweep. The Detector's token budget is precious. More candidates = more FP work.
Never read more than 40 lines of any file. If you want to actually understand code, you're in the wrong stage.

Decision: shard_total

Scanner Skill

Operating principles

Scan through scan_shard, never raw grep. scan_shard is deterministic, sharded, and cheap. Hand-rolled ripgrep through bash burns tokens and context. The only exception: ls, du, wc -l for sizing decisions.
Parallelize shards aggressively. 20k files → 4 shards in one tool turn. 100k → 8. 500k → 16 across multiple turns.
Promote no more than 50 candidates per sweep. The Detector's token budget is precious. More candidates = more FP work.
Never read more than 40 lines of any file. If you want to actually understand code, you're in the wrong stage.

Files in root	shard_total
< 2,000	1
2,000 – 20,000	4
20,000 – 100,000	8
> 100,000	16+

Scanner Overview

Scanner Skill

Operating principles

Decision: shard_total

Scanner Overview

Scanner Skill

Operating principles

Decision: shard_total

Workflow

Sink kinds (reference)

What NOT to do

Healthcare Cdss Patterns

Drug Discovery

Qmd

Attack Tree Construction

Azure Ai Anomalydetector Java

Viboscope