Reference and runner for llm-summary CLI commands. Use when the user asks to check stats, run summaries, look up functions, import call graphs, or do anything with a func-scans/<project>/functions.db.
Always activate the venv first:
source ~/project/llm-summary/venv/bin/activate
DB paths follow the pattern: func-scans/<project>/functions.db
llm-summary stats --db func-scans/<project>/functions.db
Shows counts for functions, call_edges, allocation/free/init/memsafe/verification summaries. Run this first to understand what's in the DB before doing anything else.
llm-summary summarize --db func-scans/<project>/functions.db \
--type allocation --type free --type init \
--backend gemini -v
llm-summary summarize --db func-scans/<project>/functions.db \
--type memsafe \
--backend gemini -v
--type: allocation, free, init, memsafe, verify (can repeat)--backend: claude, gemini, openai, ollama, llamacpp--model TEXT: override model name--llm-host / --llm-port: for local backends (ollama: 11434, llamacpp: 8080)--disable-thinking: disable thinking mode for llamacpp-f / --force: re-summarize even if cached--allocator-file PATH: JSON file with custom allocator names--deallocator-file PATH: JSON file with custom deallocator names--init-stdlib: pre-populate stdlib summaries before starting--log-llm PATH: log all LLM prompts/responses to filecall_edges > 0 — run import-callgraph first if neededallocation + free + init before memsafe (two separate invocations)llm-summary lookup <function_name> --db func-scans/<project>/functions.db
llm-summary lookup <function_name> --db func-scans/<project>/functions.db --signature <sig>
llm-summary show --db func-scans/<project>/functions.db
llm-summary show --db func-scans/<project>/functions.db --name <function_name>
llm-summary show --db func-scans/<project>/functions.db --file <path_fragment>
llm-summary show --db func-scans/<project>/functions.db --allocating-only
llm-summary show --db func-scans/<project>/functions.db --format json
llm-summary import-callgraph --json func-scans/<project>/callgraph.json \
--db func-scans/<project>/functions.db --clear-edges -v
--json PATH: path to KAMain callgraph JSON (required)--clear-edges: clear existing call_edges before import-v: verbose output showing match statsllm-summary init-stdlib --db func-scans/<project>/functions.db
# Auto-resolve deps from headers recorded during scan
llm-summary import-dep \
--db func-scans/<project>/<target>/functions.db \
--scan-dir func-scans/ -v
# Explicit dep DB
llm-summary import-dep \
--db func-scans/<project>/<target>/functions.db \
--from func-scans/<dep_project>/<dep_target>/functions.db -v
# With link-units (records resolved dep_dbs in link_units.json)
llm-summary import-dep \
--db func-scans/<project>/<target>/functions.db \
--scan-dir func-scans/ \
--link-units func-scans/<project>/link_units.json \
--target <target_name> -v
decl_header (recorded during scan via clang -E) to identify which library each extern belongs to~/.llm-summary/stdlib_cache.db (dep_headers table)--dry-run: show what would be imported without writing-f / --force: overwrite existing summariesllm-summary import-dep-summaries \
--db func-scans/<project>/<target>/functions.db \
--dep-db func-scans/<project>/<dep_target>/functions.db
--dep-db: repeat for multiple deps-f / --force: overwrite existing summariesmodel_used=dep:<dep_db_stem>--forceimport-dep for cross-projectllm-summary export --db func-scans/<project>/functions.db -o output.json
llm-summary callgraph --db func-scans/<project>/functions.db
llm-summary callgraph --db func-scans/<project>/functions.db --format csv -o graph.csv
llm-summary callgraph --db func-scans/<project>/functions.db --format json -o graph.json
--format: tuples (default), csv, json--no-header: omit header row (for csv/tuples)# Full project (legacy / no link-unit scoping)
llm-summary scan \
--compile-commands build-scripts/<project>/compile_commands.json \
--db func-scans/<project>/functions.db -v
# Per link-unit (preferred when link_units.json exists)
llm-summary scan \
--compile-commands build-scripts/<project>/compile_commands.json \
--link-units func-scans/<project>/link_units.json \
--target <target_name> \
--db func-scans/<project>/<target_name>/functions.db
# With preprocessing (extracts extern declaration headers)
llm-summary scan \
--compile-commands build-scripts/<project>/compile_commands.json \
--db func-scans/<project>/functions.db \
--preprocess -v
--compile-commands PATH: path to compile_commands.json (required)--link-units PATH: path to link_units.json (from discover-link-units); restricts scan to named target--target TEXT: link-unit target name (required with --link-units)--preprocess: run clang -E to extract extern declaration headers (decl_header column); needed for import-dep--project-path PATH: host source root — required when compile_commands.json has Docker container paths (/workspace/src/); use the raw build-artifacts copy in that case. The build-scripts/<project>/compile_commands.json produced by build-learn already has host paths and does not need this flag.Prefer build-scripts/<project>/compile_commands.json (host paths, produced by build-learn).
Use /data/csong/build-artifacts/<project>/compile_commands.json only when the build-scripts copy is missing, and add --project-path /data/csong/opensource/<project>.
llm-summary discover-link-units \
--build-dir /data/csong/build-artifacts/<project> \
--project-name <project> \
--output func-scans/<project>/link_units.json
Produces link_units.json with per-target bc_files lists used by scan --link-units.
llm-summary build-learn \
--project-path /data/csong/opensource/<project> \
--build-dir /data/csong/build-artifacts/<project> \
--backend claude \
--model claude-sonnet-4-5@20250929 \
-v
build-scripts/<project>/build.sh--enable-lto / --no-lto: control LLVM LTO (default: on)--prefer-static / --no-static: prefer static linking (default: on)--generate-ir / --no-ir: emit LLVM IR artifacts (default: on)--max-retries INT: max build attempts (default: 3)--container-image TEXT: Docker image (default: llm-summary-builder:latest)--no-ccache: disable ccache--ccache-dir PATH: host ccache directory (default: ~/.cache/llm-summary-ccache)--log-llm PATH: log all LLM prompts/responses to filellm-summary show-issues --db func-scans/<project>/functions.db
llm-summary show-issues --db func-scans/<project>/functions.db --status pending
llm-summary show-issues --db func-scans/<project>/functions.db --status false_positive
llm-summary show-issues --db func-scans/<project>/functions.db --severity high --status pending
llm-summary show-issues --db func-scans/<project>/functions.db --name <function_name>
llm-summary show-issues --db func-scans/<project>/functions.db --format json
--status: pending, confirmed, false_positive, wontfix--severity: high, medium, low--name TEXT: filter by function name (substring match)--format: table (default), jsonllm-summary review-issue <function_name> <issue_index> \
--status false_positive \
--reason "short-circuit eval guards this" \
--db func-scans/<project>/functions.db
<issue_index>: 0-based index into the function's issues[] array--status: pending, confirmed, false_positive, wontfix (required)--reason TEXT: reviewer explanation--signature TEXT: disambiguate when multiple functions share the same namellm-summary clear --db func-scans/<project>/functions.db
| Script | Purpose |
|---|---|
batch_scan_targets.py | Scan all projects; link-unit-aware when link_units.json present |
batch_call_graph_gen.py | Run KAMain + import call graphs; writes db_path/callgraph_json/cflcg/vsnapshot back to link_units.json |
batch_summarize.py | Run allocation+free+init+memsafe passes; link-unit-aware when link_units.json present |
batch_verify.py | Run verify pass; link-unit-aware when link_units.json present |
batch_rebuild.py | Rebuild projects with debug info |
All batch scripts support: --filter <name>, --tier <1|2|3>, --skip-list, --verbose
batch_summarize.py and batch_verify.py also support: --success-list
batch_call_graph_gen.py also supports: --compositional / --no-compositional
-save-temps=obj, generate build.shfunc-scans/<project>/functions.dballocation+free+init, Pass 2: memsafebuild-scripts/<project>/compile_commands.json (host paths)func-scans/<project>/link_units.jsonlink_units.json; scans each target into func-scans/<project>/<target>/functions.db--compositional — per-target two-phase CFL; writes db_path, callgraph_json, cflcg, vsnapshot back into link_units.json--init-stdlib — per target in topo order:
import-dep from cross-project dep DBs (resolved via decl_header)import-dep-summaries from intra-project dep DBsinit-stdliballocation+free+init, Pass 2: memsafeKey files per link unit (recorded in link_units.json after steps 4-5):
db_path — functions DBcallgraph_json — KAMain call graphcflcg — compressed CFL constraint graphvsnapshot — serialized V-relation (value aliasing)dep_dbs — list of resolved cross-project dependency DB paths (from import-dep)