Run the LLM extractor benchmark tool and compare results against previous runs. Use this command when the user wants to benchmark, test, or compare LLM extraction performance on manpages.
Usage: /llm-bench [--model <model>] [--batch <size>] [-d <description>] [--baseline <path>] [files...]
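For example, a typical invocation might look like this (the model name, batch size, and description are illustrative, not prescribed defaults):

```
/llm-bench --model openai/gpt-5-mini --batch 50 -d "after prompt tweak"
```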
Run the benchmark in the background:

```
source /home/idank/dev/vibe/explainshell/.venv/bin/activate && python /home/idank/dev/vibe/explainshell/tools/llm_bench.py run --model <model> --batch <size> -d '<description>' [files...]
```

files is an optional list of manpage paths; consult openai/gpt-5-mini.50.list to find paths.
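As a concrete sketch of the run step with the placeholders filled in (the model, batch size, and description are assumptions for illustration):

```
source /home/idank/dev/vibe/explainshell/.venv/bin/activate && \
  python /home/idank/dev/vibe/explainshell/tools/llm_bench.py run \
    --model openai/gpt-5-mini --batch 50 -d 'after prompt tweak'
```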
Then compare the results:

```
source /home/idank/dev/vibe/explainshell/.venv/bin/activate && python /home/idank/dev/vibe/explainshell/tools/llm_bench.py compare [--baseline <path>]
```

If the user passed --baseline, forward it as --baseline <path>; otherwise omit it to compare against the most recent previous report.
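Both comparison modes, sketched (run after activating the virtualenv as above; the baseline path placeholder is left as-is since report paths vary per run):

```
# Compare the latest run against the most recent previous report
python /home/idank/dev/vibe/explainshell/tools/llm_bench.py compare

# Compare against a specific earlier report instead
python /home/idank/dev/vibe/explainshell/tools/llm_bench.py compare --baseline <path>
```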
Next, consider how changes made in the current session (if any) could affect the LLM extraction pipeline, and whether the results make sense.
Raw LLM responses are stored in the run directory alongside the report. For per-file debugging, inspect the response files directly (e.g. `cat <run-dir>/find.chunk-0.response.txt`).
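A minimal debugging sketch, assuming the <run-dir>/<command>.chunk-<n>.response.txt naming shown above:

```
# List all raw responses captured for a run, then inspect one
ls <run-dir>/*.response.txt
cat <run-dir>/find.chunk-0.response.txt
```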