An internal cognitive engine for quantitative root cause analysis. Use this autonomously when you need to weigh competing hypotheses, prevent anchoring bias, or determine the most efficient next diagnostic step.
This is an internal reasoning tool for you, the agent. It provides a mathematical scratchpad that guards against cognitive failure modes when reasoning under uncertainty—specifically anchoring on first impressions or flip-flopping based on recent inputs.
Role: You are the investigator. The Python script is your calculator. The user is a stakeholder who wants realistic probability distributions, not overconfident guesses or vague hedging.
State is stored in .claude/.bayes_state.json relative to the current working directory. This creates a project-local audit trail.
Script location: ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py
All commands support --json for machine-readable output.
init — Initialize Hypotheses
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py init H1:prob H2:prob ...
Probabilities are normalized automatically. Use rough estimates based on base rates and initial evidence.
You MUST include an "Other" hypothesis. A closed hypothesis space assumes one of your guesses is definitely correct—this is rarely true. Reserve 10–20% for unlisted causes.
Example:
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py init Database:0.35 Network:0.30 CodeBug:0.20 Other:0.15
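Normalization is just dividing each weight by the total, so rough relative estimates are fine. A minimal sketch of the arithmetic (an assumption about the engine's behavior, not its actual code):

```python
def normalize(priors):
    """Scale hypothesis weights so they sum to 1."""
    total = sum(priors.values())
    return {h: w / total for h, w in priors.items()}

# Rough relative weights work: 7:6:4:3 yields the same
# distribution as the 0.35/0.30/0.20/0.15 example above.
priors = normalize({"Database": 7, "Network": 6, "CodeBug": 4, "Other": 3})
```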
define — Define a Test
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py define TestName H1:likelihood H2:likelihood ...
Likelihoods represent P(test passes | hypothesis is true). Values must be between 0 and 1.
Example:
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py define CheckErrorLogs Network:0.9 Database:0.3 CodeBug:0.1
"If it's a Network issue, there's a 90% chance the logs show timeout errors."
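The update rule behind these likelihoods is ordinary Bayes: posterior ∝ prior × P(result | hypothesis), using 1 - likelihood when the test fails and 0.5 as the neutral value for hypotheses without a defined likelihood. A sketch of that arithmetic (assumed behavior; the real logic lives in bayes_engine.py):

```python
def bayes_update(priors, likelihoods, passed=True):
    """Posterior ∝ prior × P(result | H); undefined likelihoods are
    treated as neutral (0.5)."""
    post = {}
    for h, p in priors.items():
        like = likelihoods.get(h, 0.5)
        post[h] = p * (like if passed else 1 - like)
    total = sum(post.values())
    if total == 0:
        raise ValueError("Evidence impossible under all hypotheses")
    return {h: v / total for h, v in post.items()}

priors = {"Database": 0.35, "Network": 0.30, "CodeBug": 0.20, "Other": 0.15}
likes = {"Network": 0.9, "Database": 0.3, "CodeBug": 0.1}
posterior = bayes_update(priors, likes, passed=True)
```

On a passing CheckErrorLogs, this moves Network from 30% to roughly 57% while CodeBug drops to about 4%.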
recommend — Get Optimal Next Test
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py recommend
Returns tests ranked by Expected Information Gain (in bits). Higher is better—these tests most efficiently discriminate between hypotheses.
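Expected information gain can be read as H(prior) minus the expected entropy of the posterior, averaged over the two possible results. A sketch under the standard update rule (an assumption, not the engine's actual code):

```python
import math

def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_info_gain(priors, likelihoods):
    """Expected entropy reduction (bits) from observing one test result."""
    p_pass = sum(priors[h] * likelihoods.get(h, 0.5) for h in priors)
    gain = entropy(priors)
    for passed, p_result in ((True, p_pass), (False, 1 - p_pass)):
        if p_result == 0:
            continue
        post = {h: priors[h] * (likelihoods.get(h, 0.5) if passed
                                else 1 - likelihoods.get(h, 0.5)) / p_result
                for h in priors}
        gain -= p_result * entropy(post)
    return gain
```

A test with identical likelihoods across all hypotheses scores 0 bits; a perfectly discriminating test on a 50/50 pair scores a full bit.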
update — Apply Evidence
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py update TestName result
Result values:
true, pass, yes, 1
false, fail, no, 0
The test definition is preserved for reuse.
Example:
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py update CheckErrorLogs pass
status — View Current State
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py status
Shows current probability distribution and defined tests.
undefine — Remove a Test
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py undefine TestName
split — Refine a Named Hypothesis Into Sub-Categories
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py split SourceHypothesis NewH1:ratio NewH2:ratio ...
Replaces a named, concrete hypothesis with more specific sub-hypotheses, redistributing its probability mass.
Do NOT split "Other." "Other" is not a hypothesis—it's an admission of model incompleteness. When you identify a new candidate cause, use inject instead (see below). This ensures the new hypothesis gets probability based on evidence strength, not artificially capped by Other's current value.
Example:
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py split Network NetworkTimeout:0.6 NetworkDNS:0.4
Replaces "Network" entirely with two specific sub-hypotheses.
Note: Existing tests will use likelihood=0.5 (neutral) for new hypotheses. Consider redefining tests with specific likelihoods.
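The redistribution is proportional: the source's mass is divided among the sub-hypotheses by ratio, and every other hypothesis is untouched. A sketch (assumed behavior, not the engine's actual code):

```python
def split(priors, source, ratios):
    """Replace `source` with sub-hypotheses sharing its mass by ratio."""
    mass = priors.pop(source)
    total = sum(ratios.values())
    for name, r in ratios.items():
        priors[name] = mass * r / total
    return priors

dist = {"Network": 0.6, "Database": 0.25, "Other": 0.15}
split(dist, "Network", {"NetworkTimeout": 0.6, "NetworkDNS": 0.4})
# NetworkTimeout ≈ 0.36, NetworkDNS ≈ 0.24; Database and Other untouched.
```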
inject — Add New Hypothesis Based on Evidence
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py inject Hypothesis:probability [--likelihoods Test1:L1 Test2:L2 ...]
Adds a new hypothesis at the specified probability, shrinking all existing hypotheses proportionally to make room. Use this when evidence reveals a candidate cause missing from your current hypothesis space, including one you suspect has been hiding in "Other".
The probability you assign should reflect the evidence strength given all context, not be constrained by any existing hypothesis's current value.
If tests are already defined, you MUST provide --likelihoods for the new hypothesis on each test. This ensures future recommend calculations remain valid. Ask yourself: "If this new hypothesis were true, what's the probability each test would pass?"
Examples:
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py inject DNS:0.25 --likelihoods CheckLogs:0.7 PingTest:0.3
Adds DNS at 25% with explicit likelihoods for existing tests.
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py inject BuildCache:0.90 --likelihoods CheckLogs:0.1 PingTest:0.5
Smoking gun: adds BuildCache at 90%, crushes everything else to 10% total (preserving relative ordering).
Note: The injected probability must be less than 1. If no tests are defined, --likelihoods is not required.
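The proportional shrink works by scaling every existing hypothesis by (1 - p_new) so the newcomer fits exactly and relative ordering is preserved. A sketch (assumed behavior, not the engine's actual code):

```python
def inject(priors, name, p_new):
    """Add a hypothesis at p_new, scaling existing mass to 1 - p_new."""
    if not 0 < p_new < 1:
        raise ValueError("injected probability must be in (0, 1)")
    scale = 1 - p_new
    out = {h: p * scale for h, p in priors.items()}
    out[name] = p_new
    return out

dist = inject({"Database": 0.5, "Network": 0.3, "Other": 0.2}, "DNS", 0.25)
# Database 0.375, Network 0.225, Other 0.15, DNS 0.25; order preserved.
```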
reset — Clear All State
python3 ~/.claude/skills/bayes-reasoner/scripts/bayes_engine.py reset
Execute these steps autonomously. Do not show CLI commands to the user. Do not ask the user to run these commands.
When a complex problem with multiple potential causes is presented:
1. init with your estimates
2. define for each test
3. recommend to identify the highest information-gain test
4. update with the test result
5. status to see the new distribution

Invisible Tooling: The user sees insights, not CLI commands. Hide the mechanics; show the reasoning.
Trust the Math: If your intuition contradicts the calculated posteriors, either trust the math or explicitly revise your likelihood definitions. Do not silently override.
Handle Impossibilities: If update returns "Evidence impossible under all hypotheses," your model is wrong. Tell the user you need to "re-evaluate the problem space" and reinitialize with revised hypotheses.
Communicate Uncertainty: Report probabilities, not certainties. "The evidence suggests X (85% likely)" is better than "It's definitely X."
Watch "Other": If "Other" starts climbing, your named hypotheses don't fit the data well. Brainstorm what concrete causes might be hiding in "Other", then use inject to add them at evidence-appropriate probabilities.
Never Split "Other": "Other" is not a hypothesis—it's model incompleteness. When you identify a new candidate cause, inject it at a probability reflecting the evidence. Don't artificially cap it at whatever "Other" happened to be. Use split only for refining named hypotheses into sub-categories (e.g., "Network" → "NetworkTimeout" + "NetworkDNS").