Apply this skill when the task requires updating beliefs across multiple rounds of evidence rather than treating each turn independently. Use it for preference learning, recommendation refinement, repeated comparisons, adaptive QA exploration, or any workflow where choices, clicks, rejections, or observations reveal a hidden objective. Trigger on requests involving repeated selection, narrowing, prioritization, "learn what matters," "adapt the next step," or exploratory testing that should focus based on earlier findings.
Use a disciplined evidence-update loop instead of resetting on every turn.
This skill is inspired by Bayesian Teaching: learn from uncertain evidence, update a working belief state, and choose the next probe or recommendation to reduce uncertainty or exploit what is already known.
This is a prompting/workflow skill, not literal Bayesian inference or model fine-tuning. Treat it as a structured approximation of belief updating.
Use when the hidden variable is the user's priorities, tastes, constraints, or tradeoffs.
Examples: preference learning from accept/reject choices, recommendation refinement, repeated comparisons where each pick narrows what the user values.
Use when the hidden variable is where the failure or risk actually lives.
Examples: exploratory QA that concentrates probes where earlier rounds surfaced anomalies, or risk triage across candidate failure classes.
In QA mode, the "belief state" is not about taste. It is a ranked set of hypotheses about where the failure or risk lives (for example, stale state, a race condition, or missing input validation), ordered by how well the evidence so far fits each one.
1. Define the hidden variable.
2. Initialize a weak prior.
3. Choose the next probe for information gain.
4. Update the belief state after each real observation.
5. Decide probe vs. exploit.
6. Stop at convergence.
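The steps above can be sketched as a small discrete belief update. This is a minimal, illustrative approximation: the hypothesis space, likelihood values, and convergence threshold are all hand-picked assumptions, not part of any real API.

```python
# Sketch of the evidence-update loop over a small discrete hypothesis space.
# Hypotheses and likelihoods here are illustrative assumptions.

def normalize(weights):
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

def update(weights, likelihoods):
    """Multiply each hypothesis weight by the likelihood of the new
    evidence under that hypothesis, then renormalize."""
    return normalize({h: w * likelihoods[h] for h, w in weights.items()})

def converged(weights, threshold=0.9):
    """Exploit once a single hypothesis dominates the belief state."""
    return max(weights.values()) >= threshold

# 1-2. Define the hidden variable (which priority drives the user's
# choices) and start from a weak, uniform prior.
beliefs = normalize({"price": 1.0, "quality": 1.0, "speed": 1.0})

# 3-4. Probe, observe, update. The user rejected a cheap-but-slow option:
# likely if they value speed, unlikely if they value price.
beliefs = update(beliefs, {"price": 0.1, "quality": 0.5, "speed": 0.9})

# 5. Decide probe vs. exploit based on how concentrated the belief is.
mode = "exploit" if converged(beliefs) else "probe"
```

One design choice worth noting: the multiplicative update keeps every hypothesis alive at reduced weight, so a single noisy observation cannot permanently rule anything out.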
After each round, summarize the belief state in four parts:

- Evidence: what was actually observed this round
- Inference: what that evidence suggests
- Uncertainty: what is still unresolved
- Next move: probe or exploit, with a one-sentence rationale

For example: "You seem to prioritize X over Y, with Z still unclear."

For QA, prefer a compact triage structure:

- Top hypothesis
- Why it moved up
- What would falsify it
- Next probe

When capturing QA evidence, distinguish direct observations (logs, screenshots, reproduced behavior) from the inferences drawn from them.
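The four-part round summary can be captured as a simple record. The field names and example values below are assumptions chosen to mirror the structure, not a fixed schema.

```python
# Illustrative round-summary record; field names are assumptions.
from dataclasses import dataclass

@dataclass
class RoundSummary:
    evidence: str      # what was actually observed this round
    inference: str     # what that evidence suggests
    uncertainty: str   # what is still unresolved
    next_move: str     # "probe" or "exploit", with a one-sentence rationale

summary = RoundSummary(
    evidence="User rejected both premium options",
    inference="You seem to prioritize price over quality",
    uncertainty="Tolerance for slower delivery is still unclear",
    next_move="probe: offer a cheap-but-slow option to separate price from speed",
)
```

Keeping the summary this small makes it cheap to emit every round, which is what prevents the loop from silently resetting between turns.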
In autonomous mode, only update from real observations:

- Do not "play both sides" of the loop.
- Do not invent feedback to make the loop look complete.
- If no feedback channel exists, run a bounded probe and report the result instead of simulating the next round.
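The rules above amount to a guard on the update step: never run it without a real observation. A minimal sketch, with illustrative names, might look like this:

```python
# Guard for autonomous mode: refuse to update the belief state unless a
# real observation is supplied. All names here are illustrative.

def update_from_observation(beliefs, observation):
    """Update only from a real observation; never simulate the other
    side of the loop."""
    if observation is None:
        # No feedback channel: report the current belief state as-is
        # instead of inventing a round of feedback.
        return beliefs, "no-observation: reporting current beliefs as-is"
    updated = dict(beliefs)
    # Simple count-style update for the hypothesis the observation supports.
    updated[observation] = updated.get(observation, 0.0) + 1.0
    return updated, "updated"

beliefs = {"price": 1.0, "speed": 1.0}
beliefs, status = update_from_observation(beliefs, None)  # no feedback channel
```

The point of the guard is that the "no observation" branch returns the state unchanged, so a missing feedback channel can never masquerade as a completed round.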
- Default to read-only exploration unless the user explicitly approves state-changing actions.
- Prefer probes that separate likely failure classes: probes whose outcome is expected to differ depending on which hypothesis is true.
- After each probe, update the hypothesis ranking before choosing the next route or interaction.
- Capture artifacts (logs, screenshots, reproduction steps) when a probe materially changes the hypothesis ranking.
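The rerank-after-each-probe step can be sketched as a reweighting of failure-class hypotheses. The hypotheses, fit scores, and probe described below are illustrative assumptions, not output from a real tool.

```python
# Sketch of QA triage reranking; hypotheses and fit scores are illustrative.

def rerank(hypotheses, outcome_fit):
    """Reweight failure-class hypotheses by how well the probe outcome
    fits each one, then sort so the top hypothesis drives the next probe."""
    scored = {h: w * outcome_fit.get(h, 0.5) for h, w in hypotheses.items()}
    total = sum(scored.values())
    scored = {h: s / total for h, s in scored.items()}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

hypotheses = {
    "stale cache": 0.40,
    "race condition": 0.35,
    "bad input validation": 0.25,
}
# Probe: repeat the request with the cache cleared. The failure persists,
# which argues against the cache hypothesis and for the race condition.
ranking = rerank(hypotheses, {
    "stale cache": 0.1,
    "race condition": 0.8,
    "bad input validation": 0.6,
})
```

Note that this probe is a good separator in the sense of the list above: its outcome shifts weight sharply between the cache and race-condition hypotheses rather than moving everything uniformly.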
If a sequential-thinking or scratchpad MCP is available, use it for the hidden reasoning loop: record the current hypothesis ranking, the evidence considered this round, and the rationale for the next probe.
Do not confuse the scratchpad with the evidence source. The scratchpad helps structure the update; it does not create new evidence.