Define what you're trying to measure by clarifying the decision it supports, what you'd observe if the quantity were high or low, and the threshold that would change your decision. Use when someone says "we should measure X" or "how do we know if Y is working" or when starting a new measurement project. Creates a project in .measure/projects/.
Measurement only matters in the context of a decision. This skill transforms a vague concern ("we need to measure employee engagement") into a precisely defined measurand tied to a specific decision, with observable consequences, a unit of measure, a decision threshold, and an estimate of the cost of being wrong.
The entry point is always "What decision are you trying to make?" — never "What do you want to measure?"
Ensure the .measure/ directory and project infrastructure exist:
If .measure/ does not exist, create it.
If .measure/projects/ does not exist, create it.
If .measure/projects/index.json does not exist, create it with:
{"projects": []}
If .measure/config.json does not exist, create it with:
{
"default_decomposition_depth": 3,
"default_simulation_iterations": 10000,
"default_confidence_level": 0.90,
"created_at": "<current UTC timestamp>"
}
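The bootstrap steps above can be sketched as a small Python helper. This is a minimal illustration, not the skill's actual implementation; the paths and default values are taken from this section, and the exact timestamp format is an assumption (ISO-8601 UTC is used here):

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def ensure_measure_infrastructure(root: Path = Path(".")) -> None:
    """Create the .measure/ scaffolding if any piece is missing."""
    projects = root / ".measure" / "projects"
    projects.mkdir(parents=True, exist_ok=True)  # also creates .measure/ itself

    index = projects / "index.json"
    if not index.exists():
        index.write_text(json.dumps({"projects": []}))

    config = root / ".measure" / "config.json"
    if not config.exists():
        config.write_text(json.dumps({
            "default_decomposition_depth": 3,
            "default_simulation_iterations": 10000,
            "default_confidence_level": 0.90,
            # Timestamp format is an assumption; the section only says
            # "<current UTC timestamp>".
            "created_at": datetime.now(timezone.utc).isoformat(),
        }, indent=2))
```

Because every check uses `exist_ok` or an explicit `exists()` test, the helper is safe to run on every skill invocation.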
Read .measure/projects/index.json. If projects exist, present a list:
Existing measurement projects:
- [name] — [decision] (status: [status])
Would you like to resume an existing project or start a new one?
If no projects exist, proceed directly to Step 3.
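The resume prompt above can be rendered from index.json along these lines. This sketch assumes each index entry carries `name`, `decision`, and `status` keys, which this section implies but does not fully specify:

```python
import json
from pathlib import Path


def format_project_list(index_path: Path) -> str:
    """Render the resume prompt from index.json, or '' if no projects exist.

    Assumes entries shaped like {"name": ..., "decision": ..., "status": ...}.
    """
    entries = json.loads(index_path.read_text()).get("projects", [])
    if not entries:
        return ""  # caller proceeds directly to Step 3
    lines = ["Existing measurement projects:"]
    for p in entries:
        lines.append(f"- {p['name']} — {p['decision']} (status: {p['status']})")
    lines.append("Would you like to resume an existing project or start a new one?")
    return "\n".join(lines)
```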
IMPORTANT: Ask exactly ONE question, then STOP and wait for the user's response before continuing. Do not batch questions.
Ask:
What decision are you trying to make?
(Not "what do you want to measure" — measurement only matters if it changes a decision.)
Wait for the user's response.
If the user leads with a measurand ("we need to measure employee engagement"), redirect:
That's a measurand, not a decision. What decision would measuring employee engagement inform? For example: "Should we invest in a new employee experience program?"
Wait for the user's response before continuing.
Walk through Hubbard's clarification sequence. Ask one question at a time. Wait for each answer before asking the next.
Question 1 — Context:
Tell me more about this decision. What's the context? Who are the stakeholders? What's the timeline?
Question 2 — Observable consequences:
If [the thing you're uncertain about] were significantly higher or lower than you expect, what would you see? What would change? What observable consequences would there be?
Question 3 — Detection:
How would you know if you saw it? What would you count, observe, or detect? What units could you express this in, even roughly?
Question 4 — Decision threshold:
At what level would your decision change? What's the threshold — if the number is above/below X, you'd decide differently?
If the user tries to skip this, explain why it matters:
The decision threshold is critical. Without it, we can't calculate whether further measurement is worth the cost (value of information). Even a rough threshold helps — you can refine it later.
Question 5 — Cost of being wrong:
What's the cost of making the wrong decision? Consider both directions:
- If you act when you shouldn't have (false positive), what do you lose?
- If you don't act when you should have (false negative), what do you lose?
Rough estimates are fine — order of magnitude is enough.
Question 6 — Current assumption:
What does the conventional wisdom or current assumption say? What do most people involved believe the answer is?
For deeper methodology on any of these questions, read
references/clarification-chain.md.
Before creating the project, present the structured definition back to the user for confirmation:
Here's what we've defined:
Decision: [decision]
Context: [context]
Stakeholders: [stakeholders]

What we're measuring: [measurand label]
In plain language: [plain language description]
Observable consequences: [list]
Unit of measure: [unit]
Current assumption: [assumption]

Decision threshold: [description] (value: [value] [unit])
Cost of being wrong:
- Acting when we shouldn't: $[upside_value]
- Not acting when we should: $[downside_value]
Does this look right? Would you like to change anything?
Wait for confirmation or adjustments before proceeding.
Construct a JSON payload with the confirmed data:
{
"decision": "...",
"decision_context": "...",
"stakeholders": ["..."],
"measurand_label": "...",
"measurand_plain_language": "...",
"observable_consequences": ["..."],
"unit_of_measure": "...",
"current_assumption": "...",
"threshold_description": "...",
"threshold_value": 0,
"threshold_unit": "...",
"cost_description": "...",
"upside_value": 0,
"downside_value": 0
}
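Before piping the payload to the script, it can help to check that every field is present and typed sensibly. A minimal validation sketch, using exactly the field names from the payload above (the type expectations are an inference from the example values):

```python
import json

# Field names come from this skill's payload schema; expected types are inferred.
REQUIRED_FIELDS = {
    "decision": str, "decision_context": str, "stakeholders": list,
    "measurand_label": str, "measurand_plain_language": str,
    "observable_consequences": list, "unit_of_measure": str,
    "current_assumption": str, "threshold_description": str,
    "threshold_value": (int, float), "threshold_unit": str,
    "cost_description": str, "upside_value": (int, float),
    "downside_value": (int, float),
}


def validate_payload(payload: dict) -> str:
    """Check presence and types, then return the JSON string to pipe in."""
    for field, expected in REQUIRED_FIELDS.items():
        if field not in payload:
            raise ValueError(f"missing field: {field}")
        if not isinstance(payload[field], expected):
            raise TypeError(f"{field} has unexpected type {type(payload[field]).__name__}")
    return json.dumps(payload)
```

The returned string is what gets echoed into `create_project.py` in the next step.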
Run the project creation script:
echo '<json_payload>' | uv run python .claude/skills/measurement-definition/scripts/create_project.py
Parse the JSON output.
If already_exists is true:
A project with a similar name already exists at [definition_path]. Would you like to:
- Resume the existing project
- Create a new project with a different name
If resuming, read and display the existing definition. If creating new, ask the user for a distinguishing name and retry.
After successful creation:
Project created: [name]
Slug: [slug]
Files:
- Definition: [definition_path]
- Status: [status_path]
Your project is now in the defined state.
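The two branches above (duplicate vs. successful creation) can be sketched as a handler for the script's JSON output. The output schema here is inferred from this section (`already_exists`, `name`, `slug`, `definition_path`, `status_path`); the real script may emit more fields:

```python
import json


def summarize_creation(stdout: str) -> str:
    """Branch on create_project.py's JSON output (schema inferred from this doc)."""
    result = json.loads(stdout)
    if result.get("already_exists"):
        return (
            f"A project with a similar name already exists at "
            f"{result['definition_path']}. Resume it, or create a new "
            f"project with a different name?"
        )
    return (
        f"Project created: {result['name']}\n"
        f"Slug: {result['slug']}\n"
        f"Files:\n"
        f"- Definition: {result['definition_path']}\n"
        f"- Status: {result['status_path']}"
    )
```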
Then suggest next steps:
What to do next:
- To break this down into measurable components, use the decomposition skill.
- To put uncertainty ranges on the components directly, use the prior-estimation skill.
- To run the full guided workflow, use the measure skill.
If the user says something like "we measure NPS" or "we track sprint velocity," probe deeper:
NPS is a metric you already collect, but what does it tell you that matters for this decision? What would it mean if NPS dropped 10 points? What would you do differently?
The goal is to get behind the metric to the underlying concern it proxies for.
If the user genuinely cannot provide a threshold, accept it but note the gap in the project definition, since without a threshold the value-of-information calculation cannot be run; revisit the threshold once the user has more context.