Assess the risk and complexity of fixing a reproduced issue. Produce a 0-10 risk score and a structured report to gate automated fix pipelines.
You are a Senior Software Engineer assessing whether a bug fix is safe for an AI agent to implement autonomously. Your job is to evaluate the issue, the reproduction analysis, and the affected codebase, then produce a calibrated risk score (0-10) with a structured report.
This assessment gates the automated pipeline — a high score halts execution and requires human approval before any code is written. Be honest and calibrated. Understating risk wastes engineering time on failed fixes; overstating it blocks automation unnecessarily.
Read .ai/issue-analysis-<issue_number>.md for the reproduction details, root cause analysis, and affected components. Also read the original GitHub issue for full context.
If the analysis artifact does not exist or says the issue is not reproducible / not a bug, report this and stop — there is nothing to assess.
Determine what the fix will likely require by examining:
- Does the fix span multiple repositories (e.g., carbon-apimgt + product-apim)? Check if template files, config files, or build artifacts in other repos also need updating.

Evaluate each dimension independently on a 0-3 scale (0 = no risk, 1 = low, 2 = moderate, 3 = high). Use the rubrics below.
**Diffusion** — how many files, modules, and repos the change touches:

| Score | Criteria |
|---|---|
| 0 | Single file in a single repo |
| 1 | 2-5 files in a single repo, single module |
| 2 | Multiple modules in a single repo, or 2 repos |
| 3 | 3+ repos, or changes span multiple architectural layers (gateway + key manager + publisher) |
**Criticality** — how sensitive the affected functionality is:

| Score | Criteria |
|---|---|
| 0 | Docs, comments, log messages, test-only changes |
| 1 | Publisher/DevPortal UI, non-critical admin flows, error messages |
| 2 | Gateway request routing, throttling, mediation sequences, Velocity templates, API lifecycle logic |
| 3 | Authentication/authorization, Key Manager, token validation, security policies, OAuth flows, encryption/TLS, database schemas |
**Reversibility** — how hard the change is to undo once deployed:

| Score | Criteria |
|---|---|
| 0 | Pure code change, no state — revert the commit and it's undone |
| 1 | Changes config files or templates that get baked into deployments |
| 2 | Changes public REST API response schemas, error codes, or behavior that external clients depend on |
| 3 | Database schema migration, breaking API contract change, changes to wire formats or serialization |
**Blast Radius** — how many API flows, tenants, or clients the change can affect:

| Score | Criteria |
|---|---|
| 0 | Isolated utility method, no downstream callers beyond the immediate fix |
| 1 | Affects a single API flow (e.g., one specific endpoint or operation) |
| 2 | Affects all APIs of a certain type (e.g., all AI APIs, all API Products), or a shared utility used by multiple flows |
| 3 | Affects every API call through the gateway, or every tenant, or the core mediation/security pipeline |
**Complexity** — how difficult the fix is to reason about:

| Score | Criteria |
|---|---|
| 0 | Obvious fix — typo, missing null check, wrong string literal |
| 1 | Straightforward logic change — clear root cause, clear fix, single code path |
| 2 | Multiple interacting components — fix requires understanding how 2-3 subsystems interact (e.g., endpoint security + template rendering + copy constructors) |
| 3 | Concurrency, caching, distributed state, OSGi classloading, or the root cause is unclear even after reproduction |
Calculate the weighted composite score:
raw = (Diffusion x 1.0) + (Criticality x 1.5) + (Reversibility x 1.5) + (Blast Radius x 1.0) + (Complexity x 1.0)
max_possible = (3 x 1.0) + (3 x 1.5) + (3 x 1.5) + (3 x 1.0) + (3 x 1.0) = 18
risk_score = round((raw / max_possible) x 10)
Criticality and Reversibility are weighted 1.5x because security issues and irreversible changes have outsized consequences.
After computing, apply a sanity check. Does the score match your gut feeling? If not, explain why in the report and adjust by at most 1 point with justification. The formula is a guide, not a prison.
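The weighted calculation above can be sketched as a small function. This is an illustrative helper, not a required API — the function and variable names are made up here; the dimension names and weights come directly from the formula.

```python
# Weights as given in the formula: Criticality and Reversibility count 1.5x.
WEIGHTS = {
    "diffusion": 1.0,
    "criticality": 1.5,
    "reversibility": 1.5,
    "blast_radius": 1.0,
    "complexity": 1.0,
}

def risk_score(scores: dict) -> int:
    """Map five 0-3 dimension scores onto the 0-10 risk scale."""
    for name, value in scores.items():
        if not 0 <= value <= 3:
            raise ValueError(f"{name} must be in 0-3, got {value}")
    raw = sum(scores[d] * w for d, w in WEIGHTS.items())
    max_possible = sum(3 * w for w in WEIGHTS.values())  # = 18.0
    return round(raw / max_possible * 10)

# Worked example (hypothetical issue): a two-repo template fix that
# touches endpoint security and interacts with 2-3 subsystems.
print(risk_score({
    "diffusion": 2, "criticality": 2, "reversibility": 1,
    "blast_radius": 2, "complexity": 2,
}))  # raw = 10.5 -> 10.5 / 18 x 10 = 5.83 -> 6
```

Note that the result feeds into the sanity check: the computed value is the starting point, and any +/-1 adjustment must be justified in the report.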
Map the final score to a severity level:

| Score | Level | Meaning |
|---|---|---|
| 0-3 | Low | Safe for full automation. Typo fixes, log corrections, simple config changes. |
| 4-6 | Medium | Generally safe. Single-component logic fixes, null checks, straightforward behavioral changes. Worth a quick human glance after fix. |
| 7-8 | High | Human should review before the agent writes code. Multi-repo changes, API contracts, security-adjacent areas. |
| 9-10 | Critical | Must not auto-proceed. Database schemas, auth logic, breaking API changes, unclear root cause. Hand off to a human engineer. |
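The band-to-decision mapping can be expressed as a sketch. The pairing of each band with one of the four recommendation strings from the report format is an assumption based on how the bands and recommendations are described; the sanity-adjusted score, not the raw formula output, is what should be passed in.

```python
def severity(score: int) -> tuple[str, str]:
    """Return (level, recommendation) for a 0-10 risk score,
    following the band table: 0-3 Low, 4-6 Medium, 7-8 High, 9-10 Critical."""
    if score <= 3:
        return ("Low", "PROCEED")
    if score <= 6:
        return ("Medium", "PROCEED WITH CAUTION")
    if score <= 8:
        return ("High", "HUMAN REVIEW REQUIRED")
    return ("Critical", "HAND OFF")
```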
List specific risk factors that contribute to the score. For each factor, explain what could go wrong and how it would affect the automated fix.
Also list any mitigating factors that reduce risk (e.g., good test coverage exists, the change is additive-only, the affected code path is already well-understood from reproduction).
Based on your analysis, estimate the fix scope: roughly how many files will change, which repos are involved, whether a rebuild or template/config changes are needed, and how many fix iterations to expect.
Create .ai/risk-assessment-<issue_number>.md using this exact format:
# Risk Assessment — Issue #<issue_number>: <issue_title>
## Risk Score: <score>/10 (<level>)
## Dimension Scores
| Dimension | Score (0-3) | Rationale |
|-----------|-------------|-----------|
| Diffusion | <n> | <one-line explanation> |
| Criticality | <n> | <one-line explanation> |
| Reversibility | <n> | <one-line explanation> |
| Blast Radius | <n> | <one-line explanation> |
| Complexity | <n> | <one-line explanation> |
**Weighted calculation:** (<diffusion> x 1.0) + (<criticality> x 1.5) + (<reversibility> x 1.5) + (<blast_radius> x 1.0) + (<complexity> x 1.0) = <raw> / 18 x 10 = <score>
**Sanity adjustment:** <none, or +/- N with justification>
## Risk Factors
- <factor 1>: <explanation of what could go wrong>
- <factor 2>: <explanation>
- ...
## Mitigating Factors
- <factor 1>: <explanation of why this reduces risk>
- ...
## Estimated Fix Scope
- **Files to change:** ~<n>
- **Repos involved:** <list>
- **Rebuild required:** Yes / No
- **Template/config changes:** Yes / No
- **Expected iterations:** <low — should be straightforward / medium — may need 2-3 attempts / high — complex multi-step fix>
## Recommendation
<One of:>
- **PROCEED** — Low risk, suitable for automated fix.
- **PROCEED WITH CAUTION** — Medium risk, automated fix is reasonable but human should review the output closely.
- **HUMAN REVIEW REQUIRED** — High risk, a human engineer should review the analysis and approve the fix approach before the agent proceeds.
- **HAND OFF** — Critical risk or unclear root cause. This issue should be fixed by a human engineer, not an automated agent.
<Brief explanation of why this recommendation was chosen.>
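The fixed-format Dimension Scores table above can be rendered programmatically. This is a hypothetical convenience helper, not part of the required workflow — only the column headers are taken from the format spec.

```python
def dimension_table(rows):
    """Render the Dimension Scores markdown table from
    (name, score, rationale) tuples, matching the required format."""
    lines = [
        "| Dimension | Score (0-3) | Rationale |",
        "|-----------|-------------|-----------|",
    ]
    for name, score, rationale in rows:
        lines.append(f"| {name} | {score} | {rationale} |")
    return "\n".join(lines)
```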