Diagnose and fix Work Order state inconsistencies
Diagnose and repair WO state inconsistencies: stale locks, orphaned worktrees, state drift.
ctx_wo_take.py execution
Manual operations on _ctx/ are FORBIDDEN:
# ❌ FORBIDDEN
rm _ctx/jobs/running/WO-XXXX.lock
# ❌ FORBIDDEN
mv _ctx/jobs/running/WO-XXXX.yaml _ctx/jobs/pending/
# ❌ FORBIDDEN
rm -rf _ctx/jobs/running/*
# ❌ FORBIDDEN
echo "..." > _ctx/jobs/running/WO-XXXX.lock
Use instead:
- ctx_reconcile_state.py
- ctx_wo_finish.py
- git worktree commands

| Command | Purpose |
|---|---|
| ctx_wo_take.py --status | System health overview |
| git worktree list | Worktree inventory |
| ctx_backlog_validate.py --strict | Validate all WOs |
| ctx_reconcile_state.py --dry-run | Preview repairs |
Symptoms:
Error: Work order WO-XXXX is locked
Diagnosis:
# Check lock file
ls -la _ctx/jobs/running/WO-XXXX.lock
# Check lock age (DO NOT use hardcoded "1 hour")
# TTL is configured via WO_LOCK_TTL_SEC (default: 86400 = 24h)
Solution:
# 1. Preview repair
uv run python scripts/ctx_reconcile_state.py --dry-run
# 2. If safe, apply
uv run python scripts/ctx_reconcile_state.py --apply
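To judge lock age against the TTL rather than a fixed "1 hour", a minimal sketch follows. It assumes WO_LOCK_TTL_SEC is read from the environment and that the lock file's modification time marks when the lock was taken; both are assumptions, and the real logic in scripts/helpers.py may differ. The function names are hypothetical, and the sketch only reads state.

```python
import os
import time
from pathlib import Path

DEFAULT_TTL_SEC = 86400  # 24h, the default stated in this document


def ttl_sec() -> int:
    """Read the TTL (assumption: WO_LOCK_TTL_SEC is an environment variable)."""
    return int(os.environ.get("WO_LOCK_TTL_SEC", DEFAULT_TTL_SEC))


def is_stale_at(mtime: float, now: float, ttl: int) -> bool:
    """Pure check: the lock is stale once its age exceeds the TTL."""
    return (now - mtime) > ttl


def lock_is_stale(lock: Path) -> bool:
    """Assumption: the lock file's mtime approximates the time it was taken."""
    return is_stale_at(lock.stat().st_mtime, time.time(), ttl_sec())
```

A stale result is only a signal to run ctx_reconcile_state.py --dry-run; it never justifies removing the lock by hand.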
Symptoms:
Worktree exists in .worktrees/ but the WO is not in running/
Diagnosis:
git worktree list
# Shows worktree but WO not in running/
Solution:
# 1. Check if worktree has uncommitted work
cd .worktrees/WO-XXXX
git status
# 2. If clean, remove worktree
git worktree remove .worktrees/WO-XXXX
# 3. If it has uncommitted work, commit or stash it first
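The comparison between the worktree inventory and running/ can be sketched as below. It parses the output of git worktree list --porcelain and reports worktrees whose WO is not running. The helper names and the sample data are made up for illustration, and the sketch assumes each worktree directory is named after its WO id, as in .worktrees/WO-XXXX.

```python
from pathlib import Path


def parse_worktrees(porcelain: str) -> list[dict[str, str]]:
    """Parse `git worktree list --porcelain` output into one dict per worktree."""
    records, current = [], {}
    for line in porcelain.splitlines():
        if not line.strip():
            # A blank line ends the current worktree record.
            if current:
                records.append(current)
                current = {}
            continue
        key, _, value = line.partition(" ")
        current[key] = value
    if current:
        records.append(current)
    return records


def orphaned(records, running_ids, root=".worktrees"):
    """Return WO ids that have a worktree under `root` but are not running."""
    found = []
    for rec in records:
        path = Path(rec.get("worktree", ""))
        if path.parent.name == root and path.name not in running_ids:
            found.append(path.name)
    return found


# Illustrative sample, not real output from this repository.
sample = """worktree /repo
HEAD 1111111111111111111111111111111111111111
branch refs/heads/main

worktree /repo/.worktrees/WO-0001
HEAD 2222222222222222222222222222222222222222
branch refs/heads/wo-0001
"""
print(orphaned(parse_worktrees(sample), running_ids={"WO-0002"}))
```

The sketch only reports candidates. Removal still goes through git status and git worktree remove as described above.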
Symptoms:
- WO in running but no worktree
- WO in pending but has lock
Solution:
# ALWAYS use reconcile script, never manual moves
uv run python scripts/ctx_reconcile_state.py --dry-run
# Review output carefully
uv run python scripts/ctx_reconcile_state.py --apply
Symptoms:
Error: YAML parse error / Schema validation failure
Solution:
# Check specific WO
uv run python scripts/ctx_wo_lint.py --wo-id WO-XXXX --json
# Fix YAML issues
# Common: missing required fields, wrong types
# Format to canonical
uv run python scripts/ctx_wo_fmt.py --write
Symptoms:
Error: Unknown epic_id E-XXXX
Solution:
# Check if epic exists
grep "id: E-XXXX" _ctx/backlog/backlog.yaml
# If missing, add epic to backlog.yaml
# Or update WO to use existing epic
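The grep above can also be done across all epics at once. The following is a rough sketch that assumes epics appear in backlog.yaml as lines of the form id: E-XXXX. The function names and the sample text are hypothetical, and a real check should go through the project's own validator.

```python
import re

# Matches lines such as "  - id: E-0001" or "id: 'E-0002'".
EPIC_ID = re.compile(r"^\s*-?\s*id:\s*[\"']?(E-[\w-]+)", re.MULTILINE)


def epic_ids(backlog_text: str) -> set[str]:
    """Collect every epic id found in the backlog text."""
    return set(EPIC_ID.findall(backlog_text))


def is_known_epic(epic_id: str, backlog_text: str) -> bool:
    """True if the WO's epic_id is declared in the backlog."""
    return epic_id in epic_ids(backlog_text)


# Illustrative backlog fragment, not the real file.
sample = """\
epics:
  - id: E-0001
    title: "Example epic"
  - id: "E-0002"
"""
print(is_known_epic("E-0001", sample), is_known_epic("E-0009", sample))
```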
Symptoms:
apply refused: WO_INVALID_SCHEMA
Root Cause: WOs or DoDs have invalid schemas that prevent reconcile from running.
Diagnosis:
# Identify schema issues
uv run python scripts/ctx_backlog_validate.py 2>&1 | head -30
Common Schema Issues:
DoD with an items field (it must use the required fields instead):
# ❌ Invalid
dod:
  - id: XXX
    items: [...]
# ✅ Valid
dod:
  - id: XXX
    title: "..."
    required_artifacts: [...]
    required_checks:
      - name: "check"
        commands: [...]
    rules: [...]
WO with incomplete required_flow:
# ❌ Invalid
required_flow:
  - verify
# ✅ Valid
required_flow:
  - session.append:intent
  - ctx.sync
  - ctx.search
  - ctx.get
  - session.append:result
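A quick way to spot an incomplete required_flow before running the validator is sketched below. The step list is taken from the valid example above. The function name is hypothetical, and the sketch only checks that each step is present, so the real schema may be stricter (for example about order).

```python
REQUIRED_FLOW = [
    "session.append:intent",
    "ctx.sync",
    "ctx.search",
    "ctx.get",
    "session.append:result",
]


def missing_steps(flow: list[str]) -> list[str]:
    """Return the required steps that are absent from a WO's required_flow."""
    return [step for step in REQUIRED_FLOW if step not in flow]


# The invalid example above is missing every required step.
print(missing_steps(["verify"]))
```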
Solution:
# 1. Fix DoD schemas first
# Edit _ctx/dod/*.yaml to have required fields
# 2. Fix WO required_flow
# Use script to batch fix:
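python3 -c "
import re
from pathlib import Path

# Match a lone '- verify' entry followed by 'segment: .', capturing the
# list indentation so the new entries line up with the file's own indent.
old = re.compile(r'(required_flow:\n)([ \t]*)- verify\n(?=[ \t]*segment: \.)')
steps = [
    'session.append:intent',
    'ctx.sync',
    'ctx.search',
    'ctx.get',
    'session.append:result',
]

def expand(m):
    return m.group(1) + ''.join(f'{m.group(2)}- {s}\n' for s in steps)

for f in Path('_ctx/jobs/done').glob('WO-*.yaml'):
    content = f.read_text()
    fixed = old.sub(expand, content)
    if fixed != content:
        f.write_text(fixed)
        print(f'Fixed: {f}')
"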
# 3. Retry reconcile
uv run python scripts/ctx_reconcile_state.py --apply
Note: Modifying WOs in done/ requires bypass (see wo/finish bypass options).
Symptoms:
Files outside scope.allow or matching scope.deny were changed
Solution:
# Check what was modified
git diff --name-only origin/main...HEAD
# Compare with scope
grep -A 10 "scope:" _ctx/jobs/running/WO-XXXX.yaml
# Either:
# 1. Revert out-of-scope changes
git checkout origin/main -- <out-of-scope-file>
# 2. Or update scope (requires review)
# Edit WO YAML, add files to scope.allow
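The comparison of changed files against the scope can be sketched as follows. It assumes scope.allow and scope.deny hold glob patterns and that deny takes precedence over allow. Both are assumptions about the WO schema, and the function name and sample paths are hypothetical.

```python
from fnmatch import fnmatch


def out_of_scope(changed: list[str], allow: list[str], deny: list[str]) -> list[str]:
    """Return changed paths that are denied or not covered by any allow pattern."""
    bad = []
    for path in changed:
        allowed = any(fnmatch(path, pat) for pat in allow)
        denied = any(fnmatch(path, pat) for pat in deny)
        if denied or not allowed:
            bad.append(path)
    return bad


# Paths as printed by `git diff --name-only origin/main...HEAD` (made-up sample).
changed = ["src/a.py", "docs/x.md", "src/secret.py"]
print(out_of_scope(changed, allow=["src/*"], deny=["src/secret.py"]))
```

Each path it reports needs one of the two options above: revert it, or extend scope.allow after review.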
DO NOT use a hardcoded "1 hour" rule.
The TTL is configurable:
- WO_LOCK_TTL_SEC (default: 86400 = 24h)
- scripts/helpers.py → check_lock_age()
# Check lock age via Python
uv run python -c "
from scripts.helpers import check_lock_age
from pathlib import Path
result = check_lock_age(Path('_ctx/jobs/running/WO-XXXX.lock'))
print(f'Lock valid: {result}')
"
--force Usage Conditions
--force flags (e.g., git worktree remove --force) are ONLY allowed if:
- git status inside the worktree is clean OR commit/stash was done
- ctx_reconcile_state.py --dry-run explains the plan
# Example: Force remove worktree
cd .worktrees/WO-XXXX
git status # Must be clean OR committed/stashed
uv run python scripts/ctx_reconcile_state.py --dry-run # Review plan
git worktree remove --force .worktrees/WO-XXXX
trifecta session append --segment . \
--summary "Forced remove worktree WO-XXXX because <reason>"
If ctx_wo_take.py crashed mid-execution:
# 1. Check state
uv run python scripts/ctx_wo_take.py --status
# 2. Run reconcile
uv run python scripts/ctx_reconcile_state.py --dry-run
uv run python scripts/ctx_reconcile_state.py --apply
# 3. Retry take
uv run python scripts/ctx_wo_take.py WO-XXXX
# 1. Prune stale references
git worktree prune
# 2. Check remaining worktrees
git worktree list
# 3. Remove corrupted worktree (only if conditions met!)
# See --force usage conditions above
git worktree remove --force .worktrees/WO-XXXX
# 4. Re-run reconcile
uv run python scripts/ctx_reconcile_state.py --apply
# 1. Check if process is alive
cat _ctx/jobs/running/WO-XXXX.lock
# 2. If PID is dead, safe to reconcile
# But USE RECONCILE, not manual rm
uv run python scripts/ctx_reconcile_state.py --apply
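Checking whether the PID from the lock is still alive can be sketched like this. The lock file format is not specified here, so the sketch takes the PID as an argument and leaves reading it from the file to the caller. The function name is hypothetical, and the probe is POSIX-only.

```python
import os


def pid_alive(pid: int) -> bool:
    """Probe a process with signal 0, which checks existence without signalling it."""
    if os.name != "posix":
        # On Windows os.kill behaves differently and must not be used as a probe.
        raise NotImplementedError("POSIX-only sketch")
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # No such process: the lock holder is gone.
    except PermissionError:
        return True  # The process exists but belongs to another user.
    return True


print(pid_alive(os.getpid()))  # The current process is always alive.
```

A dead PID only means reconcile is safe to run. The lock itself is still released by ctx_reconcile_state.py, not by rm.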
- scripts/ctx_reconcile_state.py - State reconciliation
- scripts/helpers.py - Lock age validation
- scripts/ctx_wo_lint.py - YAML validation
- docs/backlog/TROUBLESHOOTING.md - More troubleshooting
# System status
uv run python scripts/ctx_wo_take.py --status
# Preview repairs
uv run python scripts/ctx_reconcile_state.py --dry-run
# Apply repairs
uv run python scripts/ctx_reconcile_state.py --apply
# Check worktrees
git worktree list
git worktree prune
# Validate all WOs
uv run python scripts/ctx_backlog_validate.py --strict
# NEVER: rm/mv on _ctx/ files
After repair completes:
PROBLEM → FIX APPLIED → RESULT (table)
STATE NOW: X pending, Y running, Z done, W failed
ACTIVE_WO=<current or none>
NEXT: /wo-start to resume work OR /wo-finish to close
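The STATE NOW line of the report can be produced with a short sketch like the one below. It assumes each state is a directory under _ctx/jobs/ holding WO-*.yaml files. The pending, running and done directories appear in this document, while failed is inferred from the report template. The function names are hypothetical.

```python
from pathlib import Path

STATES = ("pending", "running", "done", "failed")


def state_counts(jobs_dir: Path) -> dict[str, int]:
    """Count WO YAML files per state directory (missing directories count as 0)."""
    return {s: len(list((jobs_dir / s).glob("WO-*.yaml"))) for s in STATES}


def state_line(counts: dict[str, int]) -> str:
    """Format the counts in the shape used by the report template."""
    return (
        "STATE NOW: {pending} pending, {running} running, "
        "{done} done, {failed} failed".format(**counts)
    )


print(state_line(state_counts(Path("_ctx/jobs"))))
```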