Plant subtle multi-file bugs in a HOL lab app. Bugs span 2-3 files, are diagnosable from symptoms, have a clear 'aha' moment, and can be caught by a test.
You are planting bugs in a HOL lab app for a debugging exercise. Participants will use Claude Code to find and fix these bugs (Act 2, ~20 min). The bugs must be subtle enough to require real diagnosis but fair enough to have a satisfying "aha" moment.
$ARGUMENTS specifies the number of bugs to plant (default: 2). Typically 1-2 for a half-day lab.
Read ALL source files and map every data flow path through the app:
Router (receives request, validates via schema)
→ Service (business logic, queries)
→ Model (ORM, database operations)
→ Schema (serializes response)
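The layering above can be sketched in miniature. This is a dependency-free illustration with hypothetical names (`DB`, `query_guests`, etc.), not the lab app's actual code; the point is that each planted bug hides in one hop while its symptom surfaces in another.

```python
DB = [{"id": 1, "tier": "gold", "base_price": 100.0}]   # "model" layer: the data

def query_guests(tier: str) -> list[dict]:              # service: business logic / query
    return [row for row in DB if row["tier"] == tier]

def serialize(row: dict) -> dict:                       # schema: shapes the response
    return {"id": row["id"], "tier": row["tier"]}

def get_guests_endpoint(tier: str) -> list[dict]:       # router: request entry point
    return [serialize(row) for row in query_guests(tier)]

assert get_guests_endpoint("gold") == [{"id": 1, "tier": "gold"}]
```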
For each flow, note:
Choose injection points that meet ALL criteria:
A stretch bug must be safe to leave unfixed if a participant runs out of time. It must NOT block Acts 3-4 — meaning the feature build exercise must still be completable with the stretch bug present. In practice this means the stretch bug should affect a different data flow or endpoint than the one used by the recommended feature.
Choose from these patterns (adapt to the specific codebase):
### Pattern: Wrong tier/status comparison

**Where:** The service layer applies the wrong business logic based on a tier, category, or status field.

**Example:** Gold-tier guests get Silver-tier pricing because the service checks `tier == "silver"` instead of `tier == "gold"`, or uses `<=` instead of `<` on a threshold.

**Spans:** `services/*.py` (root cause) + `routers/*.py` (wrong response) + `models.py` (tier definition)

**Symptoms:** Tests comparing expected vs. actual pricing/discounts fail. The API returns wrong values for one specific tier.
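A minimal sketch of this pattern, with hypothetical names and rates (the real app's tiers and discounts will differ). The buggy branch sends gold guests down the silver path, and a tier-specific test exposes it:

```python
GOLD_DISCOUNT, SILVER_DISCOUNT = 0.20, 0.10

def discount_for_buggy(tier: str) -> float:
    if tier == "silver":          # BUG: should check for "gold"
        return GOLD_DISCOUNT
    return SILVER_DISCOUNT

def discount_for_fixed(tier: str) -> float:
    if tier == "gold":
        return GOLD_DISCOUNT
    return SILVER_DISCOUNT

# The failing expectation participants will see:
assert discount_for_fixed("gold") == 0.20
assert discount_for_buggy("gold") == 0.10   # gold guest billed at silver rate
```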
### Pattern: Off-by-one or wrong-field filter

**Where:** A database query uses the wrong comparison operator or misses a condition.

**Example:** A date-range filter uses `<` instead of `<=`, excluding the last day. Or a filter applies to the wrong field.

**Spans:** `services/*.py` (root cause in the query) + `routers/*.py` (missing results in the response)

**Symptoms:** A search/filter endpoint returns fewer results than expected. An edge-case test fails.
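The off-by-one is easy to show in isolation. This sketch uses plain Python dates rather than an ORM query, but the operator mistake is the same one that would appear in a SQLAlchemy filter:

```python
from datetime import date

stays = [date(2024, 6, 1), date(2024, 6, 15), date(2024, 6, 30)]
start, end = date(2024, 6, 1), date(2024, 6, 30)

buggy = [d for d in stays if start <= d < end]    # BUG: < drops the end date
fixed = [d for d in stays if start <= d <= end]

assert len(buggy) == 2   # June 30 silently missing from results
assert len(fixed) == 3
```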
### Pattern: Silent schema default

**Where:** A Pydantic schema has a wrong default value that silently overrides user input or database values.

**Example:** `discount_percent: float = 0.0` when it should be `Optional[float] = None`, causing all records to show a 0% discount unless one is explicitly set.

**Spans:** `schemas.py` (root cause) + `services/*.py` (uses the schema) + `routers/*.py` (serves the wrong data)

**Symptoms:** The API returns unexpected defaults. Tests checking specific field values fail.
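To show the mechanism without pulling in Pydantic, here is a dependency-free simulation of how a schema's field defaults fill in missing values during serialization. The helper and the defaults dicts are hypothetical; in the real app this behavior lives in the Pydantic model definition:

```python
def apply_schema_defaults(record: dict, defaults: dict) -> dict:
    # Mimics schema serialization: any field absent from the record
    # takes the schema's declared default.
    return {field: record.get(field, default) for field, default in defaults.items()}

BUGGY_DEFAULTS = {"guest": None, "discount_percent": 0.0}   # BUG: concrete default
FIXED_DEFAULTS = {"guest": None, "discount_percent": None}  # None preserves "not set"

row = {"guest": "Ada"}  # discount was never set in the database
assert apply_schema_defaults(row, BUGGY_DEFAULTS)["discount_percent"] == 0.0
assert apply_schema_defaults(row, FIXED_DEFAULTS)["discount_percent"] is None
```

The "aha" is that the bug never raises: the schema quietly manufactures a value, so only a test asserting on the specific field catches it.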
### Pattern: Wrong field in a computation

**Where:** A computation in the service layer uses the wrong field, wrong order, or wrong aggregation.

**Example:** Sorting by `created_at` instead of `departure_date`, or summing `base_price` instead of `total_price`.

**Spans:** `services/*.py` (root cause) + `routers/*.py` (wrong order/total in the response)

**Symptoms:** List endpoints return data in an unexpected order. Summary endpoints show wrong totals.
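Both variants of this pattern in one sketch, using hypothetical booking dicts (field names borrowed from the example above, values invented):

```python
bookings = [
    {"created_at": "2024-06-03", "departure_date": "2024-06-10",
     "base_price": 100.0, "total_price": 130.0},
    {"created_at": "2024-06-01", "departure_date": "2024-06-20",
     "base_price": 200.0, "total_price": 260.0},
]

buggy_order = sorted(bookings, key=lambda b: b["created_at"])       # BUG: wrong field
fixed_order = sorted(bookings, key=lambda b: b["departure_date"])

buggy_total = sum(b["base_price"] for b in bookings)    # BUG: ignores fees/taxes
fixed_total = sum(b["total_price"] for b in bookings)

assert buggy_total == 300.0
assert fixed_total == 390.0
```

Note that both lists are "plausibly sorted", which is what makes the ordering bug subtle: only a fixture where the two date fields disagree will flag it.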
### Pattern: Wrong relationship or missing join

**Where:** Code follows the wrong relationship or misses a join condition.

**Example:** Fetching a guest's bookings but filtering by `guest_id` on the wrong table, or a missing eager load causing empty nested data.

**Spans:** `services/*.py` or `models.py` (root cause) + `routers/*.py` (incomplete response)

**Symptoms:** Nested data is missing or shows the wrong related records.
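A sketch of the wrong-table filter with in-memory lists standing in for ORM tables (names hypothetical). The buggy version filters the parent table but never constrains the child rows, so every guest appears to own every booking:

```python
guests = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bo"}]
bookings = [
    {"id": 10, "guest_id": 1, "room": "101"},
    {"id": 11, "guest_id": 2, "room": "102"},
]

def bookings_for_buggy(guest_id: int) -> list[dict]:
    # BUG: filters guests by id, then returns ALL bookings unfiltered
    matched = [g for g in guests if g["id"] == guest_id]
    return bookings if matched else []

def bookings_for_fixed(guest_id: int) -> list[dict]:
    return [b for b in bookings if b["guest_id"] == guest_id]

assert len(bookings_for_buggy(1)) == 2   # Bo's booking leaks into Ada's list
assert len(bookings_for_fixed(1)) == 1
```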
For each bug:
- `uv run uvicorn main:app --reload` should still work.
- Run `uv run pytest` and confirm the right tests fail with useful (not cryptic) error messages.

Create `.hol/bugs/answers.md`:
# Bug Answer Keys — {App Name}
FACILITATOR ONLY — Do not share with participants.
---
## Bug 1: {Short Name}
**Difficulty:** Standard
**Symptom:** {What the participant sees — which test fails, what API returns wrong}
**Root Cause:** {1-2 sentence explanation}
**Affected Files:**
- `{file1}` line {N} — {what's wrong here}
- `{file2}` line {N} — {how it manifests here}
**The Fix:**
```diff
--- a/{file1}
+++ b/{file1}
@@ ...
- {buggy line}
+ {fixed line}
```

**Regression Test:**
```python
def test_{bug_name}_regression():
    # This test would have caught the bug
    ...
```

**Facilitator Hints (if participant is stuck):**
Repeat for each bug. If planting 2 bugs, label the second as:

```markdown
## Bug 2: {Short Name} *(Stretch)*

**Difficulty:** Stretch
```

Mark stretch bugs clearly so facilitators can skip them for time-constrained or less-experienced audiences. The first bug should always be the primary one that fits within the 20-minute Act 2 window.
```bash
uv run pytest                      # Some tests should fail - document which ones
uv run uvicorn main:app --reload   # App should still start
```
Report to the user: