Strict feedback loop for debugging E2E test failures. Never guess - always verify via direct observation.
You are debugging E2E test failures. Your ONLY method is the feedback loop: OBSERVE → HYPOTHESIZE → VERIFY → REPEAT.
NEVER GUESS. ALWAYS VERIFY.
If you haven't OBSERVED it with your own tools, you don't know it. Assumptions kill debugging sessions.
For app-shell E2E tests, these endpoints are guarded and must be mocked with the E2E marker header:
- /api/user
- /api/auth/organizations
- /api/auth/workspaces
- /api/auth/all-workspaces

Use the shared helper:
```ts
import { buildJsonMockResponse } from "./lib/strict-api-guard"

await page.route("**/api/user**", route =>
  route.fulfill(
    buildJsonMockResponse({
      user: { /* ... */ },
    }),
  ),
)
```
Do not use raw `route.fulfill({ body: JSON.stringify(...) })` for guarded endpoints. A missing `x-e2e-mock: 1` header is a hard failure.
In CI, disabling this guard (`E2E_STRICT_API_GUARD=0`) is forbidden.
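For orientation, here is a minimal sketch of what a helper like `buildJsonMockResponse` plausibly returns. The real implementation lives in `./lib/strict-api-guard`, so treat the exact shape as an assumption, not the API:

```typescript
// Hypothetical sketch -- the real helper is in ./lib/strict-api-guard.
// It builds the options object passed to route.fulfill(), always attaching
// the x-e2e-mock marker header that the strict API guard checks for.
function buildJsonMockResponse(payload: unknown, status = 200) {
  return {
    status,
    headers: {
      "content-type": "application/json",
      "x-e2e-mock": "1", // omitting this marker is a hard failure
    },
    body: JSON.stringify(payload),
  };
}
```

Because the marker is baked into the helper, a test cannot forget it the way a raw `route.fulfill` call can.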
Run app-shell E2E against staging, never production:

- `bun run test:e2e` (or `scripts/playwright/run-e2e.sh gate`)
- Do not run E2E with `ENV_FILE=.env.production`.

DO THIS FIRST. NO EXCEPTIONS.
```sh
# 1. Get the EXACT error from the test output
tail -200 /tmp/staging-deploy.log | grep -A 30 "FAIL\|Error\|✘"

# 2. Read the error context file (contains the page snapshot)
cat apps/web/test-results/{test-name}/error-context.md

# 3. Read the screenshot (if the failure is visual)
# Use the Read tool on the PNG file path from the error output
```
CAPTURE THESE FACTS: the exact error message, the failing selector or assertion, and what the page snapshot shows.

STOP. Write down what you OBSERVED before proceeding.
Before blaming the test, verify the REAL system works.
```sh
# Log in and get a session cookie
curl -c /tmp/cookies.txt -X POST https://staging.terminal.goalive.nl/api/login \
  -H "Content-Type: application/json" \
  -d '{"email":"[email protected]","password":"password"}'

# Test the actual API the test is calling
curl -b /tmp/cookies.txt https://staging.terminal.goalive.nl/api/auth/me

# Test the specific API endpoint
curl -b /tmp/cookies.txt -X POST https://staging.terminal.goalive.nl/api/claude/stream \
  -H "Content-Type: application/json" \
  -d '{"message":"test","workspace":"test.com"}'
```
If the API works: The problem is in test setup, not the application. If the API fails: The problem is in the application.
STOP. Record whether the real system works before proceeding.
Only now do you read code. Follow this exact sequence:
```sh
# Read the test
Read apps/web/e2e-tests/{test-file}.spec.ts

# Read the fixture
Read apps/web/e2e-tests/fixtures.ts

# Search for where the expected element is rendered
Grep "data-chat-ready\|data-testid" apps/web/app/

# Find the condition that controls rendering
Read {component-file}   # look for the condition
```
Write your hypothesis in this EXACT format:
```text
HYPOTHESIS: [One sentence]
EVIDENCE:
1. [Observed fact 1]
2. [Observed fact 2]
3. [Observed fact 3]
PREDICTION: If this hypothesis is correct, then [specific testable prediction]
VERIFICATION METHOD: I will [exact steps to verify]
```
If you cannot fill in all fields, you don't have enough information. Go back to Phase 1.
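A filled-in example of a complete hypothesis (the specific failure details are hypothetical, drawn from the common failure modes below):

```text
HYPOTHESIS: The send button stays disabled because the workspace is never set.
EVIDENCE:
1. The page snapshot in error-context.md shows "Select a site above".
2. The fixture never mocks /api/auth/all-workspaces.
3. curl against staging shows the real API returns workspaces correctly.
PREDICTION: If this hypothesis is correct, then mocking all-workspaces in the
fixture will make the snapshot show a selected workspace.
VERIFICATION METHOD: I will add the mock, re-run the single test, and re-read
the new error-context.md snapshot.
```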
DO NOT IMPLEMENT A FIX YET.
First, verify your hypothesis is correct:
- Add console.log/debug output
- Run a minimal reproduction
- Check the specific state you hypothesize is wrong
Questions to answer:
Only after verification, implement the fix:
```sh
# Deploy with E2E tests
nohup make staging > /tmp/staging-deploy.log 2>&1 &

# Watch for the specific test
tail -f /tmp/staging-deploy.log | grep -E "tab-isolation|protection-verification|PASS|FAIL"
```
Not done until you see the test pass.
Check the final output:
```sh
tail -50 /tmp/staging-deploy.log
```
If still failing: Return to Phase 1 with new information. If passing: Document what you learned.
OBSERVE: What selector? What's in the page snapshot?
Common causes:
Verification:
```sh
# Check the page snapshot for the element
grep -A 5 "button\|input\|data-" apps/web/test-results/{test}/error-context.md
```
OBSERVE: What condition enables the button?
Trace:
```sh
# Find the disabled condition
Grep "disabled.*=" apps/web/app/chat/
#   -> usually: disabled={!isChatReady || busy || ...}

# Find what isChatReady depends on
Grep "isChatReady\s*=" apps/web/
#   -> usually: isChatReady = condition1 && condition2

# Check each condition
```
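The chain you trace typically bottoms out in a pure predicate. A hypothetical sketch (the names and inputs are assumptions; the real condition lives in the component you found above):

```typescript
// Hypothetical readiness gate -- each input should map to one fact you can
// observe directly (page snapshot, mocked API response, localStorage).
function isChatReady(
  workspace: string | null,
  sessionLoaded: boolean,
  busy: boolean,
): boolean {
  return workspace !== null && sessionLoaded && !busy;
}
```

Checking each input separately tells you which observation to make next, instead of guessing at the combined result.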
OBSERVE: Workspace should be pre-set but isn't.
Root causes (in order of likelihood):
Verification:
```sh
# Check whether the all-workspaces API is mocked
Grep "all-workspaces" apps/web/e2e-tests/fixtures.ts

# Check localStorage injection
Grep "localStorage\|addInitScript" apps/web/e2e-tests/fixtures.ts
```
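If localStorage seeding is the suspect, this is the usual pattern, sketched here with assumed names: the storage key `"activeWorkspace"` and the fixture shape are hypothetical, so check what the app actually reads:

```typescript
// Hypothetical: build the init-script source that seeds the workspace before
// any app code runs. The string is passed to page.addInitScript({ content })
// in the fixture; the key name "activeWorkspace" is an assumption.
function workspaceInitScript(workspace: string): string {
  return `localStorage.setItem("activeWorkspace", ${JSON.stringify(workspace)});`;
}

// Usage in a fixture (not executed here):
//   await page.addInitScript({ content: workspaceInitScript("test.com") });
```

Init scripts run before the page's own scripts, which is exactly what "workspace should be pre-set" requires: seeding after navigation is too late.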
OBSERVE: What's the exact error code?
Verification:
```sh
# Check cookie setup in the fixture
Grep "addCookies\|COOKIE_NAMES" apps/web/e2e-tests/fixtures.ts

# Test with curl using the same auth
curl -b /tmp/cookies.txt {endpoint}
```
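A sketch of what the fixture's cookie setup plausibly builds for `context.addCookies()`. The cookie name and attributes here are assumptions; the real names come from `COOKIE_NAMES` in fixtures.ts:

```typescript
// Hypothetical session-cookie builder; verify the name, domain, and attributes
// against COOKIE_NAMES in fixtures.ts and the Set-Cookie header returned by
// the curl login above.
function buildSessionCookies(token: string, domain: string) {
  return [
    {
      name: "session",
      value: token,
      domain,
      path: "/",
      httpOnly: true,
      secure: true,
    },
  ];
}
```

If curl works but the test gets an auth error, diff these attributes against the real `Set-Cookie` header: a mismatched domain or path makes the browser silently drop the cookie.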
If you say "maybe" or "might be" or "probably", STOP. You're guessing. Go observe.
Never make two changes to "see which one works". Scientific method: one variable at a time.
Before any fix, verify you can reproduce the failure. After any fix, verify the test passes.
Every fix must include:
The error context file (error-context.md) contains a YAML page snapshot. This is your most valuable debugging information. READ IT BEFORE ANYTHING ELSE.
If the snapshot shows "Select a site above", the workspace is null. Period. Don't argue with the snapshot.
If the real API works (via curl), the problem is test setup. If the API fails, the problem is the application.
90% of E2E failures are state-initialization issues.
When debugging, ALWAYS output in this format:
```md
## OBSERVATION
[What I actually saw in the error/screenshot/page snapshot]

## HYPOTHESIS
[My theory about what's wrong]

## VERIFICATION
[How I will verify this hypothesis]

## RESULT
[What happened when I verified]

## ACTION
[What I will do next based on the result]
```
The feedback loop is not optional. It's not a "nice to have". It's the ONLY way to debug effectively.
OBSERVE → HYPOTHESIZE → VERIFY → REPEAT
Every shortcut you take adds hours to debugging time. Trust the process.