Generates end-to-end tests from scenario-set.md. Web target → Playwright; mobile target → Detox. Every test imports a Page Object (never touches raw selectors), uses a named auth strategy, waits on observable conditions (never sleeps), and preserves scenario ids as trace anchors. Gate contract — zero raw selectors in the test body, zero sleep-based waits, zero xpath selectors. PIPELINE-3 step 2.
An L2 Truth-Execution skill. Its job is to turn scenarios into runnable, non-flaky end-to-end tests. The skill's main enemy is not "writing more tests" — it's writing tests that silently lie: pass on your machine, fail in CI, and waste hours to triage. The three rules below are non-negotiable because each one is a common root cause of e2e flake.
Invoked when test-strategy-planner has produced scenario-set.md and the target platform is declared in vibeflow.config.json, or on demand as /vibeflow:e2e-test-writer <scenario-glob>.

| Input | Required | Notes |
|---|---|---|
| scenario-set.md | yes | Output of test-strategy-planner. Only scenarios tagged for the declared platform become candidates. |
| Platform | yes | web / ios / android. Drives which platform-recipes.md section loads (Playwright for web, Detox for ios/android). |
| Target location | yes | web → base URL; ios/android → bundle identifier. Never hardcoded: read from arg or from vibeflow.config.json.targetLocations. |
| Auth strategy catalog | derived | references/pom-patterns.md defines the 4 named strategies. The scenario names the one it needs. |
| Existing Page Objects | scanned | Imported when they match the scenario's target screen. Hand-written POMs are never rewritten; this skill only reads them. |
| repo-fingerprint.json | optional | Used to confirm the project's actual e2e runner matches the platform's expected tool. |
Hard preconditions — refuse with a single blocks-merge finding rather than shipping flaky tests:
- ^[a-zA-Z][\w.-]*$. Tests pinned at localhost without an explicit port block, because localhost without a port is the #1 cause of "works on my laptop" CI failures.

Read ./references/platform-recipes.md and pick the section matching the declared platform:

- web → Playwright
- ios → Detox (iOS runner)
- android → Detox (Android runner)

Record the chosen runner + the exact import block the generated file must use. Deviating from the recipe is forbidden. If a real project needs a different runner, add a new section to platform-recipes.md first.
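The runner choice and the localhost precondition can be sketched as two small pure functions. This is a minimal illustration, not the skill's actual implementation: the names `runnerFor` and `isPortlessLocalhost` are hypothetical, and the check assumes an http(s) base URL.

```typescript
type Platform = "web" | "ios" | "android";
type Runner = "playwright" | "detox";

// Mapping taken from the recipe list above: web uses Playwright, mobile uses Detox.
function runnerFor(platform: Platform): Runner {
  return platform === "web" ? "playwright" : "detox";
}

// Hypothetical helper: true when a base URL targets localhost with no explicit port.
// Works on the raw string so that an explicitly written port is never normalized away.
function isPortlessLocalhost(baseUrl: string): boolean {
  const match = /^https?:\/\/localhost(:\d+)?(?=[\/?#]|$)/i.exec(baseUrl);
  return match !== null && match[1] === undefined;
}
```

Under these assumptions, `http://localhost/login` would be blocked while `http://localhost:3000/login` would pass.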
Walk scenario-set.md. Keep a scenario when:
- platform field includes the current platform (or equals all)
- coverage field is e2e or gap (other coverage tiers are owned by component-test-writer / contract-test-writer)
- status is not deferred

Every surviving scenario becomes a candidate SpecCase:
```typescript
interface SpecCase {
  scenarioId: string; // SC-xxx
  title: string; // human-readable
  targetScreen: string; // named screen / route / deep link
  authStrategy: "anonymous" | "stored-session" | "token-injection" | "ui-login";
  preconditions: readonly string[];
  steps: readonly string[];
  expected: string; // the binary outcome to assert
  priority: "P0" | "P1" | "P2" | "P3";
}
```
If a scenario is ambiguous (no target screen, no expected outcome),
mark it pending: "awaiting scenario refinement" and emit it as
test.skip(...) — never synthesize the missing pieces.
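The three keep rules and the ambiguity rule above could be applied in a single pass, roughly as follows. The `Scenario` shape is an assumption about what a parsed scenario-set.md entry looks like (in particular, `platform` is modelled as an array), and `selectCandidates` is a hypothetical name.

```typescript
type Platform = "web" | "ios" | "android";

// Hypothetical shape of one parsed scenario-set.md entry; the real fields may differ.
interface Scenario {
  id: string;
  platform: ReadonlyArray<Platform | "all">;
  coverage: string;
  status: string;
  targetScreen?: string;
  expected?: string;
}

type Candidate =
  | { kind: "spec"; scenarioId: string }
  | { kind: "pending"; scenarioId: string; reason: string };

function selectCandidates(scenarios: ReadonlyArray<Scenario>, platform: Platform): Candidate[] {
  return scenarios
    .filter((s) => s.platform.some((p) => p === platform || p === "all"))
    .filter((s) => s.coverage === "e2e" || s.coverage === "gap")
    .filter((s) => s.status !== "deferred")
    .map((s): Candidate =>
      // Missing target screen or expected outcome: mark pending, never guess.
      s.targetScreen && s.expected
        ? { kind: "spec", scenarioId: s.id }
        : { kind: "pending", scenarioId: s.id, reason: "awaiting scenario refinement" }
    );
}
```

A `pending` candidate is what would later be emitted as `test.skip(...)`.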
For each scenario's targetScreen, look up an existing POM under
tests/e2e/pages/<Screen>Page.ts (or the project's equivalent path
from repo-fingerprint.json). If found → import and reuse. If
missing → emit a minimal skeleton using the template from
references/pom-patterns.md and record the emission in the run
report. Never invent selectors that don't exist in the POM.
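The reuse-or-emit decision above might look like the sketch below. It assumes the screen name maps directly onto the `<Screen>Page.ts` file name and that the list of existing files comes from a prior repo scan; `resolvePom` is a hypothetical helper.

```typescript
interface PomResolution {
  screen: string;
  path: string;
  action: "reuse" | "emit-skeleton";
}

// pagesDir defaults to the path named above; a project-specific path can be passed instead.
function resolvePom(
  screen: string,
  existingFiles: ReadonlyArray<string>,
  pagesDir: string = "tests/e2e/pages"
): PomResolution {
  const path = `${pagesDir}/${screen}Page.ts`;
  const found = existingFiles.indexOf(path) !== -1;
  // A missing POM yields a skeleton emission, which the run report should record.
  return { screen, path, action: found ? "reuse" : "emit-skeleton" };
}
```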
Tests that reach raw selectors (page.click('#login')) are
rejected at Step 6 below. Every DOM / UI interaction must flow
through a POM method.
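One way to detect raw selectors in a generated test body is a line-based scan like the one below. This is a heuristic sketch: the regular expression and the list of `page.*` methods it covers are assumptions, and a real gate would more likely inspect the AST.

```typescript
// Heuristic: a direct page.<interaction>(...) call whose first argument is a string literal.
const RAW_SELECTOR_CALL =
  /\bpage\s*\.\s*(?:click|fill|type|locator|hover|check|press)\s*\(\s*['"`]/;

// Returns the 1-based line numbers that touch a raw selector.
function findRawSelectorLines(testBody: string): number[] {
  const hits: number[] = [];
  testBody.split("\n").forEach((line, index) => {
    if (RAW_SELECTOR_CALL.test(line)) hits.push(index + 1);
  });
  return hits;
}
```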
For every SpecCase, look up its authStrategy in the auth catalog
(references/pom-patterns.md → "Auth Strategy Catalog"). Every
strategy has:
If the scenario's strategy is not applicable to the declared platform, block with remediation "scenario SC-xxx uses a web-only auth strategy on a mobile target". No silent fallbacks — a wrong auth strategy is how tests quietly start authenticating as the wrong user.
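The applicability check could be sketched as below. Which strategies are web-only is defined by the auth catalog in references/pom-patterns.md; the `WEB_ONLY` list here is an invented placeholder for illustration, and `authBlocker` is a hypothetical name.

```typescript
type Platform = "web" | "ios" | "android";
type AuthStrategy = "anonymous" | "stored-session" | "token-injection" | "ui-login";

// ASSUMPTION: placeholder list. The real applicability comes from the auth catalog.
const WEB_ONLY: ReadonlyArray<AuthStrategy> = ["stored-session"];

// Returns a remediation message when the strategy cannot run on the platform, else null.
function authBlocker(scenarioId: string, strategy: AuthStrategy, platform: Platform): string | null {
  const webOnly = WEB_ONLY.indexOf(strategy) !== -1;
  return webOnly && platform !== "web"
    ? `scenario ${scenarioId} uses a web-only auth strategy on a mobile target`
    : null;
}
```

Returning a message instead of substituting another strategy keeps the "no silent fallbacks" rule explicit in the code path.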
Every generated spec uses observable waits only. The following are forbidden at generation time:
- await page.waitForTimeout(<ms>) / await sleep(<ms>) / any fixed-duration sleep. These are timing races disguised as tests.
- while (!x) ... without a bound.

Allowed patterns:

- await expect(pom.readyHeading).toBeVisible() (Playwright, web)
- await pom.waitForReady() (every POM exposes a named wait)
- await waitFor(element(by.id(<testID>))).toBeVisible().withTimeout(<ms>) (Detox, mobile)

If a scenario's steps cannot be expressed without a fixed sleep, block and point at the scenario for refinement.
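A generation-time check for the forbidden waits could be as simple as the scan below. It is a sketch under stated assumptions: the patterns are heuristics of my own choosing, and the loop rule flags every negated `while` condition because a bound cannot be proven from text alone.

```typescript
interface WaitViolation {
  line: number; // 1-based
  rule: string;
}

const FORBIDDEN_WAITS: ReadonlyArray<{ rule: string; pattern: RegExp }> = [
  { rule: "fixed-duration sleep", pattern: /\b(?:waitForTimeout|sleep)\s*\(/ },
  { rule: "possibly unbounded polling loop", pattern: /\bwhile\s*\(\s*!/ },
];

function findForbiddenWaits(source: string): WaitViolation[] {
  const violations: WaitViolation[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { rule, pattern } of FORBIDDEN_WAITS) {
      if (pattern.test(text)) violations.push({ line: index + 1, rule });
    }
  });
  return violations;
}
```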
Every selector referenced by the emitted POM (or the test body, in the rare cases a test needs its own) must come from this priority:
1. data-testid (preferred; stable across design refactors)
2. role + accessible name (e.g. getByRole('button', { name: 'Sign in' }))
3. visible text (e.g. getByText('Welcome back'))

If an existing POM uses CSS/xpath, the skill emits a WARNING in the run report and links the POM line; it does NOT rewrite the POM. Rewriting a human-owned file is out of scope.
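The selector priority can be expressed as a classifier plus a rank, as in this sketch for Playwright-style locator expressions. The function names and the string-matching approach are assumptions for illustration; CSS and xpath selectors get no rank, which is what would trigger the warning for an existing POM.

```typescript
type SelectorKind = "testid" | "role" | "text" | "css" | "xpath";

// Classifies a locator expression given as source text, e.g. "page.getByRole('button')".
function classifySelector(expr: string): SelectorKind {
  if (/getByTestId\(|\[data-testid[=\]]/.test(expr)) return "testid";
  if (/getByRole\(/.test(expr)) return "role";
  if (/getByText\(/.test(expr)) return "text";
  if (/['"`](?:xpath=|\/\/)/.test(expr)) return "xpath";
  return "css"; // anything else is treated as a CSS selector
}

// Lower is better; null means the selector kind is outside the allowed priority list.
function selectorRank(kind: SelectorKind): number | null {
  const ranks: Record<SelectorKind, number | null> = {
    testid: 1,
    role: 2,
    text: 3,
    css: null,
    xpath: null,
  };
  return ranks[kind];
}
```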
Target path convention:
- Playwright: tests/e2e/<feature>.spec.ts
- Detox: e2e/<feature>.e2e.ts (Detox's conventional layout)

Every emitted file starts with the standard banner:

```typescript
// @generated-by vibeflow:e2e-test-writer
// Regenerate with: /vibeflow:e2e-test-writer <scenario glob>
// Do NOT edit the @generated regions by hand — they will be overwritten.
```
Regions between // @generated-start and // @generated-end are
skill-owned; anything outside is human-owned and preserved verbatim
on re-run. Same regeneration safety contract as
component-test-writer / business-rule-validator.
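The regeneration contract can be illustrated with a small merge function that swaps only the skill-owned region and leaves everything else byte-for-byte intact. This sketch assumes a single generated region per file; `replaceGeneratedRegion` is a hypothetical name.

```typescript
const REGION_START = "// @generated-start";
const REGION_END = "// @generated-end";

// Replaces the text between the two markers; content outside the markers is preserved verbatim.
function replaceGeneratedRegion(existing: string, generatedBody: string): string {
  const start = existing.indexOf(REGION_START);
  const end = existing.indexOf(REGION_END);
  if (start === -1 || end === -1 || end < start) {
    // Refuse rather than risk overwriting human-owned code.
    throw new Error("no well-formed @generated region found");
  }
  const before = existing.slice(0, start + REGION_START.length);
  const after = existing.slice(end);
  return `${before}\n${generatedBody}\n${after}`;
}
```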
Every test(...) title starts with the scenario id, and every body
ends with a trace: scenarios/SC-xxx comment so
traceability-engine can wire test → scenario → PRD.
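A check for the two trace anchors might look like this. It assumes scenario ids have the form `SC-` followed by digits, and it only requires the trace comment to be present in the body, since a trailing `// why:` comment may follow it; `hasTraceAnchors` is a hypothetical helper.

```typescript
// True when the title starts with "<id>:" and the body carries a matching trace comment.
function hasTraceAnchors(scenarioId: string, title: string, body: string): boolean {
  if (!/^SC-\d+$/.test(scenarioId)) return false; // ASSUMPTION: ids look like SC-112
  const titleOk = title.indexOf(`${scenarioId}:`) === 0;
  // The lookahead stops SC-11 from matching a trace comment for SC-112.
  const trace = new RegExp("// trace: scenarios/" + scenarioId + "(?![\\w-])");
  return titleOk && trace.test(body);
}
```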
tests/e2e/<feature>.spec.ts (Playwright example)

```typescript
// @generated-by vibeflow:e2e-test-writer
// @generated-start
import { test, expect } from "@playwright/test";
import { LoginPage } from "./pages/LoginPage";
import { DashboardPage } from "./pages/DashboardPage";

test.describe("SC-112: user sees dashboard after login", () => {
  test.use({ storageState: "tests/e2e/fixtures/authed.json" }); // stored-session auth

  test("SC-112: dashboard welcome headline appears", async ({ page }) => {
    // Arrange
    const dashboard = new DashboardPage(page);
    // Act
    await dashboard.goto();
    // Assert
    await expect(dashboard.welcomeHeading).toBeVisible();
    // trace: scenarios/SC-112 — "user sees dashboard after login"
    // why: guards PRD §2.4 (authed landing page)
  });
});
// @generated-end
```
.vibeflow/reports/e2e-test-writer.md

```markdown
# E2E Test Writer — <ISO timestamp>

## Target
- Platform: <web|ios|android>
- Runner: <playwright|detox>
- Base URL / bundle id: <target>

## Scenarios consumed
- SC-112 / P0 / stored-session / → tests/e2e/dashboard.spec.ts
- SC-113 / P1 / token-injection / → tests/e2e/settings.spec.ts
- SC-114 / — / skipped: no auth strategy declared (blocker)

## Page Objects
- Reused: DashboardPage, LoginPage
- Emitted (new skeleton): SettingsPage — manual implementation required

## Warnings
- tests/e2e/pages/LoginPage.ts:42 uses a CSS class selector (`.btn-primary`) — not rewritten, flagged for human review
```
Zero raw selectors in the test body, zero sleep-based waits, zero xpath selectors. Those are the three regressions that reliably re-introduce flake. Any generated file that would violate them is rejected at Step 6 and the offending scenario is reported; no silent degradation.
Additional blockers:
- criticalScenariosWithoutTests == 0 (every P0 scenario tagged e2e must produce a test; test.skip doesn't count).
- ambiguousScenarios == 0 (scenarios missing target/expected cannot be guessed).

Downstream consumers:

- traceability-engine: consumes trace: comments to link test → scenario → PRD.
- test-priority-engine: uses the generated file list to rank affected tests.
- observability MCP (ob_collect_metrics, ob_track_flaky): ingests the test runner's output when these specs actually run.
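The two additional blockers, criticalScenariosWithoutTests and ambiguousScenarios, could be computed from per-scenario outcomes roughly as follows. The `Outcome` record is a hypothetical shape, not something the skill is documented to produce.

```typescript
type Priority = "P0" | "P1" | "P2" | "P3";

// Hypothetical per-scenario record from one generation run.
interface Outcome {
  scenarioId: string;
  priority: Priority;
  coverage: string;
  emitted: "test" | "skip" | "none";
  ambiguous: boolean;
}

function additionalBlockers(outcomes: ReadonlyArray<Outcome>) {
  // A skipped P0 e2e scenario still counts as missing: test.skip is not a test.
  const criticalScenariosWithoutTests = outcomes.filter(
    (o) => o.priority === "P0" && o.coverage === "e2e" && o.emitted !== "test"
  ).length;
  const ambiguousScenarios = outcomes.filter((o) => o.ambiguous).length;
  return {
    criticalScenariosWithoutTests,
    ambiguousScenarios,
    pass: criticalScenariosWithoutTests === 0 && ambiguousScenarios === 0,
  };
}
```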