NOTE: Startup and cleanup are handled by worker-base. This skill defines the work procedure.

When to Use This Skill

Use for features that primarily change:

packages/sdk/tests/word-benchmark/**
live-review runner, reviewer, submission, mismatch, or issue-ledger code
bridge/session-targeting code directly required by the live-review path
supporting docs/artifact contracts only when required by the harness logic itself

Do not use for core runtime/hook/verifier execution-contract features unless they are strictly needed to keep live classifications truthful.

Required Skills

Test-Driven Development — invoke before code changes; add or update failing tests first.
Systematic Debugging — invoke if runner behavior, receipt classification, or session routing is unclear.
bridge-monitoring — invoke when a Hybrid session is available; use bridge-first checks and explicit session/document routing.

Work Procedure

Example Handoff

{
  "salientSummary": "Hardened Hybrid live-review session routing and reviewer-only classification so the runner fails closed on wrong targets and no longer treats reviewer passes as mutation success. Targeted benchmark and bridge tests passed; live verification was deferred because no Hybrid session was connected.",
  "whatWasImplemented": "Updated the live-review runner to keep metadata/state/events tied to the same resolved Hybrid session, tightened reviewer receipt classification, and corrected mismatch logic so reviewer-only success is treated as non-mutation evidence. Added focused tests around session selection, artifact expectations, and classification branches.",
  "whatWasLeftUndone": "A real Hybrid live run is still needed to confirm the runner behavior against an actual pane/session on 4018.",
  "verification": {
    "commandsRun": [
      {
        "command": "pnpm --filter @office-agents/sdk exec vitest run tests/word-agent-benchmark-suite.test.ts",
        "exitCode": 0,
        "observation": "Live-review contract tests passed."
      },
      {
        "command": "pnpm --filter @office-agents/bridge exec vitest run tests/session-selection.test.ts tests/cli-commands.test.ts",
        "exitCode": 0,
        "observation": "Bridge selection and CLI tests passed."
      },
      {
        "command": "pnpm typecheck",
        "exitCode": 0,
        "observation": "Typecheck passed for the touched code."
      }
    ],
    "interactiveChecks": [
      {
        "action": "Attempted bridge-first Hybrid validation on https://localhost:4018",
        "observed": "No connected Hybrid session was available, so live runner verification was deferred."
      }
    ]
  },
  "tests": {
    "added": [
      {
        "file": "packages/sdk/tests/word-agent-benchmark-suite.test.ts",
        "cases": [
          {
            "name": "reviewer-only success is non-mutation evidence",
            "verifies": "Mismatch and issue logic no longer treat reviewer-only pass as live mutation success."
          }
        ]
      }
    ]
  },
  "discoveredIssues": [
    {
      "severity": "medium",
      "description": "Hybrid session availability remains an external dependency for end-to-end live validation."
    }
  ]
}
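
The handoff above describes two rules: the runner fails closed when a receipt comes from the wrong session, and a reviewer-only pass counts as non-mutation evidence rather than mutation success. The following is a minimal TypeScript sketch of that logic, not the runner's actual code. Every name in it (Receipt, Classification, classifyReceipt, isLiveMutationSuccess) is hypothetical and chosen only for illustration; the real receipt shape and classification labels may differ.

```typescript
// Hypothetical shapes: none of these names are taken from the real runner.
type ReceiptSource = "reviewer" | "mutation";

interface Receipt {
  source: ReceiptSource;
  sessionId: string;
  passed: boolean;
}

type Classification =
  | "mutation-success"
  | "non-mutation-evidence"
  | "failure"
  | "wrong-target";

// Fail closed: a receipt from any session other than the resolved one is
// never counted as success, whatever it reports.
function classifyReceipt(receipt: Receipt, resolvedSessionId: string): Classification {
  if (receipt.sessionId !== resolvedSessionId) return "wrong-target";
  if (!receipt.passed) return "failure";
  // A reviewer pass shows that the review step ran; it does not show that
  // the document was actually mutated.
  return receipt.source === "reviewer" ? "non-mutation-evidence" : "mutation-success";
}

// A run is a live mutation success only if at least one mutation receipt
// passed on the resolved session and no receipt failed or hit the wrong target.
function isLiveMutationSuccess(receipts: Receipt[], resolvedSessionId: string): boolean {
  const classes = receipts.map((r) => classifyReceipt(r, resolvedSessionId));
  const blocked = classes.some((c) => c === "failure" || c === "wrong-target");
  const mutated = classes.some((c) => c === "mutation-success");
  return !blocked && mutated;
}
```

Under these assumptions, a run whose only receipt is a passing reviewer receipt is not reported as a mutation success, and a single receipt from an unexpected session is enough to block the whole run.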

Hybrid Live Review Worker

When to Use This Skill

Required Skills

Work Procedure

Example Handoff

When to Return to Orchestrator
