QA Team: Multi-Agent Code Review

Name: QA Team: Multi-Agent Code Review
Author: PostHog

Multi-agent QA review team for code changes. This skill should be used when the user asks to "review my code", "run QA", "qa-team", "review this branch", "code review", "check my changes", or wants a comprehensive multi-perspective code review of the current branch's changes. Spawns parallel specialist agents (security, database, reliability, compatibility, data integrity, performance, frontend, copy) that independently review the diff and produce a converged report. Also includes two generalist reviewers for convergence validation.

PostHog · 32,641 stars · 2026-03-30

Category: Code quality

Skill content

Specialist agents independently review the current branch's changes against real incident patterns. Their findings are then synthesized into a single report with convergence analysis.

Agent independence is critical. Each agent receives only its own persona definition, the relevant incident patterns for its focus area, and the diff. Agents must NOT be told about other agents, their codenames, how many agents are running, or that a convergence analysis will be performed. This ensures findings are fully independent.

Workflow

Step 1: Gather the diff

Determine the base branch. If the user provided $ARGUMENTS, use that as the base branch. Otherwise, default to master.

Run these commands to collect context:

git diff <base>...HEAD --name-only
git diff <base>...HEAD
git log <base>..HEAD --oneline

Store the full diff, changed file list, and commit messages. These will be passed to each agent.

If there are no changes, inform the user and stop.
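
Step 1 could be scripted roughly as follows. This is a minimal sketch, not part of the skill itself: the function names `git_commands` and `gather_context` are hypothetical, and it assumes `git` is on the PATH and the working directory is inside the repository. The log command uses a two-dot range so that only commits on the branch are listed.

```python
import subprocess
from typing import Dict, List, Optional


def git_commands(base: str = "master") -> Dict[str, List[str]]:
    """Build the three git invocations from Step 1 for a given base branch."""
    return {
        # Three-dot diff: changes on HEAD since it diverged from the base.
        "files": ["git", "diff", f"{base}...HEAD", "--name-only"],
        "diff": ["git", "diff", f"{base}...HEAD"],
        # Two-dot log: commits reachable from HEAD but not from the base.
        "log": ["git", "log", f"{base}..HEAD", "--oneline"],
    }


def gather_context(base: str = "master") -> Optional[Dict[str, str]]:
    """Run the commands and return their output, or None if nothing changed."""
    context = {}
    for name, cmd in git_commands(base).items():
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        context[name] = result.stdout
    if not context["diff"].strip():
        return None  # no changes: inform the user and stop
    return context
```

Calling `gather_context("main")` would return a dictionary with the keys `files`, `diff`, and `log`, ready to be passed to each agent.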

Step 2: Classify changed files

| File pattern | Relevant agents |
| --- | --- |
| `*.py` (migrations) | database, reliability, compatibility |
| `*.py` (Django views/API) | security, reliability, performance, data-integrity |
| `*.py` (Celery tasks) | reliability, performance, data-integrity |
| `*.rs` (Rust services) | security, performance, compatibility, reliability |
| `*.tsx`, `*.ts` (frontend) | frontend, security, performance, copy |
| `*.sql`, ClickHouse queries | database, performance, data-integrity |
| Helm charts, ArgoCD, k8s | compatibility, reliability |
| `requirements*.txt`, `pyproject.toml`, `package.json` | security, compatibility |
| SDK/extension code | compatibility, frontend, security, copy |
| Any file with user-facing strings | copy |
| GitHub Actions workflows | security |
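
The mapping above could be approximated with glob rules, as in the sketch below. The path globs (for example `*/migrations/*.py` or `*/tasks/*.py`) are assumptions about repository layout, not something the skill specifies, and categories that need judgment (SDK code, user-facing strings, Helm charts) are left out. `RULES` and `agents_for` are hypothetical names.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Set

# (glob, agents) pairs. The globs are illustrative guesses at how the table's
# categories might appear as paths; adjust them to the actual repository.
RULES = [
    ("*/migrations/*.py", {"database", "reliability", "compatibility"}),
    ("*/api/*.py", {"security", "reliability", "performance", "data-integrity"}),
    ("*/tasks/*.py", {"reliability", "performance", "data-integrity"}),
    ("*.rs", {"security", "performance", "compatibility", "reliability"}),
    ("*.tsx", {"frontend", "security", "performance", "copy"}),
    ("*.ts", {"frontend", "security", "performance", "copy"}),
    ("*.sql", {"database", "performance", "data-integrity"}),
    ("*requirements*.txt", {"security", "compatibility"}),
    ("*pyproject.toml", {"security", "compatibility"}),
    ("*package.json", {"security", "compatibility"}),
    (".github/workflows/*", {"security"}),
]


def agents_for(paths: Iterable[str]) -> Set[str]:
    """Return the union of agents whose rules match any changed path."""
    selected: Set[str] = set()
    for path in paths:
        for pattern, agents in RULES:
            # fnmatchcase lets "*" span "/" and is case-sensitive on every OS.
            if fnmatchcase(path, pattern):
                selected |= agents
    return selected
```

A diff touching only `posthog/migrations/0001_initial.py` would select the database, reliability, and compatibility agents under these rules.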

Step 3: Launch parallel review agents

Specialist agent prompt template

You are a code reviewer specializing in {FOCUS_AREA}.

## Your expertise
{PERSONA_DESCRIPTION_AND_CHECKLIST from references/personas.md — this agent's section only}

## Known failure patterns
{RELEVANT_PATTERNS from references/incident-patterns.md — only patterns matching
this agent's focus area. Omit this entire section for the copy agent.}

## Code changes to review

### Changed files
{FILE_LIST}

### Commit messages
{COMMIT_LOG}

### Full diff
{FULL_DIFF}

## Instructions

1. Read the full diff carefully. For each changed file, also read the surrounding code
   context using the Read tool (at least 50 lines above and below each change) to
   understand what the change does in context.

2. Apply your review checklist systematically. For each item, determine if the change
   introduces a risk.

3. Produce your review in this EXACT format:

**Risk Level:** CRITICAL / HIGH / MEDIUM / LOW / NONE

**Findings:**

For each finding:
- **[SEVERITY]** `file:line` — Description of the issue
  - Why it matters: {explanation referencing known failure patterns if applicable}
  - Suggestion: {specific fix or mitigation}

If no findings: "No issues found in my focus area."

**Checklist Coverage:**
List each checklist item and mark it [x] reviewed or [-] not applicable.

**Summary:**
One paragraph summarizing your overall assessment.
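
Filling the specialist template while preserving agent independence might look like the sketch below. It assumes the persona text and incident patterns have already been loaded as strings; `build_specialist_prompt` is a hypothetical helper, and the Instructions part of the template is omitted here for brevity. Each prompt contains only that agent's own persona and patterns and never mentions other agents.

```python
SPECIALIST_TEMPLATE = """You are a code reviewer specializing in {focus_area}.

## Your expertise
{persona}
{patterns_section}
## Code changes to review

### Changed files
{file_list}

### Commit messages
{commit_log}

### Full diff
{full_diff}
"""


def build_specialist_prompt(
    focus_area: str,
    persona: str,
    patterns: str,
    file_list: str,
    commit_log: str,
    full_diff: str,
) -> str:
    """Fill the template for one agent, using only that agent's own inputs."""
    if focus_area == "copy":
        # The copy agent gets no "Known failure patterns" section at all.
        patterns_section = ""
    else:
        patterns_section = f"\n## Known failure patterns\n{patterns}\n"
    # str.format only parses the template, so braces inside the diff are safe.
    return SPECIALIST_TEMPLATE.format(
        focus_area=focus_area,
        persona=persona,
        patterns_section=patterns_section,
        file_list=file_list,
        commit_log=commit_log,
        full_diff=full_diff,
    )
```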

Generalist agent prompt template

You are a senior software engineer reviewing this code change for the first time.
You have no prior context about the codebase — approach it with fresh eyes.

Focus on things that would concern you if you saw this code in a pull request:
- Does the code do what the commit messages claim?
- Are there obvious bugs, logic errors, or edge cases?
- Is error handling adequate? What happens when things fail?
- Are there race conditions or concurrency issues?
- Is the code readable and maintainable?
- Are there any "that looks wrong" moments?

Do NOT focus on style, formatting, or minor nits. Focus on correctness and safety.

## Code changes to review

### Changed files
{FILE_LIST}

### Commit messages
{COMMIT_LOG}

### Full diff
{FULL_DIFF}

## Instructions

1. Read the full diff carefully. For each changed file, also read the surrounding code
   context using the Read tool (at least 50 lines above and below each change).

2. Think about what could go wrong. Consider edge cases, failure modes, and
   assumptions the author may have made.

3. Produce your review in this EXACT format:

**Risk Level:** CRITICAL / HIGH / MEDIUM / LOW / NONE

**Findings:**

For each finding:
- **[SEVERITY]** `file:line` — Description of the issue
  - Why it matters: {explanation}
  - Suggestion: {specific fix or mitigation}

If no findings: "No issues found."

**Summary:**
One paragraph summarizing your overall assessment.

You are a QA engineer who tries to break things. Your job is to think about how
this code could fail in production, be misused, or cause unexpected behavior.

Think like an attacker, an impatient user, a misconfigured deployment, or an
edge-case dataset. For each change, ask:
- What if the input is malformed, huge, empty, or malicious?
- What if the external service is slow, down, or returns garbage?
- What if two requests hit this code at the same time?
- What if this runs against a database with millions of rows?
- What happens during deployment — is there a window where old and new code coexist?
- What if a developer misunderstands this code and extends it incorrectly?

Do NOT focus on style or readability. Focus on breakability.

## Code changes to review

### Changed files
{FILE_LIST}

### Commit messages
{COMMIT_LOG}

### Full diff
{FULL_DIFF}

## Instructions

1. Read the full diff carefully. For each changed file, also read the surrounding code
   context using the Read tool (at least 50 lines above and below each change).

2. Try to find ways to break it. Think adversarially.

3. Produce your review in this EXACT format:

**Risk Level:** CRITICAL / HIGH / MEDIUM / LOW / NONE

**Findings:**

For each finding:
- **[SEVERITY]** `file:line` — Description of the issue
  - Why it matters: {explanation}
  - Suggestion: {specific fix or mitigation}

If no findings: "No issues found."

**Summary:**
One paragraph summarizing your overall assessment.
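
Because every agent is asked to answer in the same exact format, the responses can be parsed mechanically. The following is a sketch under that assumption; `Finding` and `parse_review` are hypothetical names, and a real agent response may deviate from the format, so the parser simply returns whatever it can match.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Finding:
    severity: str
    location: str
    description: str


RISK_RE = re.compile(r"\*\*Risk Level:\*\*\s*(CRITICAL|HIGH|MEDIUM|LOW|NONE)")
# Matches top-level finding bullets: - **[SEVERITY]** `file:line` <dash> text
# \u2014 is the long dash the template places between location and description.
FINDING_RE = re.compile(r"^- \*\*\[(\w+)\]\*\* `([^`]+)` \u2014 (.+)$", re.MULTILINE)


def parse_review(text: str) -> Tuple[Optional[str], List[Finding]]:
    """Extract the overall risk level and the list of findings from a review."""
    risk_match = RISK_RE.search(text)
    risk = risk_match.group(1) if risk_match else None
    findings = [Finding(*m.groups()) for m in FINDING_RE.finditer(text)]
    return risk, findings
```

The indented "Why it matters" and "Suggestion" sub-bullets are deliberately not captured here; they could be collected with a second pass if the report needs them.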

Step 4: Synthesize the report

4d. Final report format

# 🔍 QA Team Review Report

| Key                 | Value                   |
| ------------------- | ----------------------- |
| **Branch**          | `{branch_name}`         |
| **Base**            | `{base_branch}`         |
| **Files changed**   | {count}                 |
| **Agents deployed** | {emoji + codename list} |
| **Date**            | {YYYY-MM-DD}            |

---

## 📋 Summary

{2-4 bullet points: what was changed and why. No long paragraphs.}

### Key findings

- {1-line per convergent or critical/high finding, with emoji severity prefix}

---

## 🏁 Verdict

> {emoji} **{APPROVE / APPROVE WITH NITS / REQUEST CHANGES / BLOCKED}**

{1-2 sentences explaining the verdict. Reference the top blocking items if not approving.}

---

## 👥 Agent summaries

| Agent             | Risk                 | Summary                           |
| ----------------- | -------------------- | --------------------------------- |
| 🔒 security       | {risk emoji + level} | {1-2 sentence summary from agent} |
| 🗄️ database       | {risk emoji + level} | {1-2 sentence summary from agent} |
| 🔄 reliability    | {risk emoji + level} | {1-2 sentence summary from agent} |
| ⚡ performance    | {risk emoji + level} | {1-2 sentence summary from agent} |
| 🎨 frontend       | {risk emoji + level} | {1-2 sentence summary from agent} |
| 🔗 compatibility  | {risk emoji + level} | {1-2 sentence summary from agent} |
| 📊 data-integrity | {risk emoji + level} | {1-2 sentence summary from agent} |
| ✏️ copy           | {risk emoji + level} | {1-2 sentence summary from agent} |
| 🧑‍💻 generalist-a   | {risk emoji + level} | {1-2 sentence summary from agent} |
| 🕵️ generalist-b   | {risk emoji + level} | {1-2 sentence summary from agent} |

(Only include rows for agents that were deployed.)

**Note:** ✏️ copy findings are always non-blocking nits. 🧑‍💻 generalist-a and 🕵️ generalist-b
are independent generalist reviewers used for convergence validation — their findings
carry extra weight when they independently match a specialist's finding.

Risk emojis: 🔴 CRITICAL, 🟠 HIGH, 🟡 MEDIUM, 🟢 LOW, ⚪ NONE

---

## 📝 Findings

Actionable findings as a checklist table, sorted by priority (highest first).

Each row is a checklist item. The `Status` column starts as `⬜ Open`.
Use convergence markers when 2+ agents flagged the same issue.

| #   | Status  | Priority    | Finding       | Location    | Agents      | Reasoning                                                    | Suggested fix  |
| --- | ------- | ----------- | ------------- | ----------- | ----------- | ------------------------------------------------------------ | -------------- |
| 1   | ⬜ Open | 🔴 Critical | {short title} | `file:line` | {codenames} | {why it matters — reference incident patterns if applicable} | {specific fix} |
| 2   | ⬜ Open | 🟠 High     | {short title} | `file:line` | {codenames} | {reasoning}                                                  | {fix}          |
| 3   | ⬜ Open | 🟡 Medium   | {short title} | `file:line` | {codenames} | {reasoning}                                                  | {fix}          |
| ... | ...     | ...         | ...           | ...         | ...         | ...                                                          | ...            |
| N   | ⬜ Open | 🟢 Low      | {short title} | `file:line` | {codenames} | {reasoning}                                                  | {fix}          |

Priority mapping:

- 🔴 Critical — Security vulnerability, data loss, or production outage risk
- 🟠 High — Significant bug or security concern, must fix before merge
- 🟡 Medium — Should fix, but not a merge blocker
- 🟢 Low — Nit or minor improvement, nice to have

Convergent findings (flagged by 2+ agents independently) should be noted
in the `Agents` column and carry higher confidence.
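
One possible way to build the findings rows is sketched below. It makes a strong simplifying assumption: two findings are treated as the same issue when they share the same `file:line` location, whereas real convergence analysis may need judgment about whether the agents describe the same problem. The `(convergent)` marker and the function name `findings_table` are hypothetical, and the Reasoning and Suggested fix columns are omitted for brevity.

```python
from typing import Dict, Iterable, List, Tuple

# Sort rank and display label for each severity, per the priority mapping.
PRIORITY = {
    "CRITICAL": (0, "🔴 Critical"),
    "HIGH": (1, "🟠 High"),
    "MEDIUM": (2, "🟡 Medium"),
    "LOW": (3, "🟢 Low"),
}


def findings_table(raw: Iterable[Tuple[str, str, str, str]]) -> List[str]:
    """Merge (agent, severity, location, title) tuples into sorted table rows."""
    grouped: Dict[str, dict] = {}
    for agent, severity, location, title in raw:
        entry = grouped.setdefault(
            location, {"agents": [], "severity": severity, "title": title}
        )
        if agent not in entry["agents"]:
            entry["agents"].append(agent)
        # Keep the highest severity (and its title) reported for a location.
        if PRIORITY[severity][0] < PRIORITY[entry["severity"]][0]:
            entry["severity"], entry["title"] = severity, title

    # sorted() is stable, so equal priorities keep their first-seen order.
    ordered = sorted(grouped.items(), key=lambda kv: PRIORITY[kv[1]["severity"]][0])
    rows = []
    for number, (location, entry) in enumerate(ordered, start=1):
        agents = ", ".join(entry["agents"])
        if len(entry["agents"]) >= 2:
            agents += " (convergent)"  # hypothetical convergence marker
        label = PRIORITY[entry["severity"]][1]
        rows.append(
            f"| {number} | ⬜ Open | {label} | {entry['title']} | `{location}` | {agents} |"
        )
    return rows
```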
