Run and debug integration tests for any agent in this plugin. Use when the user says "test the agent", "run tests for privacy-guard", "debug agent tests", "validate privacy guard", or after modifying any agent definition in agents/.
Run integration tests for any agent in this plugin. Each test spawns a
real claude --agent process against a temporary git repo with planted
PII and verifies the structured JSON output.
| Agent | Test dir | What it tests |
|---|---|---|
privacy-guard | tests/integration/privacy_guard/ | Pre-push scope: staged diffs, unstaged diffs, unpushed commits |
privacy-audit | tests/integration/privacy_audit/ | Full audit: git history, pattern detection across files and commits |
Use ./agent test — it handles venv setup, debug logging, filtering,
and parallelism. See docs/AGENT-CLI.md for full reference.
./agent test <agent-name> -k <test_name> --debug
Each test takes 15s-3min (spawns a real agent). Run one at a time to get fast feedback. Stop at the first failure.
./agent test <agent-name> --debug
./agent test <agent-name> -n 5
Only use parallel after individual tests pass. Parallel runs give no incremental feedback.
--debug writes two log files per test to /tmp/privacy-guard-tests/:
| File | Contents |
|---|---|
<repo-name>.log | Test harness log: commands run, raw agent output, parsed JSON |
<repo-name>.claude-debug.log | Claude internals: tool calls, model responses |
Watch in real time:
tail -f /tmp/privacy-guard-tests/*.log
Run cheap tests first to catch problems early:
test_missing_person_md — agent fails fast, ~15stest_clean_repo_no_findings — scans but finds nothingtest_email_in_staged_file — basic staged detectiontest_email_in_unstaged_change — unstaged detectiontest_email_in_unpushed_commit_diff — unpushed commit detectiontest_pii_in_unpushed_commit_message — commit message detectiontest_pushed_pii_not_found — scope boundary: pushed PII invisibletest_does_not_read_individual_files — scope boundary: no file readstest_untracked_files_trigger_warning — untracked file warningtest_completed_scan_has_required_fields — JSON structuretest_failed_scan_has_required_fields — failure JSON structuretest_scan_scope_reflects_what_was_checked — scan_scope accuracytest_via_plugin.py::test_agent_runs_via_plugin_dir — plugin discoveryCheck the harness log:
cat /tmp/privacy-guard-tests/<repo-name>.log
Is structured JSON present? Look for the privacy-guard-result
fenced block in the raw output. If missing, the agent didn't follow
its output instructions.
JSON present but wrong findings? Compare matched_value and
category fields against what was planted in the test fixture
(see conftest.py for fixture definitions).
Agent doing unexpected things? Check the Claude debug log:
cat /tmp/privacy-guard-tests/<repo-name>.claude-debug.log
Look for: tool calls the agent made, whether it read files it
shouldn't have, whether it chained commands with &&.
Fix the agent .md or the test fixture, not both at once.
test_via_plugin.py only