Run comprehensive test cases against a platform agent to verify all capabilities — filesystem, git, helm, platform API, plugins, compliance, and reporting
Run structured test cases against a platform agent to verify it has full access and all capabilities work correctly. Creates a task in the platform, assigns it to the target agent, monitors the run, then reviews the transcript for pass/fail.
/neb-test-agent [agent-name] [test-suite]
- agent-name: CEO, DevOps, HR, Marketing, Sales, Compliance, Executive (default: DevOps)
- test-suite: all, filesystem, git, helm, platform, plugins, compliance (default: all)

NEVER call the AI server directly via kubectl exec, port-forward, or localhost. ALWAYS use the platform-backend AI proxy. This is the only supported path for operator API calls.
All Paperclip API calls must go through the platform-backend AI proxy. The proxy path pattern:
${NEB_TASK_API_URL}/api/v1/orgs/${NEB_TASK_COMPANY_ID}/ai/{paperclip-path}
Construct the base URL at the start of execution:
source "$(git rev-parse --show-toplevel)/.env"
AI_API="${NEB_TASK_API_URL}/api/v1/orgs/${NEB_TASK_COMPANY_ID}/ai"
AUTH_HEADER="Authorization: Bearer ${NEB_TASK_API_KEY}"
Then all API calls use ${AI_API} as the base:
- ${AI_API}/agents/me → proxied to Paperclip /api/agents/me
- ${AI_API}/companies/${NEB_TASK_COMPANY_ID}/issues → proxied to /api/companies/{id}/issues
- ${AI_API}/companies/${NEB_TASK_COMPANY_ID}/agents → proxied to /api/companies/{id}/agents
- ${AI_API}/health → proxied to /api/health

The JWT (issuer: paperclip) is passed through by platform-backend without re-validation; Paperclip validates it on the upstream side.
The agent must:
- Read CLAUDE.md from the main nebinfra repo (~/sources/nebinfra/CLAUDE.md)
- Read package.json from the oss/ directory (~/oss/paperclip/package.json)
- Read a submodule chart file (nebcore/helm-library/charts/k8s/Chart.yaml)
- List files in business/business-operations/
- List files in gitops/gitops-platform-shared/
- Create a file (/tmp/agent-test-{timestamp}.txt) and verify it exists
- Read .claude/references/paperclip-api.md

The agent must:
- Run git status in the main nebinfra repo
- Run git log --oneline -3 to see recent commits
- Run git submodule status | wc -l to count submodules (expect 34)
- Run git branch --show-current to check the current branch
- Run git -C nebcore/helm-library log --oneline -1 to verify submodule access
- Run git config user.name and git config user.email to verify identity

The agent must:
- helm version --short (expect v4.x)
- kubectl version --client (expect v1.x)
- go version (expect go1.x)
- node --version (expect v2x)
- gh --version (expect gh 2.x)
- python3 --version (expect 3.x)
- pnpm --version (expect 10.x)

The agent must (using its platform credentials):
POST /api/plugins/nebinfra.platform-integration/data/search

Invoke /neb-helm-validate for helm testing scenarios.
The agent must:
- Run /neb-helm-validate on a chart directory (e.g., nebcore/helm-modules-apps/charts/argo-cd/) and verify it passes
- Verify the /neb-helm-validate output includes lint and dependency update results
- Read helm-charts-releases.yaml and find a chart version
- Run helm template on a small chart and verify the output contains YAML

The agent must:
The agent must:
source "$(git rev-parse --show-toplevel)/.env"
# Construct AI proxy base URL
# Platform-backend proxies: /api/v1/orgs/{companyId}/ai/* → Paperclip /api/*
AI_API="${NEB_TASK_API_URL}/api/v1/orgs/${NEB_TASK_COMPANY_ID}/ai"
AUTH="Authorization: Bearer ${NEB_TASK_API_KEY}"   # use as: curl -H "$AUTH" ...
First verify the AI server is reachable:
curl -sS "${AI_API}/health" -H "Authorization: Bearer ${NEB_TASK_API_KEY}"
Then look up the target agent by name from the company agents list:
curl -sS "${AI_API}/companies/${NEB_TASK_COMPANY_ID}/agents" \
-H "Authorization: Bearer ${NEB_TASK_API_KEY}"
Get the agent ID from the matching agent.
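The lookup can be done with jq (assumed available). The response shape used here, a top-level `agents` array with `id` and `name` fields, is an assumption for illustration, so adjust the filter to the actual payload:

```shell
# Illustrative payload -- the field names ("agents", "id", "name") are assumptions.
AGENTS_JSON='{"agents":[{"id":"agt-123","name":"DevOps"},{"id":"agt-456","name":"HR"}]}'

AGENT_NAME="DevOps"
AGENT_ID=$(printf '%s' "$AGENTS_JSON" \
  | jq -r --arg n "$AGENT_NAME" '.agents[] | select(.name == $n) | .id')
echo "$AGENT_ID"
```

In the real flow, pipe the curl output from the company agents endpoint into the same jq filter.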
Invoke /superpowers:test-driven-development to define test suites with assertions and expected outcomes.
curl -sS -X POST "${AI_API}/companies/${NEB_TASK_COMPANY_ID}/issues" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${NEB_TASK_API_KEY}" \
-d '{
"title": "[Agent Test] <agent-name>: Comprehensive capability verification",
"description": "<full test spec from selected suites>",
"status": "todo",
"priority": "high",
"assigneeAgentId": "<agent-id>"
}'
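If the create call returns the new issue as JSON, capture its ID for the later comment and transcript steps. The top-level `id` field name is an assumption:

```shell
# Illustrative create response -- "id" as the field name is an assumption.
CREATE_RESPONSE='{"id":"iss-789","title":"[Agent Test] DevOps: Comprehensive capability verification","status":"todo"}'
ISSUE_ID=$(printf '%s' "$CREATE_RESPONSE" | jq -r '.id')
echo "issue: $ISSUE_ID"
```

Given the known 500-on-writes issue described below, the response body may be an error even when the issue persisted; in that case, fall back to listing issues and matching by title.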
Invoke /superpowers:dispatching-parallel-agents when triggering agent heartbeat for test execution.
Poll heartbeat runs until a new run appears for this agent and completes (succeeded or failed). Timeout: 10 minutes.
curl -sS "${AI_API}/companies/${NEB_TASK_COMPANY_ID}/heartbeat-runs?limit=1" \
-H "Authorization: Bearer ${NEB_TASK_API_KEY}"
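A minimal polling-loop sketch; `poll_run` is a hypothetical helper name, and the `.status` jq path in the usage comment is an assumption about the heartbeat-runs response shape:

```shell
# Poll a status-producing command until a terminal state, or time out.
# Terminal states per this spec: succeeded / failed. Timeout: 10 minutes.
poll_run() {
  local deadline=$(( $(date +%s) + 600 ))
  local status
  while [ "$(date +%s)" -lt "$deadline" ]; do
    status=$("$@")
    case "$status" in
      succeeded|failed) echo "$status"; return 0 ;;
    esac
    sleep 15
  done
  echo "timeout"
  return 1
}

# Real usage (the .status field name is an assumption):
# poll_run sh -c "curl -sS \"${AI_API}/companies/${NEB_TASK_COMPANY_ID}/heartbeat-runs?limit=1\" \
#   -H \"Authorization: Bearer ${NEB_TASK_API_KEY}\" | jq -r '.[0].status // empty'"
```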
Fetch the run transcript via:
curl -sS -X POST "${AI_API}/plugins/nebinfra.platform-integration/data/run-transcript" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${NEB_TASK_API_KEY}" \
-d '{"params": {"runId": "<run-id>", "companyId": "<company-id>"}}'
Fetch the agent's posted comments:
curl -sS "${AI_API}/issues/<issue-id>/comments" \
-H "Authorization: Bearer ${NEB_TASK_API_KEY}"
Look for the structured test report in the comments.
Invoke /superpowers:verification-before-completion to review the agent transcript and test output before declaring pass/fail.
Parse the agent's report comment for:
Print a summary:
=== Agent Test Results: DevOps ===
Suite | Pass | Fail | Skip
-------------------|------|------|-----
Filesystem | 8/8 | 0 | 0
Git Operations | 6/6 | 0 | 0
Tool Access | 7/7 | 0 | 0
Platform API | 8/8 | 0 | 0
Helm Operations | 5/5 | 0 | 0
Compliance | 4/4 | 0 | 0
Reporting | 3/3 | 0 | 0
-------------------|------|------|-----
TOTAL | 41/41| 0 | 0
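A per-suite tally like the table above can be derived from the agent's markdown report rows, for example with awk (a sketch using illustrative data):

```shell
# Sample rows in the agent's posted report format (illustrative data).
cat > /tmp/agent-report.md <<'EOF'
| Suite | Check | Result | Notes |
|---|---|---|---|
| Filesystem | Read CLAUDE.md | PASS | ok |
| Git Operations | git status | PASS | clean |
| Git Operations | git log | FAIL | not a repo |
EOF

# Count results per (suite, result) pair; columns 2 and 4 are Suite and Result.
awk -F'|' 'NR > 2 && NF >= 5 {
  suite=$2; result=$4
  gsub(/^[ \t]+|[ \t]+$/, "", suite)
  gsub(/^[ \t]+|[ \t]+$/, "", result)
  counts[suite "," result]++
} END { for (k in counts) print k, counts[k] }' /tmp/agent-report.md
```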
For any FAIL results, inspect the run via get-run-transcript.

The task assigned to the agent uses this template:
## Objective
Run the NebInfra agent capability test suite. Execute each check, record PASS/FAIL, and post a structured report.
## Instructions
For each check below, attempt the operation and record the result as PASS or FAIL with a brief note.
### Filesystem Access
1. Read ~/sources/nebinfra/CLAUDE.md — first 3 lines
2. Read ~/oss/paperclip/package.json — extract "name" field
3. Read nebcore/helm-library/charts/k8s/Chart.yaml — extract version
4. List files in business/business-operations/ (first 5)
5. List files in gitops/gitops-platform-shared/ (first 5)
6. List files in your workspace root
7. Create /tmp/agent-test-{timestamp}.txt and verify
8. Read .claude/references/paperclip-api.md — first 3 lines
### Git Operations
1. git status (in nebinfra root)
2. git log --oneline -3
3. git submodule status | wc -l (expect ~34)
4. git branch --show-current
5. git -C nebcore/helm-library log --oneline -1
6. git config user.name && git config user.email
### Tool Access
1. helm version --short
2. kubectl version --client
3. go version
4. node --version
5. gh --version
6. python3 --version
7. pnpm --version
### Platform API
1. Fetch your own agent details via API
2. Fetch this task's heartbeat-context
3. Post a "test in progress" comment on this task
4. List all company agents
5. List issues assigned to you
6. Search issues for "test" via search plugin
7. Create a subtask "Test subtask" under this task
8. Mark the subtask done and delete it
### Helm Operations
1. Invoke /neb-helm-validate on nebcore/helm-modules-apps/charts/argo-cd and verify it passes
2. Verify /neb-helm-validate output includes lint and dependency update results
3. Read helm-charts-releases.yaml — find argo-cd version
4. Read nebcore/helm-modules-apps/charts/argo-cd/Chart.yaml — extract version
5. helm template test nebcore/helm-modules-apps/charts/argo-cd | head -20
### Compliance
1. Entity lookup: USA (expect "Nebinfra Technologies Inc")
2. Entity lookup: India (expect "Altisnebinfra Technologies Pvt Ltd")
3. Employment bond check: USA document with bond clause (expect VIOLATION)
4. Employment bond check: India document with bond clause (expect PASS)
### Reporting
Post a comment with this format:
| Suite | Check | Result | Notes |
|---|---|---|---|
| Filesystem | Read CLAUDE.md | PASS/FAIL | first line content or error |
| ... | ... | ... | ... |
Then set this task to done (if all pass) or blocked (if any fail).
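As an aside, the Tool Access checks in the template can be batch-run with a small helper; `check` is a hypothetical name, not part of the platform:

```shell
# Run a command; print PASS with its first output line, or FAIL with the error.
check() {
  local label="$1"; shift
  local out
  if out=$("$@" 2>&1); then
    printf 'PASS  %-8s %s\n' "$label" "$(printf '%s\n' "$out" | head -1)"
  else
    printf 'FAIL  %-8s %s\n' "$label" "$out"
  fi
}

check helm    helm version --short
check kubectl kubectl version --client
check go      go version
check node    node --version
check gh      gh --version
check python3 python3 --version
check pnpm    pnpm --version
```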
Before running agent capability tests, verify the operator can reach the platform:
source "$(git rev-parse --show-toplevel)/.env"
AI_API="${NEB_TASK_API_URL}/api/v1/orgs/${NEB_TASK_COMPANY_ID}/ai"
AUTH="Authorization: Bearer ${NEB_TASK_API_KEY}"
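The endpoint checks below can be scripted with a helper that treats anything but HTTP 200 as failure; `preflight` is a hypothetical name used for this sketch:

```shell
# Probe one endpoint and report its HTTP status; non-200 is a failure.
preflight() {
  local url="$1" code
  if [ -n "${2:-}" ]; then
    code=$(curl -sS -o /dev/null -w '%{http_code}' -H "$2" "$url")
  else
    code=$(curl -sS -o /dev/null -w '%{http_code}' "$url")
  fi
  printf '%s %s\n' "$code" "$url"
  [ "$code" = "200" ]
}

# Usage against a live platform (stop on the first failure):
#   preflight "${NEB_TASK_API_URL}/healthz"         || exit 1
#   preflight "${AI_API}/health"      "$AUTH"       || exit 1
#   preflight "${AI_API}/agents/me"   "$AUTH"       || exit 1
```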
- GET ${NEB_TASK_API_URL}/healthz - platform-backend liveness (no auth)
- GET ${AI_API}/health - AI server health via proxy (JWT auth)
- GET ${AI_API}/agents/me - agent identity (JWT auth)
- GET ${AI_API}/companies/${NEB_TASK_COMPANY_ID}/agents - list agents
- GET ${AI_API}/companies/${NEB_TASK_COMPANY_ID}/dashboard - company health

If any fail, STOP and diagnose. Common issues:
Reference: .claude/memory/2026-04-04-ai-agent-e2e-findings.md
| Issue | Severity | Impact | Root Cause |
|---|---|---|---|
| Process adapter missing PAPERCLIP_RUN_ID | CRITICAL | Heartbeats can't start | buildPaperclipEnv() omits runId; process adapter doesn't add it |
| All write ops return HTTP 500 | CRITICAL | Issue creation/comments return errors despite persisting | Post-mutation lifecycle error in Paperclip |
| Checkout leaves checkoutRunId=null | CRITICAL | Blocks all mutations on in_progress issues | Checkout 500s during processing, runId not stored |
| Agent stuck in "error" status | HIGH | Agent won't accept new heartbeats | Failed heartbeat sets error status permanently |
| Agent runtime-state requires Board access | MEDIUM | Agent JWT can't read own runtime state | Permission gate too strict |
| Cost plugin degraded | MEDIUM | Cost queries may fail | OpenCost service unreachable |
Workaround for 500-on-writes: Despite the 500, data IS persisted. You can verify via GET after a failed POST/PATCH. For E2E testing, treat 500 as "write succeeded with lifecycle error" and verify via read.
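That verify-after-write pattern can be wrapped in a helper; `post_then_verify` is a hypothetical name, and it deliberately ignores the write's exit status before checking the read path:

```shell
# POST may return 500 even when the write persisted, so never trust the status:
# attempt the write, ignore its result, then confirm via a read.
post_then_verify() {
  local write_cmd="$1" read_cmd="$2" expect="$3"
  eval "$write_cmd" >/dev/null 2>&1 || true   # a 500 here is expected
  eval "$read_cmd" | grep -q "$expect"
}

# Usage sketch (endpoints from this document; the grep target is whatever you wrote):
# post_then_verify \
#   "curl -sS -X POST '${AI_API}/companies/${NEB_TASK_COMPANY_ID}/issues' -H \"$AUTH_HEADER\" -H 'Content-Type: application/json' -d '{...}'" \
#   "curl -sS '${AI_API}/companies/${NEB_TASK_COMPANY_ID}/issues' -H \"$AUTH_HEADER\"" \
#   '[Agent Test]'
```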
Workaround for ownership conflict: Avoid using checkout when running from operator JWT. Instead, use PATCH to change status directly (also returns 500 but persists).
| Superpowers Skill | When to Use |
|---|---|
/superpowers:test-driven-development | Core function — defines test suites with assertions and validates results |
/superpowers:verification-before-completion | Before declaring pass/fail — review agent transcript and test output |
/superpowers:dispatching-parallel-agents | When triggering agent heartbeat for test execution |