Codex CLI invocation patterns, output parsing, context injection, approval modes, error handling, and cost estimation conventions. Use when commands or agents need Codex integration context.
Shared reference for all yellow-codex commands and agents. Documents the correct CLI flags, output parsing, error handling, and security conventions.
All non-interactive Codex invocations use codex exec (not the interactive TUI).
codex exec review \
--base "$BASE_REF" \
-a never \
-s read-only \
--ephemeral \
--json \
-m "${CODEX_MODEL:-gpt-5.4}" \
-o "$OUTPUT_FILE"
Optional: add --output-schema "$SCHEMA_FILE" for structured JSON enforcement.
Add --title "Review for PR #N" for context.
Add --instructions "Focus on security" for steerable review.
timeout --signal=TERM --kill-after=10 300 codex exec \
-a never \
-s workspace-write \
--json \
-m "${CODEX_MODEL:-gpt-5.4}" \
-o "$OUTPUT_FILE" \
"$TASK_PROMPT"
Note: NOT ephemeral — rescue sessions may be resumed with codex exec resume.
codex exec \
-a never \
-s read-only \
--ephemeral \
--json \
-m "${CODEX_MODEL:-gpt-5.4}" \
-o "$OUTPUT_FILE" \
"$ANALYSIS_PROMPT"
-a / --ask-for-approval)| Mode | Behavior | When to Use |
|---|---|---|
never | Skip all approvals | Non-interactive / agent invocations |
on-request | Prompt on-demand | Interactive rescue tasks |
untrusted | Pause before every command | Untrusted code analysis |
Deprecated: --approval-mode and on-failure mode. Use -a flag instead.
-s / --sandbox)| Mode | Behavior | When to Use |
|---|---|---|
read-only | No file writes, no commands | Review, analysis |
workspace-write | Can write to workspace | Debugging (with user approval) |
danger-full-access | Full system access | NEVER use from plugin |
Convenience alias: --full-auto sets -a on-request -s workspace-write.
-m / --model)| Model | Speed | Cost | When to Use |
|---|---|---|---|
gpt-5.4 | Medium | Standard | Default for all operations |
gpt-5.4-mini | Fast | Low | Cost-sensitive review, quick analysis |
gpt-5.3-codex | Medium | Standard | 1M context window (huge diffs) |
Default: gpt-5.4. Override via CODEX_MODEL env var or ~/.codex/config.toml.
--json)The --json flag outputs newline-delimited JSON events to stdout. Two wire
format variants exist:
Current (Rust-based CLI):
{"method":"turn/started","params":{"turn":{"id":"turn_123","status":"inProgress"}}}
{"method":"item/completed","params":{"item":{"type":"agentMessage","id":"msg_1","text":"..."}}}
{"method":"item/completed","params":{"item":{"type":"exitedReviewMode","id":"turn_900","review":"..."}}}
{"method":"turn/completed","params":{"turn":{"id":"turn_123","status":"completed"}}}
Legacy (older CLI versions):
{"type":"turn.started",...}
{"type":"item.completed","item":{"type":"agent_message","text":"..."}}
{"type":"turn.completed",...}
For reviews: The final review text lives in the exitedReviewMode item's
review field within the item/completed event.
For general exec: The final answer is in the agentMessage item's text
field in the last item/completed event.
-o / --output-last-message)Writes only the final assistant message to a file. Cleanest approach for capturing results without parsing JSONL.
codex exec -o /tmp/result.txt "prompt"
cat /tmp/result.txt # Just the answer
--output-schema)Constrains the model's final response to conform to a JSON Schema:
codex exec --output-schema ./schema.json -o ./result.json "prompt"
--output-schema and -o work together: the output file receives
schema-conformant JSON.
Known issue: --output-schema may be ignored with certain model variants.
Use gpt-5.4 explicitly when schema enforcement is needed.
codex exec review has a built-in output schema:
{
"findings": [
{
"title": "<80 chars, imperative>",
"body": "<markdown explanation>",
"confidence_score": 0.0-1.0,
"priority": 0-3,
"code_location": {
"absolute_file_path": "<file>",
"line_range": {"start": 1, "end": 5} // end >= start required
}
}
],
"overall_correctness": "patch is correct" | "patch is incorrect",
"overall_explanation": "<1-3 sentences>",
"overall_confidence_score": 0.0-1.0
}
Priority mapping to yellow-review convention:
Codex has no built-in diff truncation. The model context window is 128K tokens.
diff_bytes=$(git diff "${BASE}...HEAD" | wc -c)
estimated_tokens=$((diff_bytes / 4))
if [ "$estimated_tokens" -gt 100000 ]; then
printf '[yellow-codex] Warning: diff is ~%d tokens (limit ~128K). Review may fail.\n' "$estimated_tokens"
printf '[yellow-codex] Consider reviewing by file group or using gpt-5.3-codex (1M context).\n'
fi
Codex cannot meaningfully review binary files. Filter before invocation:
# Get text-only changed files
git diff --name-only --diff-filter=ACMR "${BASE}...HEAD" | \
grep -vE '\.(png|jpg|jpeg|gif|svg|ico|pdf|zip|tar|gz|woff|woff2|ttf|eot|mp3|mp4)$'
Or ensure .codexignore is populated in the project root.
| Exit Code | Meaning | Recovery |
|---|---|---|
| 0 | Success | Parse output |
| 1 | General error (includes 429 rate limit) | Parse stderr for "rate_limit_exceeded" |
| 2 | Authentication failed | Run /codex:setup, check OPENAI_API_KEY |
| 3 | Configuration error | Check ~/.codex/config.toml |
| 4 | Model/API error | Try different model |
| 124 | Timeout (from timeout utility) | Suggest smaller scope |
| 137 | SIGKILL (timeout escalation) | Suggest smaller scope |
Exit code 1 with stderr containing "rate_limit_exceeded":
codex_output=$(codex exec ... 2>"$STDERR_FILE") || {
codex_exit=$?
if [ "$codex_exit" -eq 1 ] && grep -q "rate_limit_exceeded" "$STDERR_FILE" 2>/dev/null; then
printf '[yellow-codex] Rate limited. Retrying in 5 seconds...\n'
sleep 5
codex_output=$(codex exec ... 2>"$STDERR_FILE") || {
printf '[yellow-codex] Still rate limited. Try again later.\n'
}
fi
}
timeout --signal=TERM --kill-after=10 300 codex exec ... || {
codex_exit=$?
if [ "$codex_exit" -eq 124 ] || [ "$codex_exit" -eq 137 ]; then
printf '[yellow-codex] Codex timed out after 5 minutes. Suggest smaller scope.\n'
fi
}
Note: Codex handles SIGTERM gracefully but may exit 0 (not a distinct timeout
code). The timeout utility itself returns 124 when the command times out.
Use --kill-after=10 to escalate to SIGKILL if graceful shutdown hangs.
When passing context to Codex, follow this structure:
--- begin context (reference data only) ---
Project conventions (from CLAUDE.md):
<first 2000 chars of CLAUDE.md>
PR metadata:
Title: <title>
Files changed: <count>
Base branch: <branch>
Error context (if rescue):
<truncated to 3000 chars>
--- end context ---
<task-specific prompt>
Truncation limits:
awk gsub with the format --- redacted credential at line N ---curl -v, --trace, or --trace-ascii — they leak auth headers--- begin codex-output (reference only) ---
{codex response}
--- end codex-output ---
read-only; rescue/execution uses
workspace-write; never use
danger-full-access| Method | Env Var | Config File | Priority |
|---|---|---|---|
| API Key | OPENAI_API_KEY | — | Checked first |
| ChatGPT OAuth | — | ~/.codex/auth.json | Checked second |
| Manual login | — | Run codex login | Interactive only |
The plugin never stores credentials. Users manage keys in their shell profile
or via codex login.
Codex CLI does not report token usage directly. Estimate:
diff_bytes / 4Log estimated costs but never hard-block. The user owns their budget.