Capture and analyze logs from tmux sessions while Model A/B is running or after it has completed. Monitor progress, check for errors, and extract test results.
Capture logs from the tmux session, analyze its running state, and extract the important information.
The tmux session name is passed via $ARGUMENTS. Defaults: model_a or model_b.
Options:
- --lines N: number of lines to capture (default: 100)
- --save path: save logs to a file
- --grep pattern: filter logs by pattern
- --compare session2: compare two sessions

Output format:

### Status: [RUNNING / COMPLETED / ERROR / NOT_FOUND]
### Summary
- Runtime: Xm Ys (estimated)
- Last line: [last output line]
- Test results: [if detected]
- Errors: [if any]
### Recent Logs (last N lines)
[logs content]
### Analysis
[Status assessment, issues detected, next steps]
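Before anything is captured, the $ARGUMENTS string has to be split into a session name and options. A minimal POSIX-sh sketch of that parsing follows; the function and variable names are illustrative, not part of the command spec:

```shell
#!/bin/sh
# Parse /get-logs style options out of a single argument string.
# Defaults mirror the spec above: 100 lines, no grep/save/compare.
parse_get_logs_args() {
  SESSION="" ; LINES=100 ; GREP_PAT="" ; SAVE_PATH="" ; COMPARE=""
  # Intentional word-splitting: $1 is the raw argument string.
  set -- $1
  while [ $# -gt 0 ]; do
    case "$1" in
      --lines)   LINES="$2";     shift 2 ;;
      --grep)    GREP_PAT="$2";  shift 2 ;;
      --save)    SAVE_PATH="$2"; shift 2 ;;
      --compare) COMPARE="$2";   shift 2 ;;
      *)         SESSION="$1";   shift ;;
    esac
  done
}

parse_get_logs_args "model_a --lines 20 --grep FAIL"
echo "session=$SESSION lines=$LINES grep=$GREP_PAT"
# -> session=model_a lines=20 grep=FAIL
```

Patterns containing spaces would need quoting-aware parsing; for the single-token patterns used in the examples below, this split is enough.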
Run:
tmux has-session -t <session_name> 2>/dev/null && echo "EXISTS" || echo "NOT_FOUND"
If NOT_FOUND:
tmux list-sessions

# Capture last N lines from buffer
tmux capture-pane -t <session_name> -p -S -<N>
If more than default buffer needed:
# Capture entire scrollback buffer
tmux capture-pane -t <session_name> -p -S -
If --grep specified:
tmux capture-pane -t <session_name> -p -S - | grep -E "<pattern>"
Determine model status from the logs:
- RUNNING: new output is still appearing at the tail and no completion marker is present
- COMPLETED: a completion marker or final summary (test results, exit message) at the tail
- ERROR: a traceback, fatal error, or non-zero exit message at the tail
- IDLE: no new output across repeated captures and no completion marker
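One way to implement these heuristics is a grep pass over the captured tail. The marker words below are assumptions and should be tuned to the models' actual output; IDLE requires comparing two captures taken some time apart, so it is not shown:

```shell
# Classify a captured log tail (read from stdin) into a coarse status.
# Error markers are checked first so a crashed run is not mistaken
# for a completed one just because earlier tests passed.
classify_status() {
  tail_text=$(cat)
  if printf '%s\n' "$tail_text" | grep -qE 'Traceback|Exception|FATAL|ERROR'; then
    echo ERROR
  elif printf '%s\n' "$tail_text" | grep -qE 'Done|Completed|passed|exited'; then
    echo COMPLETED
  else
    echo RUNNING
  fi
}

printf 'collecting tests\n25 passed in 4.2s\n' | classify_status   # -> COMPLETED
printf 'step 3/10 building\n' | classify_status                    # -> RUNNING
```

In the real flow the input would be `tmux capture-pane -t <session_name> -p | tail -n 20 | classify_status`.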
Auto-detect and extract:

Test Results:
- X passed, X failed, X errors
- PASS, FAIL, ERROR with test names
- pytest, jest, go test, cargo test output

Errors and Warnings:
- Error, error, ERROR, Exception, Warning, WARN

Runtime:
- real/user/sys or elapsed-time markers

Files Changed:
- git diff --stat, modified:, new file:

If --save is specified:

tmux capture-pane -t <session_name> -p -S - > <save_path>
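The test-result patterns listed above can be pulled out with a single grep over the capture. This sketch targets pytest-style summary lines only; the other formats (jest, go test, cargo test) would each need their own pattern:

```shell
# Extract pytest-style counts ("X passed", "Y failed", "Z errors")
# from logs on stdin and join them into one summary line.
extract_test_summary() {
  grep -Eo '[0-9]+ (passed|failed|errors?)' | paste -sd, - | sed 's/,/, /g'
}

printf 'collected 27 items\n25 passed, 2 failed in 3.1s\n' | extract_test_summary
# -> 25 passed, 2 failed
```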
Add metadata header:
# Logs captured from tmux session: <session_name>
# Captured at: <timestamp>
# Lines: <count>
# Status: <RUNNING/COMPLETED/ERROR>
---
<logs content>
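Writing the header plus logs can be done in one redirect group. In this sketch the logs are read from stdin so the source (a `tmux capture-pane` pipe, an existing file) stays decoupled from the formatting; the function name is illustrative:

```shell
# Prepend the metadata header to captured logs and write both to a file.
save_with_header() {
  session="$1"; status="$2"; out="$3"
  logs=$(cat)
  # tr strips leading spaces that BSD wc prints.
  count=$(printf '%s\n' "$logs" | wc -l | tr -d ' ')
  {
    echo "# Logs captured from tmux session: $session"
    echo "# Captured at: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    echo "# Lines: $count"
    echo "# Status: $status"
    echo "---"
    printf '%s\n' "$logs"
  } > "$out"
}

# Usage in the real flow:
# tmux capture-pane -t model_a -p -S - | save_with_header model_a RUNNING logs_a.txt
```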
When comparing 2 sessions:
### Comparison: Model A vs Model B
| Criteria | Model A | Model B |
|----------|---------|---------|
| Status | COMPLETED | RUNNING |
| Runtime | 5m 23s | 3m 10s (running) |
| Tests passed | 25/25 | 18/20 (2 failed) |
| Errors | 0 | 2 |
| Files changed | 5 | 7 |
### Key Differences
- Model A completed first with 0 errors
- Model B has 2 test failures: [test names]
- Model B changed more files (7 vs 5)
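Given two saved log files, the test rows of the comparison table can be derived mechanically. A sketch, assuming pytest-style summary lines (the function names and the 0-default for missing counts are my own choices):

```shell
# Pull the first "N passed" / "N failed" count out of a log file,
# defaulting to 0 when the pattern is absent.
count_passed() {
  n=$(grep -Eo '[0-9]+ passed' "$1" | head -n1 | cut -d' ' -f1)
  echo "${n:-0}"
}
count_failed() {
  n=$(grep -Eo '[0-9]+ failed' "$1" | head -n1 | cut -d' ' -f1)
  echo "${n:-0}"
}

# Emit markdown table rows comparing two saved log files.
compare_tests() {
  a="$1"; b="$2"
  echo "| Tests passed | $(count_passed "$a") | $(count_passed "$b") |"
  echo "| Tests failed | $(count_failed "$a") | $(count_failed "$b") |"
}
```

The output slots directly into the comparison table format above.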
/get-logs model_a
/get-logs model_a --lines 20
# ... wait a few minutes ...
/get-logs model_a --lines 20
/get-logs model_a --grep "PASS\|FAIL\|passed\|failed\|error"
/get-logs model_a --save workspace/329_.../turn_1/logs_a.txt
/get-logs model_b --save workspace/329_.../turn_1/logs_b.txt
/get-logs model_a --compare model_b
/get-logs model_b --grep "Traceback\|Exception\|Error"
In the checkpoint review flow:

1. Before collecting diffs, check that both models finished:
   /get-logs model_a
   /get-logs model_b
2. Save logs as evidence (they serve as execution_evidence):
   /get-logs model_a --save workspace/.../turn_N/logs_a.txt
3. Extract runtime, to get execution time for the evaluation:
   /get-logs model_a --grep "real\|user\|sys\|elapsed"
4. Debug model failures, to check why a model failed:
   /get-logs model_b --grep "error\|fail" --lines 50
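The `real` lines matched by the runtime grep can be converted to plain seconds for the evaluation. A sketch, assuming bash's `time` value format (`XmY.Zs`); the function name is illustrative:

```shell
# Convert a bash `time` value like "5m23.407s" to whole seconds.
real_to_seconds() {
  min=${1%m*}              # "5m23.407s" -> "5"
  sec=${1#*m}; sec=${sec%s}  # -> "23.407"
  awk -v m="$min" -v s="$sec" 'BEGIN { printf "%d\n", m * 60 + s }'
}

real_to_seconds 5m23.407s   # -> 323
```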
Tips:
- Increase the tmux scrollback so long runs are not truncated:
  tmux set-option -g history-limit 50000
- Mirror a command's output to a file while it streams:
  command 2>&1 | tee logs.txt