Use when the user explicitly asks for a desktop or system screenshot (full screen, specific app or window, or a pixel region), or when tool-specific capture capabilities are unavailable and an OS-level capture is needed.
Encourage use of multiple agents/subagents when it is likely to improve speed, quality, or confidence.
Split work into clear packets with owners, inputs, acceptance checks, and a synthesis step when parallelizing.
Use single-agent execution when scope is small or coordination overhead outweighs gains.
Proactive autonomy and knowledge compounding
Be proactive: immediately take the next highest-value in-scope action when it is clear.
Default to autonomous execution: do not pause for confirmation between normal in-scope steps.
Request user input only when absolutely necessary: ambiguous requirements, material risk tradeoffs, missing required data/access, or destructive/irreversible actions outside policy.
If blocked by command/tool/env failures, attempt high-confidence fallbacks autonomously before escalating (for example helper script -> direct OS command, python -> , app-name capture -> window-id capture).
Related Skills
python3
When the workflow uses plan/, ensure required plan directories exist before reading/writing them (create when edits are allowed; otherwise use an in-memory fallback and call it out).
Treat transient external failures (network/SSH/remote APIs/timeouts) as retryable by default: run bounded retries with backoff and capture failure evidence before concluding blocked.
On repeated invocations for the same objective, resume from prior findings/artifacts and prioritize net-new progress over rerunning identical capture steps unless verification requires reruns.
Drive work to complete outcomes with verification, not partial handoffs.
Treat iterative execution as the default for non-trivial work; run adaptive loop passes. Example loops (adapt as needed, not rigid): capture preflight -> capture -> inspect -> adjust -> recapture; compare capture -> inspect -> contrast -> annotate/report; docs audit -> update -> verify -> re-audit.
Keep looping until actual completion criteria are met: no actionable in-scope next step remains, verification is green, and confidence is high.
Run organise-docs frequently during execution to capture durable decisions and learnings, not only at the end.
Create small checkpoint commits frequently with git-commit when changes are commit-eligible, checks are green, and repo policy permits commits.
Never squash commits; always use merge commits when integrating branches.
Prefer simplification over added complexity: aggressively remove bloat, redundancy, and over-engineering while preserving correctness.
When you touch code, leave the touched area in a better state than you found it: clearer, simpler, tidier, and at least as performant unless the task requires an explicit trade-off.
Use simple, plain English in user messages, docs, notes, reports, code comments, and other explanatory writing. Avoid jargon, fancy wording, and complex phrasing. When a technical term is needed for correctness, explain it in simple words the first time. Default to short user-facing responses. Think about what the user most wants to know, and lead with that. Do not dump every detail by default. Always include important changes, blockers, verification gaps, and any important assumptions, nuances, principles, or decisions that shaped the work. Add more detail only when the user asks for it or when uncertainty or risk makes it necessary.
Compound knowledge continuously: keep docs/ accurate and up to date, and promote durable learnings and decisions from work into docs.
Long-task checkpoint cadence
For any non-trivial task (including long efforts), run recurring checkpoint cycles instead of waiting for a single end-of-task wrap-up.
At each meaningful milestone with commit-eligible changes, and at least once per major phase, invoke git-commit to create a small logical checkpoint commit once relevant checks are green and repo policy permits commits.
At the same cadence, invoke organise-docs whenever durable learnings/decisions appear, and prune stale plan/ scratch artifacts.
If either checkpoint is blocked (for example failing checks or low-confidence documentation), resolve or record the blocker immediately and retry before expanding scope.
Terminal state contract (must follow)
The skill is complete only when all of the following are true:
Objective completion: the user-requested screenshot or capture outcome is achieved, or explicitly marked blocked with concrete blocker evidence.
Workflow completion: every required workflow step is resolved as done, blocked, or not-applicable, with brief evidence or rationale.
Step-level terminal completion: each numbered subtask must have explicit completion evidence (artifact, command output, or written rationale) before advancing.
Verification completion: required checks/validations for this skill are executed, or any unavailable checks are explicitly called out with impact.
Findings completion (where applicable): report only evidence-backed findings; if no high-confidence critical findings are present, explicitly state that.
Loop completion: no actionable in-scope next step remains under the current objective.
Stop only after this terminal contract is satisfied; otherwise continue iterating.
Terminal state examples (adapt to skill)
done: requested screenshot(s) are captured to the right location and the saved path(s) are reported.
blocked: progress cannot continue after bounded retries because the OS capture tool, permission, or requested window/app target is unavailable; blocker evidence and exact unblock action are reported.
not-applicable: an OS-specific branch is explicitly skipped with reason.
Overview
Use this skill for desktop- or OS-level screenshots when the user explicitly asks for a screenshot, when whole-system capture is needed, or when tool-specific capture paths cannot access the target cleanly.
Supported host platforms are macOS and Linux only. Windows is intentionally unsupported.
Follow these save-location rules every time:
If the user specifies a path, save there.
If the user asks for a screenshot without a path, save to the OS default screenshot location.
If Codex needs a screenshot for its own inspection, save to the temp directory.
Tool priority
Prefer tool-specific screenshot capabilities when available (for example: a Figma MCP/skill for Figma files, or Playwright/agent-browser tools for browsers and Electron apps).
Use this skill when explicitly asked, for whole-system desktop captures, or when a tool-specific capture cannot get what you need.
Otherwise, treat this skill as the default for desktop apps without a better-integrated capture tool.
On macOS, run the preflight helper once before window/app capture. It checks
Screen Recording permission, explains why it is needed, and requests it in one
place.
The helpers route Swift's module cache to $TMPDIR/codex-swift-module-cache
to avoid extra sandbox module-cache prompts.
The script prints one path per capture. When multiple windows or displays match, it prints multiple paths (one per line) and adds suffixes like -w<windowId> or -d<display>. View each path sequentially with the image viewer tool, and only manipulate images if needed or requested.
Workflow examples
"Take a look at <App> and tell me what you see": capture to temp, then view each printed path in order.
"The design from Figma is not matching what is implemented": use a Figma MCP/skill to capture the design first, then capture the running app with this skill (typically to temp) and compare the raw screenshots before any manipulation.
Multi-display behavior
On macOS, full-screen captures save one file per display when multiple monitors are connected.
On Linux, full-screen captures use the virtual desktop (all monitors in one image); use --region to isolate a single display when needed.
Linux prerequisites and selection logic
The helper automatically selects the first available tool:
scrot
gnome-screenshot
ImageMagick import
If none are available, ask the user to install one of them and retry.
Coordinate regions require scrot or ImageMagick import.
--app, --window-name, and --list-windows are macOS-only. On Linux, use
--active-window or provide --window-id when available.
On macOS, run bash <path-to-skill>/scripts/ensure_macos_permissions.sh first to request Screen Recording in one place.
If you see "screen capture checks are blocked in the sandbox", "could not create image from display", or Swift ModuleCache permission errors in a sandboxed run, rerun the command with escalated permissions.
If macOS app/window capture returns no matches, run --list-windows --app "AppName" and retry with --window-id, and make sure the app is visible on screen.
If Linux region/window capture fails, check tool availability with command -v scrot, command -v gnome-screenshot, and command -v import.
If the host OS is neither macOS nor Linux, fail fast. Windows is intentionally unsupported.
If saving to the OS default location fails with permission errors in a sandbox, rerun the command with escalated permissions.
Always report the saved file path in the response.