Automates interactions for Apple-platform apps (iOS, tvOS, macOS) and Android devices. Use when navigating apps, taking snapshots/screenshots, tapping, typing, scrolling, or extracting UI info across mobile, TV, and desktop targets.
Use this skill as a router with mandatory defaults. Read this file first. For normal device tasks, always load references/bootstrap-install.md and references/exploration.md before acting. Use bootstrap to confirm or establish deterministic setup. Use exploration for UI inspection, interaction, and verification once the app session is open.
snapshot when the task is to verify what text or structure is currently visible on screen.snapshot -i only when you need interactive refs such as @e3 for a requested action or targeted query. On iOS and Android, default snapshot output uses the same visible-first model: off-screen interactive content is exposed as discovery hints, not tappable refs.diff snapshot after a nearby mutation when you only need to know what changed.scroll <direction> and re-snapshot until the target is visible.@ref or selector targeting over raw coordinates.open -> inspect/act -> verify if needed -> close.snapshot if the goal is to read or verify what is visible.snapshot -i only if you need refs for interactive exploration or a requested action.get, is, or find before mutating the UI when a read-only command can answer the question.close.snapshot when you need to verify whether text is visible.snapshot -i mainly for interactive exploration and choosing refs.diff snapshot for compact post-action verification; use snapshot --diff when that alias is easier to discover from snapshot help.get, is, or find when they can answer the question without changing UI state.fill to replace text.type to append text.type @eN "text". Use fill @eN "text" to target a field directly, or press @eN then type "text" when the field already has focus and you want append semantics.keyboard dismiss over navigation. On iOS, keep an app session open first; keyboard status|get remains Android-only.back for predictable app-owned navigation and reserve back --system for platform back gestures or button semantics.type --delay-ms or fill --delay-ms for debounced search fields that drop characters when typed too quickly.bootstrap-install.md instead of improvising setup steps.@ref or selector targeting over raw coordinates.connect --remote-config, or tenant leases on a remote macOS host: references/remote-tenancy.md
This includes remote React Native runs where agent-device now prepares Metro locally and manages the local Metro companion tunnel automatically.