Interactively controls XCUITest through a CLI. Claude reads UI state and screenshots, decides actions, and executes commands via the CLI. Use for dynamic UI exploration, complex navigation flows, or when pre-scripted navigation isn't feasible.
Enables Claude to dynamically control an iOS app through XCUITest using a CLI that abstracts the file-based protocol. Unlike pre-scripted tests, this allows Claude to explore the UI, make decisions based on current state, and recover from unexpected situations.
Invoke this skill when you need to:
The skill will ask for your goal if not specified (e.g., "Navigate to Settings and enable Dark Mode").
Add the package to your project via SPM:
// In Package.swift or via Xcode:
.package(url: "https://github.com/gestrich/xcode-sim-automation.git", from: "1.0.0")
Create a UI test in your project's UI test target:
import XCTest
import XCUITestControl
final class InteractiveControlTests: XCTestCase {
@MainActor
func testInteractiveControl() throws {
let app = XCUIApplication()
app.launch()
InteractiveControlLoop().run(app: app)
}
}
Clone the xcode-sim-automation repo to get the CLI tool:
git clone https://github.com/gestrich/xcode-sim-automation.git
The CLI is the wrapper script at Tools/xcuitest-control within the cloned repo. It auto-builds the Swift CLI binary on first run and whenever source files change, so no manual build step is needed.
A Python fallback (Tools/xcuitest-control.py) is also available if the Swift toolchain isn't installed.
The Tools/xcuitest-control script provides a simple interface for controlling XCUITest. It accepts the same flags as the Python CLI:
# Tap a button
Tools/xcuitest-control tap --target submitButton --target-type button
# Scroll down
Tools/xcuitest-control scroll --direction down
# Type text
Tools/xcuitest-control type --value "Hello World"
# Adjust a slider to 75%
Tools/xcuitest-control adjust --target volumeSlider --value 0.75
# Pinch to zoom in (scale > 1.0)
Tools/xcuitest-control pinch --scale 2.0 --target imageView
# Wait 2 seconds
Tools/xcuitest-control wait --value 2.0
# Take screenshot
Tools/xcuitest-control screenshot
# Check status
Tools/xcuitest-control status
# Exit the test
Tools/xcuitest-control done
Each command returns JSON with paths to the latest hierarchy and screenshot:
{
"status": "completed",
"hierarchy": "/tmp/xcuitest-hierarchy.txt",
"screenshot": "/tmp/xcuitest-screenshot.png"
}
On error:
{
"status": "error",
"error": "Element 'missingButton' not found after waiting 10 seconds",
"hierarchy": "/tmp/xcuitest-hierarchy.txt",
"screenshot": "/tmp/xcuitest-screenshot.png"
}
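Because every reply carries a status field, a calling script can branch on it before reading the hierarchy or screenshot. The following is a minimal sketch, not part of the CLI: the helper names json_field and check_reply are hypothetical, and the sed-based extraction assumes small, flat JSON like the examples above with no escaped double quotes inside values.

```shell
# Extract a top-level string field from a JSON reply read on stdin.
# Assumption: flat JSON shaped like the documented examples.
json_field() {
    # $1 = field name
    tr -d '\n' | sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Return 0 when the reply reports "completed"; otherwise print the
# reply's error message (if any) to stderr and return 1.
check_reply() {
    reply=$1
    status=$(printf '%s' "$reply" | json_field status)
    if [ "$status" = "completed" ]; then
        return 0
    fi
    printf 'action failed: %s\n' "$(printf '%s' "$reply" | json_field error)" >&2
    return 1
}
```

A caller might capture a command's output in a variable and pass it to check_reply before deciding on the next action. If a real JSON parser is available, prefer it over the sed pattern.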
Run the interactive control test using xcodebuild:
xcodebuild test \
-workspace YourApp.xcworkspace \
-scheme "YourUITestScheme" \
-destination 'platform=iOS Simulator,name=iPhone 16 Pro' \
-only-testing:"YourUITestTarget/InteractiveControlTests/testInteractiveControl" &
Replace YourApp.xcworkspace, YourUITestScheme, and YourUITestTarget with your project's values.
The test will:
Poll until the hierarchy file exists:
while [ ! -f /tmp/xcuitest-hierarchy.txt ]; do sleep 1; done
Or use the status command:
Tools/xcuitest-control status
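The polling loop above waits forever if the build fails before the test starts. A bounded variant is sketched below; the function name wait_for_file and the 120-second default are arbitrary choices for illustration, not something the tool defines.

```shell
# Wait until a file exists, giving up after a timeout.
# $1 = path to wait for
# $2 = timeout in seconds (default 120, an arbitrary choice)
wait_for_file() {
    path=$1
    remaining=${2:-120}
    while [ ! -f "$path" ]; do
        if [ "$remaining" -le 0 ]; then
            printf 'timed out waiting for %s\n' "$path" >&2
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

For example, `wait_for_file /tmp/xcuitest-hierarchy.txt 300 || exit 1` stops the session after five minutes instead of hanging.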
Use the CLI to execute actions:
# Read current UI state
cat /tmp/xcuitest-hierarchy.txt
# View screenshot
# Read file: /tmp/xcuitest-screenshot.png
# Execute action
Tools/xcuitest-control tap --target settingsButton --target-type button
# View updated screenshot after action
# Read file: /tmp/xcuitest-screenshot.png
When the goal is achieved:
Tools/xcuitest-control done
Taps an element by identifier.
Tools/xcuitest-control tap --target submitButton --target-type button
Tools/xcuitest-control tap -t submitButton -T button
Tools/xcuitest-control tap --target Edit --target-type button --index 0
Options:
- --target, -t (required): Accessibility identifier of the element
- --target-type, -T (optional): Element type - button, staticText, cell, textField, slider, or any
- --index, -i (optional): 0-based index when multiple elements match. If omitted, taps the first hittable element.

Scrolls content in a direction (reveals content in that direction).
Important: The direction specifies where you want to scroll TO (what content to reveal), not the swipe gesture direction:
- --direction down = reveal content below (internally swipes up)
- --direction up = reveal content above (internally swipes down)
- --direction left = reveal content to the left (internally swipes right)
- --direction right = reveal content to the right (internally swipes left)

Tools/xcuitest-control scroll --direction down # Scroll down to see more content below
Tools/xcuitest-control scroll -d up --target scrollView # Scroll up to see content above
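The direction convention is easy to get backwards, so it can help to see it as a lookup. This is a sketch of the documented mapping only; swipe_for_scroll is a hypothetical helper and is not how the tool is necessarily implemented.

```shell
# Print the swipe gesture that a given --direction value performs,
# following the mapping described above.
swipe_for_scroll() {
    case $1 in
        down)  echo up ;;
        up)    echo down ;;
        left)  echo right ;;
        right) echo left ;;
        *)     printf 'unknown direction: %s\n' "$1" >&2; return 1 ;;
    esac
}
```

So `swipe_for_scroll down` prints `up`: asking to reveal content below moves a finger upward across the screen.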
Options:
- --direction, -d (required): up, down, left, or right - the direction to scroll content
- --target, -t (optional): Element to scroll. If omitted, scrolls the app.

Types text into a text field.
Tools/xcuitest-control type --value "user@example.com"
Tools/xcuitest-control type -V "Hello" --target usernameField
Options:
- --value, -V (required): Text to type
- --target, -t (optional): Text field to type into. If omitted, types into currently focused field.

Adjusts a slider to a normalized position (0.0 to 1.0).
Tools/xcuitest-control adjust --target volumeSlider --value 0.75
Tools/xcuitest-control adjust -t volumeSlider -V 0.5
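When a goal is phrased in the slider's own units (say, 30 on a 0 to 40 scale), the value has to be converted to a normalized position first. A minimal sketch of that arithmetic, assuming you already know the slider's minimum and maximum; normalized_position is a hypothetical helper, not a CLI feature.

```shell
# Convert a raw slider value to a normalized position for --value.
# $1 = raw value, $2 = slider minimum, $3 = slider maximum
# Results are clamped to the 0.0-1.0 range and printed with two decimals.
normalized_position() {
    awk -v v="$1" -v lo="$2" -v hi="$3" 'BEGIN {
        if (hi <= lo) { print "invalid range" > "/dev/stderr"; exit 1 }
        p = (v - lo) / (hi - lo)
        if (p < 0) p = 0
        if (p > 1) p = 1
        printf "%.2f\n", p
    }'
}
```

The result can be passed straight through, for example `--value "$(normalized_position 30 0 40)"`.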
Options:
- --target, -t (required): Accessibility identifier of the slider
- --value, -V (required): Normalized position between 0.0 (minimum) and 1.0 (maximum)

Examples:
- --value 0.0 - Move slider to minimum (left)
- --value 0.5 - Move slider to middle
- --value 1.0 - Move slider to maximum (right)

Pinches to zoom in or out on an element.
Tools/xcuitest-control pinch --scale 2.0 --target imageView
Tools/xcuitest-control pinch -s 0.5 -V 2.0
Options:
- --scale, -s (required): Scale factor
  - < 1.0 = pinch in (zoom out)
  - > 1.0 = pinch out (zoom in)
- --velocity, -V (optional): Speed in scale factor per second (default: 1.0)
- --target, -t (optional): Element to pinch. If omitted, pinches the app.

Examples:
- --scale 2.0 - Zoom in 2x
- --scale 0.5 - Zoom out to 50%
- --scale 1.5 --velocity 0.5 - Slow zoom in

Pauses for a specified duration.
Tools/xcuitest-control wait --value 2.0
Tools/xcuitest-control wait # defaults to 1.0 second
Options:
- --value, -V (optional): Seconds to wait. Defaults to 1.0.

Captures current state without performing any action.
Tools/xcuitest-control screenshot
Checks current command status without executing.
Tools/xcuitest-control status
Exits the test loop.
Tools/xcuitest-control done
When multiple elements share the same identifier (e.g., multiple "Edit" buttons in a list), the tap command:
- Without --index: Automatically finds and taps the first hittable element
- With --index N: Taps the element at the specified 0-based index

When a tap succeeds on one of multiple matches, the response includes info:
{
"status": "completed",
"info": "Tapped button at index 0 of 5 matches",
"hierarchy": "/tmp/xcuitest-hierarchy.txt",
"screenshot": "/tmp/xcuitest-screenshot.png"
}
When multiple elements match but none are hittable:
{
"status": "error",
"error": "Found 5 elements matching 'Edit', none were hittable. Specify --index 0 to 4 to select a specific element.",
"hierarchy": "/tmp/xcuitest-hierarchy.txt",
"screenshot": "/tmp/xcuitest-screenshot.png"
}
When the specified index exceeds available matches:
{
"status": "error",
"error": "Index 10 out of range. Found 5 'Edit' element(s). Use --index 0 to 4.",
"hierarchy": "/tmp/xcuitest-hierarchy.txt",
"screenshot": "/tmp/xcuitest-screenshot.png"
}
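The selection rules and the two error cases above can be modeled in a few lines. This is a sketch of the documented behavior, not the tool's source: choose_match is a hypothetical helper, the list of hittable flags stands in for what XCUITest would report, and the sketch assumes an explicit index is used as given without a hittability check.

```shell
# Decide which of several matching elements would be tapped.
# $1 = space-separated hittable flags, one per match in order (1 = hittable)
# $2 = optional 0-based index (the value of --index)
# Prints the chosen index, or an error on stderr with a non-zero status.
choose_match() {
    flags=$1
    requested=${2:-}
    count=0
    first_hittable=
    for flag in $flags; do
        if [ -z "$first_hittable" ] && [ "$flag" = "1" ]; then
            first_hittable=$count
        fi
        count=$((count + 1))
    done
    if [ -n "$requested" ]; then
        if [ "$requested" -lt 0 ] || [ "$requested" -ge "$count" ]; then
            printf 'index %s out of range, use 0 to %s\n' "$requested" "$((count - 1))" >&2
            return 1
        fi
        echo "$requested"
        return 0
    fi
    if [ -z "$first_hittable" ]; then
        printf 'found %s matches, none were hittable\n' "$count" >&2
        return 1
    fi
    echo "$first_hittable"
}
```

With five matches of which only the third and fourth are hittable, `choose_match "0 0 1 1 0"` prints `2`, while `choose_match "0 0 1 1 0" 10` fails with an out-of-range message.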
- Use --index when you know which element you want (e.g., the second Edit button)
- Check the info field to verify which element was tapped

When interacting with text fields, the keyboard will appear and may block other UI elements.
Tap on a non-interactive element that's visible above the keyboard:
Tools/xcuitest-control tap --target notesLabel --target-type staticText
Tips for dismissing the keyboard:
- Look for StaticText elements (labels) that are above the keyboard

Tap the text field first to focus it:
Tools/xcuitest-control tap --target searchBar --target-type any
Then type your text:
Tools/xcuitest-control type --value "Hello"
| Issue | Solution |
|---|---|
| Keyboard blocking elements | Tap a non-interactive label above the keyboard to dismiss |
| Element not hittable | The element may be behind the keyboard - dismiss keyboard first |
| Can't scroll | Keyboard may be intercepting gestures - dismiss it first |
| Text not appearing | Ensure the text field was tapped/focused before typing |
The hierarchy file shows the element tree with types, identifiers, and labels:
Application, pid: 12345, label: 'MyApp'
Window, 0x600000001234
Other, identifier: 'mainView'
Button, identifier: 'settingsButton', label: 'Settings'
StaticText, identifier: 'welcomeLabel', label: 'Welcome!'
Cell, identifier: 'item_1', label: 'First Item'
Slider, identifier: 'volumeSlider', value: '50%'
From this hierarchy:
- settingsButton is a Button → --target-type button
- welcomeLabel is a StaticText → --target-type staticText
- item_1 is a Cell → --target-type cell
- volumeSlider is a Slider → --target-type slider

Use --target-type any if unsure - it searches all element types.
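The same lookup can be scripted against the hierarchy file. The sketch below assumes each element sits on one line in the format shown in the sample (type, then a comma, then an identifier in single quotes) and that identifiers contain no regular-expression metacharacters; target_type_for is a hypothetical helper.

```shell
# Print a --target-type value for an identifier found in a hierarchy file.
# $1 = accessibility identifier, $2 = path to the hierarchy file
# Falls back to "any" for unlisted element types or missing identifiers.
target_type_for() {
    id=$1
    file=$2
    element=$(sed -n "s/^[[:space:]]*\([A-Za-z]*\), .*identifier: '$id'.*/\1/p" "$file" | head -n 1)
    case $element in
        Button)     echo button ;;
        StaticText) echo staticText ;;
        Cell)       echo cell ;;
        TextField)  echo textField ;;
        Slider)     echo slider ;;
        *)          echo any ;;
    esac
}
```

For example, `target_type_for settingsButton /tmp/xcuitest-hierarchy.txt` would print `button` for the sample hierarchy above.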
The skill enforces several limits to prevent runaway behavior:
| Setting | Value | Purpose |
|---|---|---|
| Maximum actions | 100 | Prevents infinite loops |
| Command timeout | 5 minutes | XCUITest exits if no commands received |
| Element timeout | 10 seconds | Actions fail gracefully if element not found |
| Tap retry count | 3 | Retries if element exists but not hittable |
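A driving script can keep its own count so it stops cleanly before hitting the action limit. This document does not show where the real limit is enforced, so treat the following as a client-side guard only. The names CONTROL_CLI, MAX_ACTIONS, and run_action are hypothetical.

```shell
# Client-side action budget mirroring the "Maximum actions" row above.
MAX_ACTIONS=100
actions_used=0

# Hypothetical variable: path to the wrapper script, overridable by the caller.
CONTROL_CLI=${CONTROL_CLI:-Tools/xcuitest-control}

# Run one CLI action, refusing once the budget is spent.
# Call it directly (not inside $(...)) so the counter persists.
run_action() {
    if [ "$actions_used" -ge "$MAX_ACTIONS" ]; then
        printf 'action limit of %s reached\n' "$MAX_ACTIONS" >&2
        return 1
    fi
    actions_used=$((actions_used + 1))
    "$CONTROL_CLI" "$@"
}
```

Usage would look like `run_action tap --target settingsButton --target-type button`, with the script sending done once run_action starts failing.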
On error, the CLI returns:
{
"status": "error",
"error": "Element 'missingButton' not found after waiting 10 seconds",
"hierarchy": "/tmp/xcuitest-hierarchy.txt",
"screenshot": "/tmp/xcuitest-screenshot.png"
}
When this happens:
| Error | Solution |
|---|---|
| Element not found | Check hierarchy for correct identifier, try --target-type any |
| Element not hittable | Wait for animations, scroll element into view, retry |
| Multiple matches, none hittable | Use --index to select specific element, or scroll to reveal hittable ones |
| Index out of range | Check hierarchy to count matches, use valid index (0 to N-1) |
| Wrong element type | Use --target-type any or check hierarchy for actual type |
| Action limit reached | Break goal into smaller steps, restart skill |
| Test timeout | XCUITest exited due to 5 min inactivity, restart test |
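The table can be turned into a small dispatcher that maps an error message to a next step. The patterns for "not found", "none were hittable", and "out of range" come from the example messages in this document; the "not hittable" pattern is a guess at wording the document does not show. suggest_fix is a hypothetical helper.

```shell
# Map a CLI error message to the remedy suggested in the table above.
# More specific patterns are listed first so they win over general ones.
suggest_fix() {
    case $1 in
        *"none were hittable"*) echo "use --index or scroll to reveal a hittable match" ;;
        *"out of range"*)       echo "count matches in the hierarchy and use an index from 0 to N-1" ;;
        *"not found"*)          echo "check the identifier in the hierarchy or try --target-type any" ;;
        *"not hittable"*)       echo "wait for animations, scroll the element into view, then retry" ;;
        *)                      echo "inspect the hierarchy and screenshot" ;;
    esac
}
```

Feeding it the error field of a failed reply gives a starting point; the hierarchy and screenshot remain the authoritative view of what went wrong.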
Goal: Navigate to the Settings view and explore
# 1. Start the test
xcodebuild test \
-workspace YourApp.xcworkspace \
-scheme "YourUITestScheme" \
-destination 'platform=iOS Simulator,name=iPhone 16 Pro' \
-only-testing:"YourUITestTarget/InteractiveControlTests/testInteractiveControl" &
# 2. Wait for initialization
while [ ! -f /tmp/xcuitest-hierarchy.txt ]; do sleep 1; done
# 3. Read initial state
cat /tmp/xcuitest-hierarchy.txt
# 4. Read the screenshot to see the current view
# Use the Read tool on /tmp/xcuitest-screenshot.png
# 5. Tap an element based on what you see
Tools/xcuitest-control tap --target someButton --target-type button
# 6. Read updated hierarchy and screenshot
cat /tmp/xcuitest-hierarchy.txt
# 7. Exit when done
Tools/xcuitest-control done
The CLI supports environment variable overrides for file paths:
| Variable | Default | Description |
|---|---|---|
| XCUITEST_COMMAND_PATH | /tmp/xcuitest-command.json | Path to command JSON file |
| XCUITEST_HIERARCHY_PATH | /tmp/xcuitest-hierarchy.txt | Path to hierarchy output |
| XCUITEST_SCREENSHOT_PATH | /tmp/xcuitest-screenshot.png | Path to screenshot output |
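Scripts that read the hierarchy or screenshot directly should resolve the same variables rather than hard-coding /tmp. A minimal sketch using shell default expansion; resolve_paths is a hypothetical helper, and it treats an empty variable the same as an unset one, which may or may not match the CLI.

```shell
# Resolve the three file paths, falling back to the documented defaults.
# Sets command_path, hierarchy_path, and screenshot_path in the caller's shell.
resolve_paths() {
    command_path=${XCUITEST_COMMAND_PATH:-/tmp/xcuitest-command.json}
    hierarchy_path=${XCUITEST_HIERARCHY_PATH:-/tmp/xcuitest-hierarchy.txt}
    screenshot_path=${XCUITEST_SCREENSHOT_PATH:-/tmp/xcuitest-screenshot.png}
}
```

After calling resolve_paths, a script can run `cat "$hierarchy_path"` and keep working when the paths are overridden.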
When using custom paths, also configure the Swift InteractiveControlLoop.Configuration to match:
let config = InteractiveControlLoop.Configuration(
commandPath: "/custom/path/command.json",
hierarchyPath: "/custom/path/hierarchy.txt",
screenshotPath: "/custom/path/screenshot.png"
)
InteractiveControlLoop(configuration: config).run(app: app)
For direct JSON manipulation, the CLI uses these files:
| File | Purpose |
|---|---|
| /tmp/xcuitest-command.json | Commands from Claude → XCUITest |
| /tmp/xcuitest-hierarchy.txt | UI hierarchy from XCUITest → Claude |
| /tmp/xcuitest-screenshot.png | Screenshot from XCUITest → Claude |
{
"action": "tap" | "scroll" | "type" | "wait" | "screenshot" | "adjust" | "pinch" | "done",
"target": "elementIdentifier",
"targetType": "button" | "staticText" | "cell" | "textField" | "slider" | "any",
"index": 0,
"value": "text to type (for type) or 0.0-1.0 (for adjust)",
"direction": "up" | "down" | "left" | "right",
"scale": "pinch scale factor (< 1.0 = zoom out, > 1.0 = zoom in)",
"velocity": "pinch speed in scale factor per second",
"status": "pending" | "executing" | "completed" | "error",
"errorMessage": "optional error description",
"info": "optional diagnostic info (e.g., which index was tapped)"
}
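For cases where the CLI is bypassed, a command file can be written by hand. The sketch below makes several assumptions that this document does not confirm: that a new command should carry status "pending", that unused fields may be omitted, and that identifiers need no JSON escaping. write_tap_command is a hypothetical helper; it writes to a temporary file and renames it so the reader never sees a half-written command.

```shell
# Write a tap command in the schema above.
# $1 = target identifier, $2 = target type, $3 = destination path
# Assumption: "pending" is the status a newly written command should carry.
write_tap_command() {
    tmp="$3.tmp"
    printf '{"action": "tap", "target": "%s", "targetType": "%s", "status": "pending"}\n' "$1" "$2" > "$tmp"
    # Rename into place so the file appears in one step.
    mv "$tmp" "$3"
}
```

For example, `write_tap_command settingsButton button /tmp/xcuitest-command.json`. Using the CLI remains the safer route, since it owns the details of the protocol.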
- If unsure of an element's type, use any
- Use the wait command if UI is animating
- Send the done command when finished
- Try --target-type any if specific type fails