Control Android devices via ADB commands - tap, swipe, type, navigate apps
This skill enables you to control Android devices connected via ADB (Android Debug Bridge). You act as both the reasoning and execution engine - reading the device's UI state directly and deciding what actions to take.
All scripts support the -s <serial> flag to target a specific device. This is essential when multiple devices are connected (e.g., a physical phone AND an emulator).
Run scripts/check-device.sh to see all connected devices:
Multiple devices connected (2):
[PHYSICAL] 1A051FDF6007PA - Pixel 6
[EMULATOR] emulator-5554 - sdk_gphone64_arm64
Use -s <serial> to specify which device to use.
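The PHYSICAL/EMULATOR labels above can be approximated directly from `adb devices` output. The following is a minimal sketch, not the source of scripts/check-device.sh: `classify_devices` is a hypothetical name, and it assumes that emulator serials start with `emulator-` and that anything else is a physical device (a device attached over TCP/IP would also be labelled PHYSICAL under this assumption).

```shell
#!/usr/bin/env bash
# Sketch only: label each ready device from `adb devices` output.
# Assumption: emulator serials start with "emulator-".

classify_devices() {
  # Reads `adb devices` output on stdin; prints "[TYPE] serial" per ready device.
  local serial state
  while read -r serial state _; do
    # Skips the "List of devices attached" header, blank lines, and
    # devices whose state is not "device" (e.g. offline, unauthorized).
    [ "$state" = "device" ] || continue
    case "$serial" in
      emulator-*) printf '[EMULATOR] %s\n' "$serial" ;;
      *)          printf '[PHYSICAL] %s\n' "$serial" ;;
    esac
  done
}

# Usage with a live connection (requires adb on PATH):
#   adb devices | classify_devices
```

Reading from stdin keeps the function usable on saved output as well as on a live `adb devices` call.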
When the user mentions:
- a physical phone (for example, "my phone"): use the [PHYSICAL] device
- an emulator: use the [EMULATOR] device

Once you identify the target device, pass -s <serial> to ALL subsequent scripts:
# Check specific device
scripts/check-device.sh -s 1A051FDF6007PA
# All actions on that device
scripts/get-screen.sh -s 1A051FDF6007PA
scripts/tap.sh -s 1A051FDF6007PA 540 960
scripts/launch-app.sh -s 1A051FDF6007PA chrome
Important: Be consistent - use the same serial for all commands in a session.
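One way to keep the serial consistent is to set it once and route every call through a small wrapper. This is a sketch under assumptions: `dev`, `SERIAL`, and `SCRIPTS_DIR` are hypothetical names introduced here, not part of the skill's scripts.

```shell
#!/usr/bin/env bash
# Sketch only: run any script from scripts/ with one session-wide serial.
# SERIAL and SCRIPTS_DIR are illustrative names, not defined by the skill.
SERIAL="${SERIAL:-}"
SCRIPTS_DIR="${SCRIPTS_DIR:-scripts}"

dev() {
  # Usage: dev <script-name> [args...]
  local script="$1"
  shift
  if [ -n "$SERIAL" ]; then
    # A serial is set: pass it on every call.
    bash "$SCRIPTS_DIR/$script" -s "$SERIAL" "$@"
  else
    # No serial set: rely on the single connected device.
    bash "$SCRIPTS_DIR/$script" "$@"
  fi
}

# Example session (requires the real scripts and a connected device):
#   SERIAL=1A051FDF6007PA
#   dev get-screen.sh
#   dev tap.sh 540 960
```

Because the serial lives in one variable, switching devices mid-session requires an explicit reassignment rather than happening by accident.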
When given a task, follow this perception-action loop:
1. Run scripts/check-device.sh first
2. Run scripts/get-screen.sh [-s serial] to dump the UI hierarchy
3. Decide on the next action and run the matching script, with -s serial if needed

The get-screen.sh script outputs Android's accessibility XML. Key attributes to look for:
<node index="0" text="Settings" resource-id="com.android.settings:id/title"
class="android.widget.TextView" content-desc=""
bounds="[42,234][1038,345]" clickable="true" />
Important attributes:
- text - Visible text on the element
- content-desc - Accessibility description (useful for icons)
- resource-id - Unique identifier for the element
- bounds - Screen coordinates as [left,top][right,bottom]
- clickable - Whether element responds to taps
- scrollable - Whether element can be scrolled
- focused - Whether element has input focus

Calculating tap coordinates:
From bounds="[left,top][right,bottom]", calculate center:
x = (left + right) / 2
y = (top + bottom) / 2
Example: bounds="[42,234][1038,345]" → tap at x=540, y=289 (289.5 rounded down)
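The center calculation can be scripted so the arithmetic is not done by hand each time. The helper below is a minimal sketch; `bounds_center` is a hypothetical name, and it assumes the bounds string has exactly the `[left,top][right,bottom]` shape with non-negative integers. Shell integer division rounds down, which matches the example above.

```shell
#!/usr/bin/env bash
# Sketch only: compute the tap point for a bounds attribute value.

bounds_center() {
  # Usage: bounds_center '[left,top][right,bottom]'
  # Prints "x y"; returns 1 if the string does not have the expected shape.
  local nums
  nums=$(printf '%s\n' "$1" |
    sed -n 's/^\[\([0-9][0-9]*\),\([0-9][0-9]*\)]\[\([0-9][0-9]*\),\([0-9][0-9]*\)]$/\1 \2 \3 \4/p')
  [ -n "$nums" ] || return 1
  # Word splitting is intended: nums holds "left top right bottom".
  set -- $nums
  echo "$(( ($1 + $3) / 2 )) $(( ($2 + $4) / 2 ))"
}

bounds_center '[42,234][1038,345]'   # prints "540 289"
```

The output can be passed straight to the tap script, for example `scripts/tap.sh $(bounds_center '[42,234][1038,345]')`.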
All scripts are in the scripts/ directory. Run them via bash.
All scripts support -s <serial> to target a specific device.
| Script | Args | Description |
|---|---|---|
| check-device.sh | [-s serial] | List devices / verify connection |
| wake.sh | [-s serial] | Wake device and dismiss lock screen |
| screenshot.sh | [-s serial] | Capture screen image |

| Script | Args | Description |
|---|---|---|
| get-screen.sh | [-s serial] | Dump UI accessibility tree |

| Script | Args | Description |
|---|---|---|
| tap.sh | [-s serial] x y | Tap at coordinates |
| type-text.sh | [-s serial] "text" | Type text string |
| swipe.sh | [-s serial] direction | Swipe up/down/left/right |
| key.sh | [-s serial] keyname | Press key (home/back/enter/recent) |

| Script | Args | Description |
|---|---|---|
| launch-app.sh | [-s serial] package_or_name | Launch app by package or search by name |
| install-apk.sh | [-s serial] path/to/file.apk | Install APK to device |
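The tables describe the scripts only by their arguments; their source is not shown here. As a rough illustration of how key names and swipe directions could map onto `adb shell input`, the sketch below uses hypothetical helpers (`keycode_for`, `swipe_coords`, `send_key`, `send_swipe`). The key-name mapping and the choice to swipe between the 1/4 and 3/4 marks of the screen are assumptions, not the behaviour of key.sh or swipe.sh.

```shell
#!/usr/bin/env bash
# Sketch only: one possible translation of key names and swipe directions
# into `adb shell input` arguments. Not taken from the skill's scripts.

keycode_for() {
  # Assumed mapping from the key names in the table to Android key codes.
  case "$1" in
    home)   echo KEYCODE_HOME ;;
    back)   echo KEYCODE_BACK ;;
    enter)  echo KEYCODE_ENTER ;;
    recent) echo KEYCODE_APP_SWITCH ;;
    *)      echo "unknown key: $1" >&2; return 1 ;;
  esac
}

swipe_coords() {
  # Usage: swipe_coords <direction> <width> <height>
  # Prints "x1 y1 x2 y2". "up" moves the finger from the lower part of the
  # screen toward the top, which scrolls the content down.
  local dir="$1" w="$2" h="$3"
  local cx=$(( w / 2 )) cy=$(( h / 2 ))
  case "$dir" in
    up)    echo "$cx $(( h * 3 / 4 )) $cx $(( h / 4 ))" ;;
    down)  echo "$cx $(( h / 4 )) $cx $(( h * 3 / 4 ))" ;;
    left)  echo "$(( w * 3 / 4 )) $cy $(( w / 4 )) $cy" ;;
    right) echo "$(( w / 4 )) $cy $(( w * 3 / 4 )) $cy" ;;
    *)     echo "unknown direction: $dir" >&2; return 1 ;;
  esac
}

send_key() {
  # Usage: send_key <serial> <keyname>
  local code
  code=$(keycode_for "$2") || return 1
  adb -s "$1" shell input keyevent "$code"
}

send_swipe() {
  # Usage: send_swipe <serial> <direction> <width> <height>
  local coords
  coords=$(swipe_coords "$2" "$3" "$4") || return 1
  # Word splitting is intended: coords holds four integers.
  adb -s "$1" shell input swipe $coords
}
```

The screen width and height are passed in explicitly here; on a real device they could be read from `adb shell wm size`.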
Elements to tap: clickable="true"
Elements to type into: focused="true" or class="android.widget.EditText"
Swipe directions: up (scroll down), down (scroll up), left, right
Keys:
- home - Return to home screen
- back - Go back / close dialogs
- enter - Submit forms / confirm
- recent - Open recent apps

get-screen.sh returns empty or minimal XML

# By package name (fastest)
scripts/launch-app.sh com.android.chrome
# By app name (searches installed apps)
scripts/launch-app.sh "Chrome"
scripts/get-screen.sh
scripts/tap.sh 540 289
scripts/type-text.sh "your text here"
scripts/key.sh enter
scripts/swipe.sh up
scripts/key.sh back

Run adb devices manually to troubleshoot.

User request: "Open Chrome and search for weather"
1. scripts/check-device.sh
→ Device connected: Pixel 6
→ Serial: 1A051FDF6007PA
→ Type: Physical
2. scripts/launch-app.sh com.android.chrome
→ Chrome launched
3. scripts/get-screen.sh
→ [Read XML, find search/URL bar]
→ Found: bounds="[0,141][1080,228]" resource-id="com.android.chrome:id/url_bar"
→ Center: x=540, y=184
4. scripts/tap.sh 540 184
→ Tapped URL bar
5. scripts/get-screen.sh
→ [Verify keyboard appeared and field is focused]
6. scripts/type-text.sh "weather"
→ Typed "weather"
7. scripts/key.sh enter
→ Pressed enter to search
8. scripts/get-screen.sh
→ [Verify search results loaded]
→ Task complete!
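Step 3 of this walkthrough ("Read XML, find search/URL bar") can be partly mechanised when the resource-id is known. The sketch below pulls the bounds attribute for a given resource-id out of an XML dump on stdin. `bounds_for_id` is a hypothetical helper, and it assumes each node carries its attributes inside a single tag, as in the sample node shown earlier; it is a text-level shortcut, not a full XML parser.

```shell
#!/usr/bin/env bash
# Sketch only: find the bounds of the first node with a given resource-id.

bounds_for_id() {
  # Usage: bounds_for_id <resource-id> < dump.xml
  # Joins all lines, then splits on '<' so each tag sits on its own line,
  # whether the dump was a single line or pretty-printed.
  tr '\n' ' ' | tr '<' '\n' |
    grep -F "resource-id=\"$1\"" | head -n 1 |
    sed -n 's/.*bounds="\([^"]*\)".*/\1/p'
}

# Example with a live device (requires the skill's scripts):
#   scripts/get-screen.sh | bounds_for_id com.android.chrome:id/url_bar
```

The resulting string has the `[left,top][right,bottom]` form, so the tap point follows from the center formula: `[0,141][1080,228]` gives x=540, y=184, as in step 3.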
User request: "Open Settings on my phone" (with emulator also running)
1. scripts/check-device.sh
→ Multiple devices connected (2):
→ [PHYSICAL] 1A051FDF6007PA - Pixel 6
→ [EMULATOR] emulator-5554 - sdk_gphone64_arm64
User said "my phone" → target the PHYSICAL device
Serial to use: 1A051FDF6007PA
2. scripts/check-device.sh -s 1A051FDF6007PA
→ Device connected: Pixel 6
→ Serial: 1A051FDF6007PA
→ Type: Physical
→ Status: Ready
3. scripts/launch-app.sh -s 1A051FDF6007PA settings
→ Resolved 'settings' to package: com.android.settings
→ Launched: com.android.settings
4. scripts/get-screen.sh -s 1A051FDF6007PA
→ [Read XML, verify Settings app is open]
→ Task complete!
When multiple devices are connected, use -s <serial> on ALL commands.