Control the downstream remote host connected to NanoKVM through the NanoKVM Bridge API when a PicoClaw session needs screenshots, mouse actions, keyboard input, hotkeys, scrolling, or drag operations. Invoke only ./scripts/nanokvm-bridge.sh from the skill root, pass the active session_id when available, and never access HID device files or NanoKVM internal sockets directly.
Use this skill when you need to inspect or operate the downstream remote host desktop that NanoKVM exposes over HDMI and USB HID. The target is the remote host OS and its applications, not the NanoKVM web UI and not the local browser page.
Fallback skill. Prefer the kvm_screenshot and kvm_actions MCP tools for remote GUI work. Use this shell-based skill only when MCP tools are unavailable.
- Invoke only ./scripts/nanokvm-bridge.sh. Prefer ./scripts/nanokvm-bridge.sh ... over scripts/..., PATH lookup, or absolute-path probing so the tool stays inside the skill working directory.
- Never access /dev/hidg*, /tmp/hid*, /api/ws, or any NanoKVM internal HID files directly.
- Pass the active session through --session-id, AI_SESSION_ID, PICO_SESSION_ID, or SESSION_ID. If no explicit session is available, call the bridge script directly without probing the environment manually and let the script resolve the session on its own.
- Run commands from the kvm-control skill root. Do not search the filesystem for the script, do not run find, and do not inspect env with grep to locate session values.
- Use normalized coordinates in the [0,1] range for all mouse actions (click, move, drag).

Use these templates for common sequences. Choose mouse or keyboard based on what is visible and simplest.
./scripts/nanokvm-bridge.sh click --x <X> --y <Y> --button left
./scripts/nanokvm-bridge.sh double-click --x <X> --y <Y>
./scripts/nanokvm-bridge.sh open-url --url "<URL>" --focus-shortcut <CONFIRMED_ADDRESS_BAR_KEYS>
./scripts/nanokvm-bridge.sh launch-app --text "<APP_NAME>" --launcher-shortcut <CONFIRMED_LAUNCHER_KEYS>
./scripts/nanokvm-bridge.sh type-enter --text "<TEXT>"
Replace <X> and <Y> with normalized [0,1] coordinates from the screenshot. Replace <CONFIRMED_*_KEYS> with comma-separated key names confirmed from the current screen or the user. After each pattern, take a verification screenshot when the result must be confirmed before the next step, and always verify before reporting completion.
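Turning a pixel position read off a screenshot into a normalized coordinate is simple division by the frame dimension. The sketch below is one way to do that. `norm_coord` is a hypothetical helper, not part of the bridge script, and the 1920x1080 frame size is an assumed example. Divide by the dimensions of the image you actually measured on, so that a downscaled screenshot still yields the same ratio.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nanokvm-bridge.sh): convert a pixel
# position measured on a screenshot into a normalized [0,1] coordinate.
norm_coord() {
  local pixel="$1" size="$2"
  awk -v p="$pixel" -v s="$size" 'BEGIN {
    if (s <= 0) exit 1          # reject a zero or negative dimension
    v = p / s
    if (v < 0) v = 0            # clamp into the [0,1] range
    if (v > 1) v = 1
    printf "%.4f\n", v
  }'
}

# Assumed example: a target at pixel (806, 335) on a 1920x1080 frame.
x="$(norm_coord 806 1920)"
y="$(norm_coord 335 1080)"

# Print the resulting command instead of running it, so the sketch is
# safe to try outside the skill root.
echo "./scripts/nanokvm-bridge.sh click --x $x --y $y --button left"
```

The helper clamps out-of-range input rather than failing, which keeps a slightly off measurement from producing a coordinate outside [0,1].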
./scripts/nanokvm-bridge.sh screenshot --format base64
./scripts/nanokvm-bridge.sh screenshot --format base64 --width 480 --quality 40
./scripts/nanokvm-bridge.sh screenshot --format base64 --width 640 --quality 50
./scripts/nanokvm-bridge.sh click --x 0.42 --y 0.31 --button left
./scripts/nanokvm-bridge.sh double-click --x 0.42 --y 0.31
./scripts/nanokvm-bridge.sh move --x 0.42 --y 0.31
./scripts/nanokvm-bridge.sh type --text "hello world"
./scripts/nanokvm-bridge.sh type-enter --text "hello world"
./scripts/nanokvm-bridge.sh hotkey --keys <KEY1>,<KEY2>
./scripts/nanokvm-bridge.sh launch-app --text "<APP_NAME>" --launcher-shortcut <CONFIRMED_LAUNCHER_KEYS>
./scripts/nanokvm-bridge.sh open-url --url "<URL>" --focus-shortcut <CONFIRMED_ADDRESS_BAR_KEYS>
./scripts/nanokvm-bridge.sh scroll --direction down --amount 3
./scripts/nanokvm-bridge.sh drag --from-x 0.20 --from-y 0.40 --to-x 0.80 --to-y 0.40
./scripts/nanokvm-bridge.sh wait --duration-ms 800
./scripts/nanokvm-bridge.sh actions-json '{"actions":[{"action":"move","x":0.42,"y":0.31},{"action":"click","x":0.42,"y":0.31,"button":"left"}]}'
- screenshot --format base64 returns the raw NanoKVM JSON response.
- screenshot captures the HDMI frame seen by NanoKVM, which should correspond to the downstream remote host.
- For routine screenshots, prefer --width 480 --quality 40. Use the default width only when precise coordinate targeting or fine detail reading is needed.
- type is the safest way to enter normal text.
- hotkey --keys expects a comma-separated list. Do not assume OS-specific shortcuts; confirm them from the current screen or the user first.
- Use double-click for opening obvious targets.
- Use type-enter for search boxes, dialogs, and prompts.
- Use launch-app only after the launcher shortcut is confirmed from the current screen or the user.
- Use open-url only when a browser is already focused and the address bar shortcut is confirmed from the current screen or the user.
- actions-json accepts a full NanoKVM request body. Use it for deterministic multi-step sequences to minimize round-trips.
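For multi-step sequences, the request body can be assembled in the shell before handing it to actions-json. The sketch below builds a move-then-click body in the same shape as the actions-json example in the command list. `move_click_payload` is a hypothetical helper, not part of the bridge script, and it performs no validation of its arguments.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nanokvm-bridge.sh): build a
# move-then-click request body. Field names mirror the actions-json
# example in the command list; nothing else about the API is assumed.
move_click_payload() {
  local x="$1" y="$2" button="${3:-left}"   # button defaults to left
  printf '{"actions":[{"action":"move","x":%s,"y":%s},{"action":"click","x":%s,"y":%s,"button":"%s"}]}\n' \
    "$x" "$y" "$x" "$y" "$button"
}

payload="$(move_click_payload 0.42 0.31)"

# Print the command instead of running it, so the sketch is safe to
# try outside the skill root.
echo "./scripts/nanokvm-bridge.sh actions-json '$payload'"
```

Single-quoting the payload on the command line keeps the shell from interpreting the double quotes inside the JSON.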