CLI Tools
Desktop Computer Automation
Vision-driven desktop automation using Midscene. Control your desktop (macOS, Windows, Linux) with natural-language commands.
It operates entirely from screenshots, so no DOM or accessibility labels are required. It can interact with any element visible on screen, regardless of technology stack.
Triggers: open app, press key, desktop, computer, click on screen, type text, screenshot desktop,
launch application, switch window, desktop automation, control computer, mouse click, keyboard shortcut,
screen capture, find on screen, read screen, verify window, close app, minimize window, maximize window
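The description does not say how these trigger phrases are evaluated, so the following is only an illustrative sketch. It assumes a plain case-insensitive substring check; `TRIGGERS` and `matched_triggers` are hypothetical names, not part of Midscene.

```python
# Trigger phrases copied from the list above.
TRIGGERS = [
    "open app", "press key", "desktop", "computer", "click on screen",
    "type text", "screenshot desktop", "launch application", "switch window",
    "desktop automation", "control computer", "mouse click",
    "keyboard shortcut", "screen capture", "find on screen", "read screen",
    "verify window", "close app", "minimize window", "maximize window",
]


def matched_triggers(request: str) -> list[str]:
    """Return the trigger phrases that occur in the request.

    Assumption: matching is a case-insensitive substring test. The real
    selection mechanism is not documented here and may differ.
    """
    text = request.lower()
    return [phrase for phrase in TRIGGERS if phrase in text]
```

Under this assumption, a request would be routed to desktop automation whenever `matched_triggers(request)` returns a non-empty list. Broad phrases such as "desktop" and "computer" match many requests, so a real dispatcher would probably need more context than substring matching.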
Powered by Midscene.js (https://midscenejs.com)