Control Windows applications using UiAutomationGRPC.Server with the "See → Think → Act" loop for efficient LLM-driven UI automation.
This skill enables you to control Windows desktop applications through a gRPC-based automation server. It uses Approach 2: App Structure (LLM-Friendly) for efficient "See → Think → Act" loops.
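The examples in this document assume the server is reachable at localhost:50051.
The server supports two security modes: plain HTTP and HTTPS (TLS). Configure grpccurl to match the mode the server is running in.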
When the server is running with Security.Enabled: false (default for development):
# Use -plaintext flag for HTTP connections
grpccurl -plaintext -d '{"app_name": "calc"}' localhost:50051 UiAutomation.UiAutomationService/GetAppStructure
When the server is running with Security.Enabled: true:
# Use HTTPS (no -plaintext flag) + TLS certificate verification
grpccurl -cacert server.crt -d '{"app_name": "calc"}' localhost:50051 UiAutomation.UiAutomationService/GetAppStructure
# Or skip certificate verification (not recommended for production)
grpccurl -insecure -d '{"app_name": "calc"}' localhost:50051 UiAutomation.UiAutomationService/GetAppStructure
When the server has Security.TokenAuthEnabled: true, include the authorization header:
# Add -H for Authorization header with Bearer token
grpccurl -plaintext \
-H "Authorization: Bearer YOUR_SECRET_TOKEN" \
-d '{"app_name": "calc"}' \
localhost:50051 UiAutomation.UiAutomationService/GetAppStructure
Combined Secure + Token Auth:
grpccurl -insecure \
-H "Authorization: Bearer YOUR_SECRET_TOKEN" \
-d '{"app_name": "calc"}' \
localhost:50051 UiAutomation.UiAutomationService/GetAppStructure
Note: If you receive `Unauthenticated` errors, verify:
- The token matches one in the server's `Security.ValidTokens` array
- The header format is exactly `Authorization: Bearer <token>`
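The flag combinations above can be collected in one place. The following is a minimal sketch, not part of the server or of grpccurl: the function name `build_command` and its parameters are hypothetical, and it only assembles the argument list shown in the examples (the `grpccurl` executable name, `-plaintext`, `-cacert`, `-insecure`, `-H`, `-d`).

```python
from typing import List, Optional

# Service name as it appears in the commands in this document.
SERVICE = "UiAutomation.UiAutomationService"


def build_command(
    method: str,
    payload_json: str,
    *,
    tls: bool = False,
    cacert: Optional[str] = None,
    token: Optional[str] = None,
    address: str = "localhost:50051",
) -> List[str]:
    """Assemble a grpccurl argument list (hypothetical helper)."""
    args = ["grpccurl"]
    if not tls:
        # Security.Enabled: false -> plain HTTP
        args.append("-plaintext")
    elif cacert:
        # Security.Enabled: true, verify against the given certificate
        args += ["-cacert", cacert]
    else:
        # Security.Enabled: true, skip verification (not for production)
        args.append("-insecure")
    if token:
        # Security.TokenAuthEnabled: true
        args += ["-H", f"Authorization: Bearer {token}"]
    args += ["-d", payload_json, address, f"{SERVICE}/{method}"]
    return args
```

The returned list can be passed to `subprocess.run(args, capture_output=True, text=True)`, which avoids shell quoting problems with the JSON payload.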
┌─────────────────────────────────────────────────────────┐
│ 1. SEE → GetAppStructure (get full UI as JSON) │
│ 2. THINK → Analyze JSON, find target element UniqId │
│ 3. ACT → PerformActionWithStructure (action + new UI)│
│ 4. REPEAT → Response includes updated UI, continue │
└─────────────────────────────────────────────────────────┘
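As a rough illustration of the loop in the diagram, the sketch below separates the three roles into injected callables. All names (`run_loop`, `see`, `think`, `act`, `max_steps`) are hypothetical; in practice `see` would wrap a GetAppStructure call and `act` a PerformActionWithStructure call, while `think` is where the LLM picks the next element.

```python
from typing import Any, Callable, Optional, Tuple

Decision = Optional[Tuple[str, int]]  # (UniqId, action code), or None to stop


def run_loop(
    see: Callable[[], Any],
    think: Callable[[Any], Decision],
    act: Callable[[str, int], Any],
    max_steps: int = 10,
) -> Tuple[Any, int]:
    """Run See -> Think -> Act until think() returns None or max_steps is hit."""
    structure = see()  # 1. SEE: initial UI structure
    steps = 0
    while steps < max_steps:
        decision = think(structure)  # 2. THINK: choose target and action
        if decision is None:
            break
        uniq_id, action = decision
        # 3. ACT: the response already carries the updated structure,
        # so there is no need to call see() again (4. REPEAT).
        structure = act(uniq_id, action)
        steps += 1
    return structure, steps
```

The `max_steps` cap is a design choice of this sketch, intended to keep a confused planner from clicking indefinitely.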
Retrieve the complete UI tree of an application.
grpccurl -plaintext -d '{"app_name": "calc"}' localhost:50051 UiAutomation.UiAutomationService/GetAppStructure
Parameters:
| Parameter | Description |
|---|---|
| `app_name` | Process name (e.g., "calc", "notepad") |
| `process_id` | Alternative: use PID instead |
| `use_process_id` | Set true to use PID lookup |
Returns: json_structure containing the UI tree.
{
"UniqId": "42,12345", // ← Use this for actions
"Name": "Five", // Display name
"UiAutomationId": "num5Button", // Stable identifier
"ControlType": "ControlType.Button",
"BoundingRectangle": "x,y,w,h",
"IsClickable": true,
"IsVisible": true,
"Children": [ ... ]
}
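The "Think" step usually comes down to walking this tree and picking out a `UniqId`. Here is a minimal sketch, assuming the structure has already been parsed into a Python dict with the field names shown above; the helper names and the sample tree values are hypothetical.

```python
from typing import Any, Dict, Iterator, Optional

Element = Dict[str, Any]


def iter_elements(node: Element) -> Iterator[Element]:
    """Yield the node and all of its descendants, depth-first."""
    yield node
    for child in node.get("Children") or []:
        yield from iter_elements(child)


def find_uniq_id(
    tree: Element,
    name: Optional[str] = None,
    automation_id: Optional[str] = None,
) -> Optional[str]:
    """Return the UniqId of the first element matching all given criteria."""
    if name is None and automation_id is None:
        raise ValueError("provide name and/or automation_id")
    for element in iter_elements(tree):
        if automation_id is not None and element.get("UiAutomationId") != automation_id:
            continue
        if name is not None and element.get("Name") != name:
            continue
        return element.get("UniqId")
    return None


# Hypothetical tree for illustration only; real UniqIds differ per run.
sample_tree = {
    "UniqId": "42,100", "Name": "Calculator", "UiAutomationId": "", "Children": [
        {"UniqId": "42,12345", "Name": "Five", "UiAutomationId": "num5Button", "Children": []},
        {"UniqId": "42,12346", "Name": "Nine", "UiAutomationId": "num9Button", "Children": []},
    ],
}
```

Matching on `UiAutomationId` is generally preferable when it is present, since the structure describes it as the stable identifier, whereas `Name` is a display string.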
This is the key method for the loop: it performs an action and returns the updated UI structure in the same response.
grpccurl -plaintext -d '{"runtime_id": "YOUR_UNIQ_ID", "action": 9}' localhost:50051 UiAutomation.UiAutomationService/PerformActionWithStructure
Parameters:
| Parameter | Description |
|---|---|
| `runtime_id` | The UniqId from the JSON |
| `action` | Action code (see table below) |
| `arguments` | Optional string array |
| Action | Code | Use Case | Arguments |
|---|---|---|---|
| INVOKE | 0 | Default action (buttons) | - |
| TOGGLE | 1 | Checkboxes, switches | - |
| SET_VALUE | 4 | Type text | ["text"] |
| SET_FOCUS | 5 | Focus element | - |
| MoveTo | 8 | Move mouse to element | - |
| LeftClick | 9 | Click (recommended) | - |
| RightClick | 10 | Right-click | - |
| DoubleClick | 17 | Double-click | - |
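To avoid bare numbers in scripts, the codes in the table can be wrapped in an enum. This is a sketch only: the Python member names are chosen here for readability and are not necessarily the names used in the server's proto definition; the numeric values and the `runtime_id`, `action`, and `arguments` fields are taken from the tables above.

```python
import json
from enum import IntEnum
from typing import Optional, Sequence


class Action(IntEnum):
    """Action codes from the table above (member names are illustrative)."""
    INVOKE = 0
    TOGGLE = 1
    SET_VALUE = 4
    SET_FOCUS = 5
    MOVE_TO = 8
    LEFT_CLICK = 9
    RIGHT_CLICK = 10
    DOUBLE_CLICK = 17


def action_payload(
    uniq_id: str,
    action: Action,
    arguments: Optional[Sequence[str]] = None,
) -> str:
    """Build the JSON body for PerformActionWithStructure."""
    body = {"runtime_id": uniq_id, "action": int(action)}
    if arguments:
        # Only SET_VALUE takes arguments according to the table.
        body["arguments"] = list(arguments)
    return json.dumps(body)
```

For example, `action_payload("42,12345", Action.SET_VALUE, ["hello"])` produces a body suitable for the `-d` flag.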
# 1. Open calculator (if not running)
grpccurl -plaintext -d '{"app_name": "calc"}' localhost:50051 UiAutomation.UiAutomationService/OpenApp
# 2. Get structure → Find "Nine" button UniqId
grpccurl -plaintext -d '{"app_name": "calc"}' localhost:50051 UiAutomation.UiAutomationService/GetAppStructure
# 3. Click "9" → Returns updated structure
grpccurl -plaintext -d '{"runtime_id": "42,xxx", "action": 9}' localhost:50051 UiAutomation.UiAutomationService/PerformActionWithStructure
# 4. Click "×" → Find multiply button, click it
grpccurl -plaintext -d '{"runtime_id": "42,yyy", "action": 9}' localhost:50051 UiAutomation.UiAutomationService/PerformActionWithStructure
# 5. Click "9" again
grpccurl -plaintext -d '{"runtime_id": "42,xxx", "action": 9}' localhost:50051 UiAutomation.UiAutomationService/PerformActionWithStructure
# 6. Click "=" → Get result from display element
grpccurl -plaintext -d '{"runtime_id": "42,zzz", "action": 9}' localhost:50051 UiAutomation.UiAutomationService/PerformActionWithStructure
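Reading the result in step 6 means unpacking the response and locating the display element. The sketch below rests on several assumptions: that the response is a JSON object whose `json_structure` field (possibly rendered as `jsonStructure`, depending on how the client formats field names) holds the tree either as a JSON string or as a nested object, and that the display element can be identified by its `UiAutomationId`. The id `resultDisplay` used in the usage note is hypothetical; inspect the real structure to find the actual one.

```python
import json
from typing import Any, Dict, Optional


def parse_structure(response_text: str) -> Dict[str, Any]:
    """Extract the UI tree from a response body (field naming is assumed)."""
    envelope = json.loads(response_text)
    raw = envelope.get("json_structure", envelope.get("jsonStructure"))
    if raw is None:
        raise KeyError("response has no json_structure field")
    # The tree may arrive as an embedded JSON string or as a nested object.
    return json.loads(raw) if isinstance(raw, str) else raw


def element_name(tree: Dict[str, Any], automation_id: str) -> Optional[str]:
    """Return the Name of the first element with the given UiAutomationId."""
    stack = [tree]
    while stack:
        node = stack.pop()
        if node.get("UiAutomationId") == automation_id:
            return node.get("Name")
        stack.extend(node.get("Children") or [])
    return None
```

With the output of the last command in `response_text`, `element_name(parse_structure(response_text), "resultDisplay")` would return the display's `Name`, which is where a calculator-style app typically exposes its current value.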
| Method | Description |
|---|---|
| `OpenApp` | Launch application by path/name |
| `CloseApp` | Close by process name |
| `CloseAppByProcessId` | Close by PID |
| `SendKeys` | Send keyboard input |
| `TakeScreenshot` | Capture window/element |
| `ClearCache` | Clear element cache: all, by process_id, or by app_name |
Tip: The server now validates cached elements on every access and automatically re-finds dead elements. For dynamic UIs (blotters, live data), `GetAppStructure` automatically flushes stale cache entries before rebuilding. You can also call `ClearCache` with `process_id` or `app_name` to clear only a specific application's cache, or without arguments to clear everything.
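The three ClearCache scopes described in the tip map onto three request bodies. A minimal sketch follows, assuming the request fields are named `process_id` and `app_name` as in the text; the helper name is hypothetical, and rejecting both arguments together is a choice made here, not documented server behavior.

```python
import json
from typing import Optional


def clear_cache_payload(
    process_id: Optional[int] = None,
    app_name: Optional[str] = None,
) -> str:
    """Build the JSON body for ClearCache: one app by PID, by name, or everything."""
    if process_id is not None and app_name is not None:
        # Assumption of this sketch: treat the two selectors as mutually exclusive.
        raise ValueError("pass process_id or app_name, not both")
    if process_id is not None:
        return json.dumps({"process_id": process_id})
    if app_name is not None:
        return json.dumps({"app_name": app_name})
    return json.dumps({})  # no arguments: clear the whole cache
```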
| Issue | Solution |
|---|---|
| App not found | Ensure app is running; use OpenApp first |
| Element not found | UniqIds are runtime-specific; re-fetch structure |
| Access denied | Run server with Admin privileges |
| Server unreachable | Verify UiAutomationGRPC.Server is running on 50051 |
| SSL connection error | Server uses HTTP; add -plaintext flag to grpccurl |
| Certificate error | Use -insecure flag or provide valid -cacert |
| `Unauthenticated` | Token auth enabled; add -H "Authorization: Bearer TOKEN" |
| `Invalid token` | Verify token is in server's Security.ValidTokens list |