claw-use/claw-use-windows

Claw Use Windows

License: MIT · Python 3.10+

Turn any Windows PC into a programmable device for AI agents.
No admin. No remote desktop. Install → run → connect. Pure HTTP API.

Protocol-compatible with claw-use-android.

Quick Start

Windows (server)

# Requires Python 3.10+
pip install -r requirements.txt
python -m cuw_server

The server prints your LAN IP and auth token on startup.

Agent side (Linux / macOS)

# Install CLI
cp cli/cuw /usr/local/bin/ && chmod +x /usr/local/bin/cuw

# Register the Windows machine
cuw add mypc 192.168.x.x <token>

# Use it
cuw screen -c           # read UI tree
cuw act '{"click": 3}'  # click ref 3
cuw screenshot          # take screenshot

API

All endpoints except /ping require an X-Bridge-Token header.

| Endpoint | Method | Description |
| --- | --- | --- |
| /ping | GET | Health check (no auth) |
| /info | GET | Hostname, OS, screen, battery, CPU |
| /screen | GET | UI Automation tree with refs (?compact=true) |
| /screenshot | GET | JPEG screenshot (?quality=75&maxWidth=1280) |
| /act | POST | Unified action: click, tap, type, key, scroll, launch |
| /flow | POST | Scripted automation (wait+then steps, zero LLM) |
| /clipboard | GET/POST | Read/write clipboard |
| /apps | GET | List installed apps |
| /apps/launch | POST | Launch app by name or path |
| /files/list | GET | List directory |
| /files/read | GET | Read file content |
| /tts/speak | POST | Text-to-speech (Windows SAPI) |
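
For agents that talk to the API directly rather than through the cuw CLI, the following is a minimal Python sketch of how a request might be assembled. It assumes the server listens on port 7333 (the port shown in the architecture diagram) and accepts JSON bodies for POST endpoints. The helper names `build_request` and `call` are hypothetical and not part of CUW, and response formats are not documented here, so the raw bytes are returned as-is.

```python
import json
import urllib.parse
import urllib.request

PORT = 7333  # assumption: port taken from the architecture diagram


def build_request(host, path, token=None, params=None, body=None):
    """Build (but do not send) a request for a CUW endpoint.

    host, path, token, params and body are illustrative parameters;
    pass token=None only for /ping, which needs no auth.
    """
    url = f"http://{host}:{PORT}{path}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    method = "POST" if data is not None else "GET"
    req = urllib.request.Request(url, data=data, method=method)
    if token is not None:
        req.add_header("X-Bridge-Token", token)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def call(req, timeout=10.0):
    """Send a prepared request and return the raw response bytes."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


# Example usage (host and token are placeholders):
#   call(build_request("192.168.1.50", "/ping"))
#   call(build_request("192.168.1.50", "/screen", token="<token>",
#                      params={"compact": "true"}))
#   jpeg = call(build_request("192.168.1.50", "/screenshot", token="<token>",
#                             params={"quality": 75, "maxWidth": 1280}))
```

Keeping request construction separate from sending makes it easy to inspect the URL and headers before anything touches the network.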

/act Examples

// Click by ref (from /screen)
{"click": 5}

// Click by text
{"click": "OK"}

// Type into focused field
{"type": "hello world"}

// Focus a field, then type
{"type": {"ref": 3, "text": "hello"}}

// Keyboard shortcut
{"hotkey": ["ctrl", "c"]}

// Scroll
{"scroll": "down"}

// Launch app
{"launch": "notepad"}

// Multiple actions in one call
{"click": 3, "type": "hello"}
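
The payloads above could be sent from Python roughly as follows. This is a sketch under stated assumptions: `act_request` and `run_actions` are hypothetical helpers, the base URL and token are placeholders, and the response body is returned as text because its format is not described here. The action dictionaries are copied from the examples above.

```python
import json
import urllib.request


def act_request(base_url, token, action):
    """Build a POST /act request carrying one action object as JSON."""
    body = json.dumps(action).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/act",
        data=body,
        method="POST",
        headers={
            "X-Bridge-Token": token,
            "Content-Type": "application/json",
        },
    )


def run_actions(base_url, token, actions, timeout=10.0):
    """Send one /act call per action, in list order, and collect replies."""
    replies = []
    for action in actions:
        req = act_request(base_url, token, action)
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            replies.append(resp.read().decode("utf-8"))
    return replies


# Payload shapes taken from the examples above.
ACTIONS = [
    {"launch": "notepad"},
    {"type": "hello world"},
    {"hotkey": ["ctrl", "c"]},
]

# Example usage (placeholders for address and token):
#   run_actions("http://192.168.1.50:7333", "<token>", ACTIONS)
```

Sending one action per call keeps the execution order explicit on the client side, since the order in which a multi-key body is processed is not stated here.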

How It Works

AI Agent (any framework)
    │ HTTP :7333
    ▼
┌──────────────────────────────────┐
│     Claw Use Windows (Python)    │
│                                  │
│  FastAPI + Uvicorn               │
│    → HTTP server + token auth    │
│                                  │
│  UI Automation (uiautomation)    │
│    → Read UI tree with refs      │
│    → InvokePattern, ValuePattern │
│                                  │
│  SendInput (ctypes)              │
│    → Mouse, keyboard, scroll     │
│    → Full Unicode / CJK support  │
│                                  │
│  mss + Pillow → Screenshots      │
│  Flow Engine → wait+then scripts │
└──────────────────────────────────┘

CUW reads screen content and interacts with controls through the Windows UI Automation API, the same API that screen readers use. Input simulation goes through SendInput for reliable keystroke and mouse delivery, including full Unicode (CJK) text.
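
To illustrate how Unicode text can be delivered through SendInput with ctypes, here is a minimal sketch. It is not CUW's actual implementation: the function names `unicode_key_events` and `send_text` are hypothetical, and the approach shown (one key-down and one key-up event per UTF-16 code unit, flagged with KEYEVENTF_UNICODE) is one common way to do it. The event list can be built on any platform; only `send_text` needs Windows.

```python
import ctypes
import struct

# Constants from the Win32 SendInput API.
INPUT_KEYBOARD = 1
KEYEVENTF_KEYUP = 0x0002
KEYEVENTF_UNICODE = 0x0004


class KEYBDINPUT(ctypes.Structure):
    _fields_ = [("wVk", ctypes.c_uint16), ("wScan", ctypes.c_uint16),
                ("dwFlags", ctypes.c_uint32), ("time", ctypes.c_uint32),
                ("dwExtraInfo", ctypes.c_size_t)]


class MOUSEINPUT(ctypes.Structure):
    # Included so the union (and therefore INPUT) has the size Windows expects.
    _fields_ = [("dx", ctypes.c_int32), ("dy", ctypes.c_int32),
                ("mouseData", ctypes.c_uint32), ("dwFlags", ctypes.c_uint32),
                ("time", ctypes.c_uint32), ("dwExtraInfo", ctypes.c_size_t)]


class _InputUnion(ctypes.Union):
    _fields_ = [("ki", KEYBDINPUT), ("mi", MOUSEINPUT)]


class INPUT(ctypes.Structure):
    _fields_ = [("type", ctypes.c_uint32), ("u", _InputUnion)]


def unicode_key_events(text: str) -> list[tuple[int, int]]:
    """Return (utf16_code_unit, flags) pairs: key-down then key-up per unit."""
    data = text.encode("utf-16-le")
    units = struct.unpack(f"<{len(data) // 2}H", data)
    events = []
    for unit in units:
        events.append((unit, KEYEVENTF_UNICODE))
        events.append((unit, KEYEVENTF_UNICODE | KEYEVENTF_KEYUP))
    return events


def send_text(text: str) -> int:
    """Windows only: inject text into the focused window via SendInput."""
    events = unicode_key_events(text)
    inputs = (INPUT * len(events))()
    for i, (unit, flags) in enumerate(events):
        inputs[i].type = INPUT_KEYBOARD
        inputs[i].u.ki = KEYBDINPUT(0, unit, flags, 0, 0)  # wVk must be 0
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    # Returns the number of events successfully inserted into the input stream.
    return user32.SendInput(len(events), inputs, ctypes.sizeof(INPUT))
```

Because the text is split into UTF-16 code units, characters outside the Basic Multilingual Plane are sent as a surrogate pair, which is why CJK text and emoji do not depend on the active keyboard layout.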

Windows ↔ Android Mapping

| Android (CUA) | Windows (CUW) |
| --- | --- |
| back | Escape |
| home | Win+D (show desktop) |
| recents | Alt+Tab |
| longpress | Right-click |
| AccessibilityService | UI Automation |
| /launch package | Start-Process / startfile |
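
An agent written against the Android action names could reuse the key-based rows of this mapping with a small lookup table. The sketch below is hypothetical and not part of CUW: it reuses the `hotkey` payload shape from the /act examples, and the key names "esc", "win", "alt" and "tab" are assumptions, since only "ctrl" and "c" appear in this README. The longpress row is left out because the payload for a right-click is not shown.

```python
# Hypothetical translation table: Android navigation action -> Windows keys.
# Key names are assumptions; check them against the server before relying on them.
ANDROID_TO_WINDOWS_KEYS = {
    "back": ["esc"],            # Escape
    "home": ["win", "d"],       # Win+D (show desktop)
    "recents": ["alt", "tab"],  # Alt+Tab
}


def translate(android_action: str) -> dict:
    """Return an /act-style hotkey payload for an Android navigation action."""
    keys = ANDROID_TO_WINDOWS_KEYS.get(android_action)
    if keys is None:
        raise KeyError(f"no Windows equivalent mapped for {android_action!r}")
    return {"hotkey": list(keys)}
```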

Family

| Platform | Repo | CLI | Status |
| --- | --- | --- | --- |
| Android | claw-use-android | cua | ✅ Released |
| Windows | claw-use-windows | cuw | ✅ v0.1.0 |
| macOS | claw-use-mac | cum | 🔮 Planned |
| Linux | claw-use-linux | cul | 🔮 Planned |

License

MIT
