From c6a19d19411f82a8f600966d56616384a5419b9f Mon Sep 17 00:00:00 2001
From: factory-ain3sh
Date: Wed, 10 Jun 2026 01:53:58 -0700
Subject: [PATCH] docs(cli): cover desktop-control driver on the droid-control page (CLI-900)

factory-plugins#27 adds a fourth automation backend (native desktop GUI apps via trycua/cua cua-driver) to the droid-control plugin. Update the feature page to match: fourth driver card, atom count 10 -> 11, install prerequisites accordion (incl. Windows install.ps1 and macOS permissions grant), and desktop/cua-driver keywords for search.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
---
 docs/cli/features/droid-control.mdx | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/docs/cli/features/droid-control.mdx b/docs/cli/features/droid-control.mdx
index c9d3e96bc..4ce31dcc1 100644
--- a/docs/cli/features/droid-control.mdx
+++ b/docs/cli/features/droid-control.mdx
@@ -2,7 +2,7 @@
 title: Droid Control
 sidebarTitle: Droid Control
 description: Terminal, browser, and desktop automation. Record demos, verify behavior claims, and run QA flows.
-keywords: ['droid-control', 'automation', 'demo', 'verify', 'qa-test', 'terminal', 'browser', 'recording', 'video', 'showcase', 'tuistory', 'remotion', 'tctl', 'plugin']
+keywords: ['droid-control', 'automation', 'demo', 'verify', 'qa-test', 'terminal', 'browser', 'desktop', 'desktop-control', 'cua-driver', 'computer-use', 'recording', 'video', 'showcase', 'tuistory', 'remotion', 'tctl', 'plugin']
 ---
 
 Droid Control lets Droids *operate* software: launch apps, type commands, click buttons, record what happens, and produce polished video evidence of it. Built by Droids, for Droids.
@@ -192,9 +192,9 @@ Every video below was planned, recorded, and rendered entirely by a Droid.
 
 ## Automation drivers
 
-Droid Control supports three automation backends. The right one is selected automatically based on what you're targeting.
+Droid Control supports four automation backends. The right one is selected automatically based on what you're targeting.
 
-
+
     **Virtual PTY automation.** Default for terminal work. Playwright-style CLI with asciinema recording and forced truecolor output.
@@ -204,6 +204,9 @@ Droid Control supports three automation backends. The right one is selected auto
     **Web and Electron apps.** Playwright-backed CLI with Chrome DevTools Protocol support. Navigates pages, fills forms, clicks buttons, captures screenshots.
+
+    **Native desktop apps.** Accessibility-tree snapshots, element or pixel actions, and per-window screenshots via [trycua/cua](https://github.com/trycua/cua) `cua-driver` -- all without stealing focus. macOS and Windows are production tier; Linux is pre-release.
+
 
 ## Video rendering
@@ -241,7 +244,7 @@ Demo and showcase videos are rendered with [Remotion](https://www.remotion.dev/)
 The plugin uses a composition architecture with three layers:
 
 - **Orchestrator** -- Routes each request through three independent lookups (target, stage, artifact) to determine which skills to load.
-- **10 atom skills** -- Self-contained background knowledge loaded on demand, split into drivers, targets, stages, and polish.
+- **11 atom skills** -- Self-contained background knowledge loaded on demand, split into drivers, targets, stages, and polish.
 - **3 commands** -- Parse arguments into commitments, then delegate to atoms via hybrid handoffs.
 
 Every workflow flows through **capture → compose → verify**. Commands declare *what* to produce; atoms own *how*. Skills chain through explicit handoffs rather than hardcoded pipelines, so the droid follows the flow naturally.
@@ -268,6 +271,13 @@ Only install what you need for your use case.
     agent-browser install   # downloads Chromium
     ```
+
+    ```bash
+    curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/cua-driver/scripts/install.sh | bash
+    cua-driver skills install   # upstream skill pack (deep tool reference)
+    ```
+    Windows installs via PowerShell: `irm https://raw.githubusercontent.com/trycua/cua/main/libs/cua-driver/scripts/install.ps1 | iex`. macOS additionally needs `cua-driver permissions grant` (Accessibility + Screen Recording).
+
 
 | Platform | Required tools |
 |----------|----------------|