Getting Started with Desktest

A guide to running your first automated desktop test.

1. Install

Pre-built binary (recommended):

curl -fsSL https://raw.githubusercontent.com/Edison-Watch/desktest/master/install.sh | sh

From source:

git clone https://github.com/Edison-Watch/desktest.git
cd desktest
make install_cli

Prerequisites:

- Docker daemon running (Docker Desktop, OrbStack, Colima, etc.)
- An LLM API key, set through one of these environment variables:
  - ANTHROPIC_API_KEY (for Anthropic/Claude models)
  - OPENAI_API_KEY (for OpenAI models)
  - OPENROUTER_API_KEY (for OpenRouter)
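
To confirm that at least one of these variables is visible before you continue, a small check like the one below may help. It is a sketch, not part of desktest: `configured_key_vars` is a hypothetical helper, and it only looks at the three variable names listed above.

```python
import os

# Variable names taken from the prerequisites list above.
KEY_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "OPENROUTER_API_KEY")


def configured_key_vars(env=None):
    """Return the names of the API key variables that are set and non-empty."""
    env = os.environ if env is None else env
    return [name for name in KEY_VARS if env.get(name, "").strip()]


if __name__ == "__main__":
    found = configured_key_vars()
    if found:
        print("API key variable(s) set:", ", ".join(found))
    else:
        print("No API key variable set; export one of:", ", ".join(KEY_VARS))
```

The script only reports whether a variable is present; it does not verify that the key is valid.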

For macOS app testing, see docs/macos-support.md. For Windows app testing, see dev-docs/windows-ci-guide.md.

2. Verify your setup

desktest doctor

This checks that Docker is accessible, your API key is configured, and all dependencies are in place. Fix any issues it reports before continuing.

3. Run an example test

The examples/ directory contains ready-to-run task files. Start with the simplest one, a gedit text-editing test:

desktest run examples/gedit-save.json --monitor

This will:

- Pull or build the desktest Docker image
- Start a container with an XFCE desktop
- Deploy the test app (gedit with a text file)
- Run the LLM-powered agent to complete the task
- Evaluate whether the task succeeded

Open http://localhost:7860 in your browser to watch the agent interact with the desktop in real time.

4. Review the results

After the test completes, inspect what happened:

# View the full trajectory in the terminal
desktest logs desktest_artifacts/

# View a compact summary
desktest logs desktest_artifacts/ --brief

# View specific steps
desktest logs desktest_artifacts/ --steps 1-3

# Or open an interactive HTML viewer in your browser
desktest review desktest_artifacts/

The desktest_artifacts/ directory contains screenshots, accessibility tree snapshots, and the full trajectory log (trajectory.jsonl).
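
If you want to post-process the log yourself, the `.jsonl` extension suggests a JSON Lines file, with one JSON value per line. The sketch below assumes only that; the fields inside each record are not documented in this guide, so it reports whichever top-level keys it finds rather than assuming any names. `load_trajectory` and `summarize` are hypothetical helpers, not part of desktest.

```python
import json
from pathlib import Path


def load_trajectory(path):
    """Read a JSON Lines file: one JSON value per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize(records):
    """Count records and collect the top-level keys seen in dict records."""
    keys = set()
    for record in records:
        if isinstance(record, dict):
            keys.update(record)
    return {"records": len(records), "keys": sorted(keys)}


if __name__ == "__main__":
    log = Path("desktest_artifacts") / "trajectory.jsonl"
    if log.exists():
        print(summarize(load_trajectory(log)))
    else:
        print(f"{log} not found; run a test first")
```

For everyday inspection, `desktest logs` and `desktest review` remain the simpler route.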

5. Write your own test

Create a task JSON file that describes what to test. Here's a minimal example:

{
  "schema_version": "1.0",
  "id": "my-first-test",
  "instruction": "Open the file /home/tester/notes.txt in gedit, type 'Hello from desktest', and save the file.",
  "app": {
    "type": "folder",
    "dir": "./my-app",
    "entrypoint": "start.sh"
  },
  "config": [
    {
      "type": "execute",
      "command": "echo 'initial content' > /home/tester/notes.txt"
    }
  ],
  "evaluator": {
    "mode": "programmatic",
    "metrics": [
      {
        "type": "command_output",
        "command": "cat /home/tester/notes.txt",
        "match_mode": "contains",
        "expected": "Hello from desktest"
      }
    ]
  },
  "timeout": 120
}

Key fields:

| Field | Description |
| --- | --- |
| `instruction` | Natural language prompt telling the agent what to do |
| `app` | How to deploy your application (`folder`, `appimage`, `docker_image`, etc.) |
| `config` | Setup steps run before the agent starts (create files, install packages, etc.) |
| `evaluator` | How to check if the task succeeded (`programmatic`, `llm`, or `hybrid`) |
| `timeout` | Maximum seconds for the agent loop |
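
If you need several similar tasks, generating the JSON from a script can be easier than copying files by hand. The sketch below reproduces the shape of the minimal example above; `make_task` is a hypothetical helper, and the `app` and `config` values are simply copied from that example. Run `desktest validate` on whatever you generate, since this script does not check the result against the schema.

```python
import json


def make_task(task_id, instruction, check_command, expected, timeout=120):
    """Build a task dict with the same shape as the minimal example above."""
    return {
        "schema_version": "1.0",
        "id": task_id,
        "instruction": instruction,
        "app": {"type": "folder", "dir": "./my-app", "entrypoint": "start.sh"},
        "config": [
            {
                "type": "execute",
                "command": "echo 'initial content' > /home/tester/notes.txt",
            }
        ],
        "evaluator": {
            "mode": "programmatic",
            "metrics": [
                {
                    "type": "command_output",
                    "command": check_command,
                    "match_mode": "contains",
                    "expected": expected,
                }
            ],
        },
        "timeout": timeout,
    }


if __name__ == "__main__":
    task = make_task(
        "my-first-test",
        "Open the file /home/tester/notes.txt in gedit, "
        "type 'Hello from desktest', and save the file.",
        "cat /home/tester/notes.txt",
        "Hello from desktest",
    )
    # Print the task; redirect to a file such as my-test.json to save it.
    print(json.dumps(task, indent=2))
```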

Scaffold a new task file interactively:

desktest init my-test.json

Validate it without running:

desktest validate my-test.json

6. Create a config file

For repeated runs, create a config JSON file instead of passing flags every time:

{
  "api_key": "sk-your-key-here",
  "provider": "anthropic",
  "model": "claude-sonnet-4-5-20250929"
}

Then reference it:

desktest run my-test.json --config config.json

Or use environment variables — no config file needed:

export ANTHROPIC_API_KEY="sk-your-key-here"
desktest run my-test.json
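
If you prefer a config file but do not want to type the key into it by hand, one option is to generate the file from the environment. This is a sketch under assumptions: `build_config` is a hypothetical helper, it covers only the Anthropic case shown above, and it reuses the three config keys from the example without checking whether desktest accepts others.

```python
import json
import os


def build_config(model, env=None):
    """Return a config dict for the Anthropic provider, reading the key from env."""
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    return {"api_key": key, "provider": "anthropic", "model": model}


def write_config(path, model, env=None):
    """Write the config as JSON to `path` and return the path."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_config(model, env), fh, indent=2)
    return path
```

Calling `write_config("config.json", "claude-sonnet-4-5-20250929")` would produce a file equivalent to the example above. The file still contains the key in plain text, so keep it out of version control.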

7. Replay without LLM costs

Once you have a working test, convert the agent's trajectory into a deterministic replay script:

# Convert trajectory to a Python script
desktest codify desktest_artifacts/trajectory.jsonl --overwrite my-test.json

# Re-run deterministically (no LLM, no API costs)
desktest run my-test.json --replay

Replay mode suits CI/CD well: it executes the same recorded PyAutoGUI actions without calling any LLM API.
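
In a CI job, the replay command can be wrapped so that the job's status follows the test result. The sketch below assumes that `desktest run` exits with a non-zero code when the test fails; this guide does not state that, so check it against your installation. `replay_command` and `run_replay` are hypothetical helpers.

```python
import shutil
import subprocess
import sys


def replay_command(task_file):
    """Return the replay command shown above as an argument list."""
    return ["desktest", "run", task_file, "--replay"]


def run_replay(task_file):
    """Run the replay and return the process exit code."""
    if shutil.which("desktest") is None:
        # 127 is the conventional shell code for "command not found".
        print("desktest not found on PATH", file=sys.stderr)
        return 127
    return subprocess.run(replay_command(task_file)).returncode
```

A CI entry point could then call `sys.exit(run_replay("my-test.json"))` to pass the exit code on to the runner.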

Next steps

- Browse more examples in examples/
- Run a test suite: desktest suite examples/
- Try QA mode for bug hunting: desktest run task.json --qa
- Debug interactively: desktest interactive task.json
- Set up CI integration: docs/ci.md
- Test Electron apps: examples/ELECTRON_QUICKSTART.md
- Test macOS apps: docs/macos-support.md
- Test Windows apps: dev-docs/windows-ci-guide.md
