mOSdat

Multi-OS Desktop App Testing Framework

Automated testing infrastructure using Proxmox VMs to validate desktop applications across Linux distributions, Windows desktops, display servers, and GPU configurations.

Supersedes the archived electron-linux-testing Vagrant prototype (Jan 2026).

Install

pip install -e .
mosdat --help

Typical local development:

python -m pytest -q
mosdat validate examples/rocketchat.toml
mosdat list-vms examples/rocketchat.toml

Run via Docker

A Dockerfile is provided for containerized execution:

# Build locally
docker build -t mosdat:dev .

# Run help
docker run --rm mosdat:dev

# Run with a config file
docker run --rm -v "$(pwd)/myconfig.toml:/app/myconfig.toml" mosdat:dev \
  functional /app/myconfig.toml --vms ubuntu2404

To use Docker images from the registry (when published):

# Pull from registry (forward-looking; not yet published)
docker pull ghcr.io/jeanfbrito/mosdat:latest

# Run the smoke test scenario
docker run --rm ghcr.io/jeanfbrito/mosdat:latest \
  functional examples/rocketchat.toml --vms ubuntu2404 --test rocketchat-smoke-linux

Overview

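Testing desktop apps properly requires real environments: different distros, display servers, Windows releases, and GPU configurations. Containers share the host kernel, so they cannot stand in for a full guest OS with its own desktop session and GPU driver stack. Manual testing does not scale across that matrix.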

mOSdat uses Proxmox to orchestrate VMs, drive real desktops over VNC/SSH, pass through NVIDIA GPUs via VFIO when needed, and collect reproducible artifacts for triage.

┌─────────────────────────────────────────────────────────────────────────┐
│                              mOSdat                                     │
│                                                                         │
│   ┌─────────┐    ┌──────────────┐    ┌─────────────────────────────┐   │
│   │ mosdat  │───▶│   Proxmox    │───▶│         Test VMs            │   │
│   │   CLI   │    │  Orchestrator│    │  ┌───────┐  ┌───────┐       │   │
│   └─────────┘    └──────────────┘    │  │Fedora │  │Ubuntu │  ...  │   │
│                         │            │  │+GPU   │  │+GPU   │       │   │
│                         │            │  │+Wayland│ │+X11   │       │   │
│                         ▼            │  └───────┘  └───────┘       │   │
│                  ┌──────────────┐    └─────────────────────────────┘   │
│                  │   Results    │                   │                  │
│                  │    Report    │◀──────────────────┘                  │
│                  └──────────────┘                                      │
└─────────────────────────────────────────────────────────────────────────┘
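
The flow in the diagram (one CLI invocation fanning out to several VMs, with results collected into a single report) can be sketched as follows. This is a minimal illustration, not mOSdat's runner: `VMResult`, `run_matrix`, and `summarize` are hypothetical names, and the per-VM work is passed in as a plain callable.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class VMResult:
    """Outcome of one VM in the matrix (hypothetical structure)."""
    vm: str
    status: str  # "PASS" or "FAIL"
    detail: str = ""


def run_matrix(vms: Iterable[str], run_on_vm: Callable[[str], None]) -> list[VMResult]:
    """Run the same callable against every VM and record one result per VM.

    A failure on one VM is captured and does not stop the remaining VMs.
    """
    results: list[VMResult] = []
    for vm in vms:
        try:
            run_on_vm(vm)
        except Exception as exc:  # record the failure instead of aborting the matrix
            results.append(VMResult(vm, "FAIL", str(exc)))
        else:
            results.append(VMResult(vm, "PASS"))
    return results


def summarize(results: Iterable[VMResult]) -> dict[str, int]:
    """Count results by status for a one-line report."""
    return dict(Counter(r.status for r in results))
```

Calling `run_matrix(["ubuntu2404", "windows11"], some_callable)` returns one `VMResult` per VM in input order, which keeps the report stable between runs.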

Features

GPU Passthrough — Real NVIDIA GPUs via VFIO, not emulated

Display Server Matrix — Native Wayland, X11, XWayland, and misconfigured environments

Linux + Windows VMs — Shared scenario runner for Linux desktops plus Windows 10/11 functional coverage

Full Pipeline — Build from git ref → deploy to VM → run tests → collect results

Accessibility-first UI Automation — Use AT-SPI role/name targeting on Linux when available, with VLM localization as fallback

VLM Functional Testing — Drive real desktops through Proxmox VNC, with VLM localize/verify steps that work across X11, Wayland, and Windows

Live Triage Dashboard — Watch current and historical functional runs, stale/dead runs, failures, screenshots, and step timelines from a LAN web UI

Author Workbench + Agent API — Create reusable VLM test flows from a browser or via mosdat author, including manual coordinate picking, hover, left/right click, type, key, wait, shell, launch, draft-step JSON editing, validation, and YAML export

Preflight, Replay, Doctor — Validate scenario/VM readiness, replay cached VLM checks, and diagnose VM health without rerunning a full matrix

Reproducible — Same VM snapshot, same test sequence, consistent results
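
The "accessibility first, VLM as fallback" idea from the feature list above is an ordered-fallback lookup. The sketch below shows only that pattern: `locate` and the locator callables are hypothetical, and the real AT-SPI and VLM calls are deliberately left out.

```python
from typing import Callable, Optional, Sequence

Point = tuple[int, int]
# A locator takes a target description and returns screen coordinates,
# or None when it cannot find the target.
Locator = Callable[[str], Optional[Point]]


def locate(target: str, locators: Sequence[tuple[str, Locator]]) -> tuple[str, Point]:
    """Try each named locator in order and return the first hit.

    Returns (locator_name, point) so the caller can log which strategy matched.
    Raises LookupError when every locator returns None.
    """
    for name, find in locators:
        point = find(target)
        if point is not None:
            return name, point
    tried = ", ".join(name for name, _ in locators)
    raise LookupError(f"target {target!r} not found (tried: {tried})")
```

Listing the AT-SPI locator before the VLM locator means the cheaper, deterministic lookup wins whenever the accessibility tree exposes the element.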

Common Workflows

Recommended authoring workflow: Author routines first (shared/routines/), then compose scenarios that call them. See docs/AUTO-AUTHORING.md.

Run a functional VLM smoke test:

mosdat functional examples/rocketchat.toml --vms ubuntu2404 --test rocketchat-smoke-linux

Build a Rocket.Chat Electron PR, deploy it, and verify the tested app contains the expected symbol:

mosdat build --pr 3325 --target deb --deploy ubuntu2204,ubuntu2404 \
  --verify-symbol isTelephonyEnabled
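
How `--verify-symbol` inspects the deployed app is not described here. As an illustration of the general idea only, the sketch below searches every file under a directory for a symbol's bytes. The function names and the chunked byte search are assumptions, not the tool's implementation.

```python
from pathlib import Path


def file_contains(path: Path, needle: bytes, chunk_size: int = 1 << 20) -> bool:
    """Return True if `needle` occurs in the file, reading it in chunks.

    The last len(needle) - 1 bytes of each window are carried over so a match
    that straddles a chunk boundary is still found.
    """
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1
    tail = b""
    with path.open("rb") as fh:
        while chunk := fh.read(chunk_size):
            window = tail + chunk
            if needle in window:
                return True
            tail = window[-overlap:] if overlap else b""
    return False


def contains_symbol(root: Path, symbol: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if any regular file under `root` contains `symbol` as UTF-8 bytes."""
    needle = symbol.encode("utf-8")
    return any(
        file_contains(path, needle, chunk_size)
        for path in root.rglob("*")
        if path.is_file()
    )
```

A byte search like this only works when the symbol survives bundling and minification unchanged, so it is a presence check rather than proof that the code path is active.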

Preflight a functional scenario before spending VM/VLM time:

mosdat preflight examples/rocketchat.toml \
  --vms ubuntu2404 \
  --test rocketchat-smoke-linux

Inspect the live Linux accessibility tree for semantic selectors:

mosdat atspi-dump examples/rocketchat.toml --vms ubuntu2404 --format tree

Diagnose VM and host health:

mosdat doctor examples/rocketchat.toml --vms ubuntu2404

Run a functional test with session recording at 10 fps plus a GIF export (frames are change-filtered to keep artifacts small):

mosdat functional examples/rocketchat.toml \
  --vms windows11 \
  --test rocketchat-smoke \
  --record-fps 10 \
  --record-gif

Recording is on by default. Use --no-record-session to opt out.
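
"Change-filtered" recording can be pictured as dropping frames that are identical to the previous one. The sketch below is an assumption about the general technique, not mOSdat's recorder: `filter_changed` is a hypothetical name, and it compares exact frame bytes, whereas a real recorder may tolerate small pixel differences.

```python
import hashlib
from typing import Iterable, Iterator


def filter_changed(frames: Iterable[bytes]) -> Iterator[tuple[int, bytes]]:
    """Yield (index, frame) only when the frame differs from the one before it.

    Frames are compared by SHA-256 digest, so only one digest is kept in memory
    rather than the previous frame itself.
    """
    last_digest = None
    for index, frame in enumerate(frames):
        digest = hashlib.sha256(frame).digest()
        if digest != last_digest:
            last_digest = digest
            yield index, frame
```

Keeping the original index alongside each surviving frame lets a player reconstruct timing from the capture rate even though static periods were dropped.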

Serve the live dashboard and authoring workbench:

mosdat live --port 8082 --results results --config examples/rocketchat.toml

Open:

Runs dashboard: http://<host>:8082/
Author Workbench: http://<host>:8082/author
Recording artifacts: open from the run cards or under http://<host>:8082/artifact/...

Use the agent authoring API through the CLI:

mosdat author --url http://127.0.0.1:8082 vms
mosdat author --url http://127.0.0.1:8082 doctor
# doctor includes a non-blocking verify_model_configured warning when yes/no checks reuse the localize model
mosdat author --url http://127.0.0.1:8082 start --vm ubuntu2404
mosdat author --url http://127.0.0.1:8082 capture --session <session-id> --output /tmp/screen.bmp
mosdat author --url http://127.0.0.1:8082 localize --session <session-id> --prompt "help tooltip"
mosdat author --url http://127.0.0.1:8082 describe --session <session-id> --x 120 --y 240
mosdat author --url http://127.0.0.1:8082 click --session <session-id> --x 5 --y 6 --prompt "help tooltip"
mosdat author --url http://127.0.0.1:8082 prompt-click --session <session-id> --prompt "help tooltip"
mosdat author --url http://127.0.0.1:8082 prompt-hover --session <session-id> --prompt "help tooltip"
mosdat author --url http://127.0.0.1:8082 prompt-type --session <session-id> --prompt "message box" --text "hello"
mosdat author --url http://127.0.0.1:8082 type --session <session-id> --text "hello"
mosdat author --url http://127.0.0.1:8082 key --session <session-id> --key enter
mosdat author --url http://127.0.0.1:8082 validate --session <session-id>
mosdat author --url http://127.0.0.1:8082 export --session <session-id> --name tooltip-flow
mosdat author --url http://127.0.0.1:8082 export --session <session-id> --name tooltip-flow --output shared/scenarios/functional/tooltip-flow.yaml
mosdat author --url http://127.0.0.1:8082 step --session <session-id> --json '{"key":"escape"}'
mosdat author --url http://127.0.0.1:8082 step --session <session-id> --steps-json '[{"key":"escape"},{"wait":1}]'
mosdat author --url http://127.0.0.1:8082 close --session <session-id>
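
When an agent or script drives the `step` subcommand, hand-quoting JSON in a shell is error-prone. The sketch below builds the argument list with `json.dumps` instead. It uses only the `--json` and `--steps-json` flags shown above; the helper name `author_step_command` is hypothetical.

```python
import json
import shlex


def author_step_command(url: str, session_id: str, steps: list[dict]) -> list[str]:
    """Build the argv for `mosdat author ... step` from Python dicts.

    One step is sent with --json, several steps with --steps-json,
    mirroring the two invocations shown above.
    """
    if not steps:
        raise ValueError("at least one step is required")
    if len(steps) == 1:
        flag, payload = "--json", json.dumps(steps[0], separators=(",", ":"))
    else:
        flag, payload = "--steps-json", json.dumps(steps, separators=(",", ":"))
    return ["mosdat", "author", "--url", url, "step", "--session", session_id, flag, payload]


# Example: print a shell-safe command line without executing anything.
# "abc123" is a placeholder session id.
cmd = author_step_command("http://127.0.0.1:8082", "abc123", [{"key": "escape"}, {"wait": 1}])
print(shlex.join(cmd))
```

Passing the list directly to `subprocess.run(cmd)` avoids the shell entirely; `shlex.join` is only needed when the command has to be shown or logged as one string.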

Generate the static historical dashboard:

mosdat dashboard --root results --output results/functional/dashboard.html

Replay a cached VLM verification against an existing result directory:

mosdat replay results/functional/<run-dir>/<vm> --step 5

Results

Validated a Wayland compatibility fix for Rocket.Chat Desktop:

| Scenario | Before Fix | After Fix |
| --- | --- | --- |
| Real Wayland session | PASS | PASS |
| Fake Wayland socket | SEGFAULT | PASS |
| Missing display variable | SEGFAULT | PASS |
| X11 fallback | SEGFAULT | PASS |
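
The four scenarios differ only in what the session environment advertises. A Wayland client normally finds its compositor through a socket named by `WAYLAND_DISPLAY` inside `XDG_RUNTIME_DIR`, and an X11 client uses `DISPLAY`. The sketch below classifies an environment along those lines; the function name and the returned labels are illustrative and are not taken from mOSdat or from the fix itself.

```python
import stat
from pathlib import Path
from typing import Mapping


def classify_display(env: Mapping[str, str]) -> str:
    """Classify a session environment as one of four illustrative labels.

    'wayland-real'  WAYLAND_DISPLAY is set and resolves to an existing socket
    'wayland-fake'  WAYLAND_DISPLAY is set but no socket exists at that path
    'x11'           no WAYLAND_DISPLAY, but DISPLAY is set
    'no-display'    neither variable is set
    """
    name = env.get("WAYLAND_DISPLAY")
    if name:
        sock = Path(name)
        if not sock.is_absolute():
            runtime = env.get("XDG_RUNTIME_DIR")
            if not runtime:
                return "wayland-fake"  # a relative name cannot be resolved
            sock = Path(runtime) / name
        try:
            is_socket = stat.S_ISSOCK(sock.stat().st_mode)
        except OSError:  # missing path, permission problem, and so on
            is_socket = False
        return "wayland-real" if is_socket else "wayland-fake"
    if env.get("DISPLAY"):
        return "x11"
    return "no-display"
```

An existing socket file does not prove a compositor is still listening on it, so a robust client still has to handle a failed connection after this check.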

Historical GPU Passthrough Test Results

Real hardware validation with NVIDIA RTX 3060 via VFIO:

| OS | gpu-wayland-real | gpu-wayland-fake | gpu-x11 | gpu-wayland-nodisp |
| --- | --- | --- | --- | --- |
| Fedora 42 | PASS | PASS | PASS | PASS |
| Ubuntu 22.04 | SKIP (X11 default) | PASS | PASS | PASS |
| Ubuntu 24.04 | PASS | PASS | PASS | PASS |
| openSUSE Leap 16.0 | SKIP (X11 default) | PASS | PASS | N/A |
| Manjaro Linux 26.0.1 | PASS | PASS | PASS | N/A |

See Test Matrix and Case Studies for details.
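
Before a passthrough run, it can help to confirm on the Proxmox host that the GPU is bound to `vfio-pci`. On Linux, the bound driver is exposed as the `driver` symlink under the device's sysfs directory. The sketch below reads that link. It is a host-side helper that is independent of the `mosdat` CLI, the function names are hypothetical, and `0000:01:00.0` in the usage note is a placeholder PCI address.

```python
import os
from pathlib import Path
from typing import Optional


def bound_driver(pci_address: str, sysfs_root: Path = Path("/sys")) -> Optional[str]:
    """Return the name of the kernel driver bound to a PCI device, or None.

    Reads <sysfs_root>/bus/pci/devices/<address>/driver, which is a symlink to
    the driver directory when a driver is bound and is absent otherwise.
    """
    link = sysfs_root / "bus" / "pci" / "devices" / pci_address / "driver"
    if not link.is_symlink():
        return None
    return os.path.basename(os.readlink(link))


def is_vfio_bound(pci_address: str, sysfs_root: Path = Path("/sys")) -> bool:
    """True when the device is currently bound to the vfio-pci driver."""
    return bound_driver(pci_address, sysfs_root) == "vfio-pci"
```

For example, `is_vfio_bound("0000:01:00.0")` on the host reports whether that device is ready to be handed to a guest. The `sysfs_root` parameter exists so the function can be exercised against a fake directory tree.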

Tested Platforms

| Platform | Desktop | Package formats | Scenario coverage | Status |
| --- | --- | --- | --- | --- |
| Fedora 42 | GNOME (Wayland) | RPM, AppImage, Flatpak | Smoke + TEL QA | Complete |
| Ubuntu 22.04 LTS | GNOME (X11) | DEB, AppImage, Flatpak, Snap | Smoke + TEL QA | Complete |
| Ubuntu 24.04 LTS | GNOME (Wayland) | DEB, AppImage, Flatpak, Snap | Smoke + TEL QA | Complete |
| openSUSE Leap 16.0 | KDE (X11) | RPM, AppImage, Flatpak | Smoke + TEL QA | Complete |
| Manjaro Linux 26.0.1 | KDE (Wayland) | AppImage, Flatpak | Smoke + TEL QA | Complete |
| Windows 10 | Windows desktop | EXE | Smoke + TEL QA | Configured |
| Windows 11 | Windows desktop | EXE | Smoke + TEL QA | Configured |

Notes:

All 5 target Linux distributions are fully tested with real GPU passthrough
openSUSE uses the open-source nouveau driver with software rendering
Manjaro runs the latest kernel (6.18) with KDE Plasma on Wayland
Windows 10/11 VMs are configured for functional scenario coverage and EXE install flows

See Linux Coverage Strategy for why these distributions were selected.

Documentation

| Document | Description |
| --- | --- |
| Architecture | System design |
| Hardware | Test environment specs |
| Linux Coverage | Distribution selection strategy |
| Test Matrix | Test results by OS |
| Proxmox Setup | VFIO and GPU passthrough |
| Case Studies | Test examples |
| Functional Linux Tests | Linux AT-SPI selectors, VNC input, and VLM verification |
| AT-SPI Authoring | Accessibility-first Linux selector workflow |
| Reusable Routines | Shared scenario routine library |
| Live Dashboard | Real-time triage dashboard and Author Workbench |
| Matrix Run | Current matrix execution runbook |
| Agent Monitoring | Long-running run monitoring patterns |
| Visual Regression | Screenshot reference capture/check workflow |
| Completion Criteria | Done criteria for OS/package/GPU coverage |
| Triage | Failure triage and exit-code interpretation |
| Auto-Authoring | Generate functional test YAMLs from code changes via `mosdat draft` |
| Issue Confirmation | GitHub issue confirmation workflow |
| Troubleshooting | Common issues |

Built With

Proxmox VE — VM orchestration
VFIO/IOMMU — GPU passthrough
Python 3.11+ — CLI, runner, and scenario orchestration
opencode + oh-my-opencode
