Support open-source self-hosted deployments: pluggable execution backend + local database (converted to project) #62

This issue has been converted to a project: [[Automations v5] Support open-source self-hosted deployments](https://linear.app/all-hands-ai/project/automations-v5-support-open-source-self-hosted-deployments-592998ff560c/overview).

---

## Goal

Make the automation engine usable by **open-source self-hosted users**, not just OpenHands Cloud. Today two things tie the codebase exclusively to Cloud infrastructure:

1. **Execution backend** — every run provisions a fresh Cloud sandbox. Self-hosted users need to point at their own persistent agent-server instead.
2. **Database** — the codebase requires PostgreSQL (Cloud SQL). Self-hosted users need a zero-setup local option (SQLite).

Cloud users should see **zero change** in behavior; the new paths are opt-in via config.

---

## Gap 1: Pluggable Execution Backend

Currently the engine creates a Cloud sandbox per run, waits for it, discovers the agent-server URL inside, runs the tarball, then deletes the sandbox. Both modes actually talk to the **same agent-server HTTP APIs** — the only difference is how you obtain the URL.


|  | Cloud mode (existing) | Agent-server mode (new) |
| -- | -- | -- |
| Get agent-server URL | Create sandbox → poll → extract from `exposed_urls` | Read from config (`AUTOMATION_AGENT_SERVER_URL`) |
| Upload tarball | Same | Same |
| Start entrypoint | Same | Same |
| Cleanup | Delete sandbox | Nothing (persistent server) |
| Auth | Mint per-user API key via service key | Use config-level key |

**Config surface**: `agent_server_url` and `agent_server_api_key` on `ServiceSettings`. Presence of `agent_server_url` is the mode flag — no separate boolean needed.
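
A minimal sketch of the "presence of the URL is the mode flag" idea. The class below is a plain-dataclass stand-in, not the project's actual `ServiceSettings`; the `uses_agent_server` property, the `from_env` helper, and the `AUTOMATION_AGENT_SERVER_API_KEY` variable name are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServiceSettings:
    """Hypothetical stand-in for the real settings class."""

    agent_server_url: Optional[str] = None
    agent_server_api_key: Optional[str] = None

    @property
    def uses_agent_server(self) -> bool:
        # Presence of the URL is the mode flag; there is no separate boolean.
        return bool(self.agent_server_url)

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "ServiceSettings":
        # AUTOMATION_AGENT_SERVER_URL is named in the table above;
        # AUTOMATION_AGENT_SERVER_API_KEY is an assumed name.
        # Empty strings are treated as "not set".
        return cls(
            agent_server_url=env.get("AUTOMATION_AGENT_SERVER_URL") or None,
            agent_server_api_key=env.get("AUTOMATION_AGENT_SERVER_API_KEY") or None,
        )
```

With nothing set, `uses_agent_server` is false and the existing Cloud path applies, which is what keeps the new mode opt-in.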

**Preset scripts** (`sdk_main.py`) need to detect the `AGENT_SERVER_URL` env var and use `RemoteWorkspace` instead of `OpenHandsCloudWorkspace` when it is set.
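
The detection could be as small as the sketch below. It only decides which workspace class to use and returns its name; it deliberately does not construct `RemoteWorkspace` or `OpenHandsCloudWorkspace`, since their constructor arguments are not specified in this issue. `select_workspace_kind` is a hypothetical helper name.

```python
import os
from typing import Mapping


def select_workspace_kind(env: Mapping[str, str] = os.environ) -> str:
    """Return the name of the workspace class a preset script should use."""
    # A non-empty AGENT_SERVER_URL means the script was started in agent-server mode.
    if env.get("AGENT_SERVER_URL"):
        return "RemoteWorkspace"
    # Otherwise fall back to the existing Cloud behavior.
    return "OpenHandsCloudWorkspace"
```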

<details><summary>Detailed codebase audit (per-file changes)</summary>

### `automation/execution.py` — main refactor target

Sandbox provisioning and agent-server interaction are currently interleaved. The refactor separates them.

**Cloud-mode-only functions** (sandbox lifecycle):


| Function | What it does |
| -- | -- |
| `_create_sandbox()` | `POST /api/v1/sandboxes` |
| `_poll_sandbox()` | `GET /api/v1/sandboxes?id=` |
| `_create_and_wait()` | Create + poll until RUNNING + extract agent-server URL |
| `_find_agent_server_url()` | Parse `exposed_urls` for `AGENT_SERVER` |

**Shared functions** (agent-server interaction, unchanged):


| Function | What it does |
| -- | -- |
| `_upload()` | `POST /api/file/upload/{path}` |
| `_bash()` | `POST /api/bash/execute_bash_command` |
| `_start_bash()` | `POST /api/bash/start_bash_command` |
| `_download_in_sandbox()` | curl download inside the runtime |
| `build_tarball()` | Build tarball in memory |

**Branching needed** in `dispatch_automation()` and `run_automation()`: Cloud → `_create_and_wait()`, agent-server → use config URL. Skip `delete_sandbox()` in agent-server mode.
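
One way the branch could look, as a sketch under assumptions: `resolve_target` and `run_once` are hypothetical names, and the sandbox lifecycle and execution steps are passed in as callables because the real signatures of `_create_and_wait()` and `delete_sandbox()` are not shown in this issue. The sketch also ignores `keep_alive`.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Tuple

# (agent_server_url, sandbox_id); sandbox_id is None in agent-server mode.
Target = Tuple[str, Optional[str]]


async def resolve_target(
    agent_server_url: Optional[str],
    create_and_wait: Callable[[], Awaitable[Target]],
) -> Target:
    """Pick the agent-server URL: configured URL if present, else a new sandbox."""
    if agent_server_url:
        return agent_server_url, None
    return await create_and_wait()


async def run_once(
    agent_server_url: Optional[str],
    create_and_wait: Callable[[], Awaitable[Target]],
    execute: Callable[[str], Awaitable[None]],
    delete_sandbox: Callable[[str], Awaitable[None]],
) -> Target:
    url, sandbox_id = await resolve_target(agent_server_url, create_and_wait)
    try:
        # Upload and start steps are identical in both modes.
        await execute(url)
    finally:
        # Only Cloud mode owns a sandbox, so only Cloud mode cleans up.
        if sandbox_id is not None:
            await delete_sandbox(sandbox_id)
    return url, sandbox_id
```

Because `sandbox_id` stays `None` in agent-server mode, the cleanup guard needs no explicit mode check.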

**Env var injection differences:**


| Env var | Cloud mode | Agent-server mode |
| -- | -- | -- |
| `SANDBOX_ID` | From sandbox creation | Not applicable |
| `SESSION_API_KEY` | From sandbox response | From config |
| `OPENHANDS_API_KEY` | Per-user key via service key | May use config-level key |
| `OPENHANDS_CLOUD_API_URL` | Cloud API URL | May not be needed |
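
The table above could translate into an env builder along these lines. This is a sketch, not the engine's code: `build_run_env` is a hypothetical helper, and injecting `AGENT_SERVER_URL` in agent-server mode is an assumption inferred from the preset scripts needing to detect that variable.

```python
from typing import Dict, Optional


def build_run_env(
    *,
    session_api_key: str,
    openhands_api_key: Optional[str] = None,
    sandbox_id: Optional[str] = None,
    cloud_api_url: Optional[str] = None,
    agent_server_url: Optional[str] = None,
) -> Dict[str, str]:
    """Build the env for a run, omitting variables that do not apply to the mode."""
    env = {"SESSION_API_KEY": session_api_key}
    optional = {
        "SANDBOX_ID": sandbox_id,                  # Cloud mode only
        "OPENHANDS_API_KEY": openhands_api_key,    # may be a config-level key
        "OPENHANDS_CLOUD_API_URL": cloud_api_url,  # may not be needed without Cloud
        "AGENT_SERVER_URL": agent_server_url,      # assumed: agent-server mode only
    }
    # Unset values are left out entirely rather than injected as empty strings.
    env.update({k: v for k, v in optional.items() if v is not None})
    return env
```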

### `automation/utils/sandbox.py` — split

* `utils/sandbox.py` — keep as-is for Cloud mode
* `utils/agent_server.py` — new, shared agent-server queries (`get_last_bash_command_result`, `verify_run_status`)

### `automation/dispatcher.py`

* Pass execution mode to `dispatch_automation()`
* In agent-server mode, skip `get_api_key_for_automation_run()` and use config-level auth
* Store `command_id` from `start_bash_command` instead of `sandbox_id`

### `automation/watchdog.py`

* In agent-server mode, query configured URL directly instead of discovering sandbox
* Skip `cleanup_sandbox()` in agent-server mode

### `automation/models.py` / `schemas.py`

* Add nullable `command_id` column (backward-compatible)
* Add `command_id` to `AutomationRunResponse`

### `automation/router.py`

Likely zero changes — existing `if not run.keep_alive and run.sandbox_id` guard already skips cleanup when `sandbox_id` is None.

### Preset scripts (`sdk_main.py`)

Detect `AGENT_SERVER_URL` env var → use `RemoteWorkspace`; otherwise use `OpenHandsCloudWorkspace`.

### Unchanged files

`scheduler.py`, `event_router.py`, `webhook_router.py`, `trigger_matcher.py`, `filter_eval.py`, `event_schemas/`, `preset_router.py`, `uploads.py`, `storage/`, `db.py`, `logger.py`, `auth.py`, `utils/cron.py`, `utils/tarball_validation.py`, `utils/time.py`, `utils/run.py`, `exceptions.py`

</details>

---

## Gap 2: Local Database (SQLite)

The codebase requires PostgreSQL exclusively — it uses `JSONB` columns, `FOR UPDATE SKIP LOCKED` row locking, advisory locks in migrations, and hardcoded `asyncpg`/`pg8000` drivers.

For self-hosted deployments, support **SQLite** as a lightweight local alternative:

* New `AUTOMATION_DB_URL` config setting accepting `sqlite+aiosqlite:///` URLs
* Use SQLAlchemy's generic `JSON` type instead of PostgreSQL-specific `JSONB`
* Skip `FOR UPDATE SKIP LOCKED` on SQLite (SQLite has no `SELECT ... FOR UPDATE`, and row locking is not needed for a single-process deployment)
* Auto-create tables on startup for SQLite (bypassing PG-specific Alembic migrations)
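
The dialect-dependent pieces can be illustrated with the standard library alone. The sketch below uses plain `sqlite3` rather than SQLAlchemy to stay self-contained; the `runs` table, its columns, and the helper names are hypothetical, and JSON is stored as text.

```python
import json
import sqlite3


def dialect_of(db_url: str) -> str:
    """Extract the dialect from a URL such as 'sqlite+aiosqlite:///x.db'."""
    return db_url.split("://", 1)[0].split("+", 1)[0]


def claim_pending_sql(dialect: str) -> str:
    """Build the 'claim next run' query, adding row locking only on PostgreSQL."""
    sql = "SELECT id, payload FROM runs WHERE status = 'PENDING' ORDER BY id LIMIT 1"
    if dialect == "postgresql":
        sql += " FOR UPDATE SKIP LOCKED"
    return sql


def create_tables(conn: sqlite3.Connection) -> None:
    """Auto-create the schema on startup instead of running migrations."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY, status TEXT NOT NULL, payload TEXT NOT NULL)"
    )


# Usage against an in-memory database.
conn = sqlite3.connect(":memory:")
create_tables(conn)
conn.execute(
    "INSERT INTO runs (status, payload) VALUES (?, ?)",
    ("PENDING", json.dumps({"trigger": "cron"})),
)
row = conn.execute(claim_pending_sql(dialect_of("sqlite+aiosqlite:///:memory:"))).fetchone()
```

In the real codebase the same decision would presumably hang off the SQLAlchemy engine's dialect name rather than string-parsing the URL.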

SQLite is appropriate for local dev and small-scale deployments. PostgreSQL remains the default for production.

---

## Implementation Plan

1. **Extract shared agent-server module** — move shared functions out of `utils/sandbox.py`
2. **Add config settings** — `agent_server_url`, `agent_server_api_key`, `db_url`
3. **Branch `execution.py`** — Cloud vs. agent-server mode in dispatch/run functions
4. **Branch watchdog & dispatcher** — mode-aware verification and cleanup
5. **Preset script dual-mode** — `RemoteWorkspace` vs. `OpenHandsCloudWorkspace`
6. **SQLite backend** — generic JSON types, conditional row locking, auto-create tables
7. **DB migration + tests** — add `command_id` column, test fixtures for both modes

---

## Open Questions

1. **Working directory isolation** — in agent-server mode, multiple runs share the filesystem. Use per-run dirs?
2. **Concurrency** — does the agent-server support multiple concurrent `start_bash_command` calls?
3. **LLM/Secrets/MCP config** — without Cloud API, how do preset scripts get LLM config? Env vars, config file, or hybrid mode?
4. **Cleanup between runs** — who removes temp files? Per-run working directories?
5. **Auth model** — does the persistent agent-server use `X-Session-API-Key`? Needs to be configurable.
6. **Hybrid mode** — agent-server execution + Cloud API for `get_llm()` / `get_secrets()`?

---

*This issue was created by an AI assistant (OpenHands) on behalf of @xingyaoww, based on a discussion about making the automation engine usable by open source self-hosted deployments alongside the existing Cloud sandbox mode.*
