This issue has been converted to a project [Automations v5] Support open-source self-hosted deployments
Goal
Make the automation engine usable by open-source self-hosted users, not just OpenHands Cloud. Today two things tie the codebase exclusively to Cloud infrastructure:
- Execution backend — every run provisions a fresh Cloud sandbox. Self-hosted users need to point at their own persistent agent-server instead.
- Database — the codebase requires PostgreSQL (Cloud SQL). Self-hosted users need a zero-setup local option (SQLite).
Cloud users should see zero change in behavior; the new paths are opt-in via config.
Gap 1: Pluggable Execution Backend
Currently the engine creates a Cloud sandbox per run, waits for it, discovers the agent-server URL inside, runs the tarball, then deletes the sandbox. Both modes actually talk to the same agent-server HTTP APIs — the only difference is how you obtain the URL.
| | Cloud mode (existing) | Agent-server mode (new) |
|---|---|---|
| Get agent-server URL | Create sandbox → poll → extract from exposed_urls | Read from config (AUTOMATION_AGENT_SERVER_URL) |
| Upload tarball | Same | Same |
| Start entrypoint | Same | Same |
| Cleanup | Delete sandbox | Nothing (persistent server) |
| Auth | Mint per-user API key via service key | Use config-level key |
Config surface: agent_server_url and agent_server_api_key on ServiceSettings. Presence of agent_server_url is the mode flag — no separate boolean needed.
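A minimal sketch of how the mode flag could fall out of the config, assuming a simplified stand-in for ServiceSettings. The `load_settings` helper, the `agent_server_mode` property, and the `AUTOMATION_AGENT_SERVER_API_KEY` variable name are hypothetical; only `AUTOMATION_AGENT_SERVER_URL` and the two field names come from this issue.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServiceSettings:
    """Hypothetical stand-in; the real ServiceSettings has more fields."""

    agent_server_url: Optional[str] = None
    agent_server_api_key: Optional[str] = None

    @property
    def agent_server_mode(self) -> bool:
        # Presence of the URL is the mode flag; there is no separate boolean.
        return bool(self.agent_server_url)


def load_settings(env: Mapping[str, str] = os.environ) -> ServiceSettings:
    # The API key variable name is an assumption made for this sketch.
    url = env.get("AUTOMATION_AGENT_SERVER_URL", "").strip() or None
    key = env.get("AUTOMATION_AGENT_SERVER_API_KEY", "").strip() or None
    return ServiceSettings(agent_server_url=url, agent_server_api_key=key)
```

With neither variable set, `load_settings().agent_server_mode` is false and the engine would stay on the existing Cloud path.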
Preset scripts (sdk_main.py) need to detect AGENT_SERVER_URL env var and use RemoteWorkspace instead of OpenHandsCloudWorkspace.
Detailed codebase audit (per-file changes)
automation/execution.py — main refactor target
Sandbox provisioning and agent-server interaction are currently interleaved. The refactor separates them.
Cloud-mode-only functions (sandbox lifecycle):
| Function | What it does |
|---|---|
| _create_sandbox() | POST /api/v1/sandboxes |
| _poll_sandbox() | GET /api/v1/sandboxes?id= |
| _create_and_wait() | Create + poll until RUNNING + extract agent-server URL |
| _find_agent_server_url() | Parse exposed_urls for AGENT_SERVER |
Shared functions (agent-server interaction, unchanged):
| Function | What it does |
|---|---|
| _upload() | POST /api/file/upload/{path} |
| _bash() | POST /api/bash/execute_bash_command |
| _start_bash() | POST /api/bash/start_bash_command |
| _download_in_sandbox() | curl download inside the runtime |
| build_tarball() | Build tarball in memory |
Branching needed in dispatch_automation() and run_automation(): Cloud → _create_and_wait(), agent-server → use config URL. Skip delete_sandbox() in agent-server mode.
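The branching described above could look roughly like the following. This is a sketch under stated assumptions, not the actual execution.py code: `resolve_agent_server` and `run_with_backend` are hypothetical names, the Cloud provisioning, run, and delete steps are passed in as callables, and the existing keep_alive handling is left out for brevity.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Tuple

# (sandbox_id, agent_server_url); sandbox_id is None in agent-server mode.
Provisioned = Tuple[Optional[str], str]


async def resolve_agent_server(
    agent_server_url: Optional[str],
    create_and_wait: Callable[[], Awaitable[Provisioned]],
) -> Provisioned:
    if agent_server_url:
        # Agent-server mode: the URL comes straight from config.
        return None, agent_server_url.rstrip("/")
    # Cloud mode: provision a sandbox and discover the URL inside it.
    return await create_and_wait()


async def run_with_backend(
    agent_server_url: Optional[str],
    create_and_wait: Callable[[], Awaitable[Provisioned]],
    run_on_server: Callable[[str], Awaitable[object]],
    delete_sandbox: Callable[[str], Awaitable[None]],
) -> object:
    sandbox_id, url = await resolve_agent_server(agent_server_url, create_and_wait)
    try:
        # Upload and entrypoint steps are identical in both modes.
        return await run_on_server(url)
    finally:
        # Only Cloud mode has a sandbox to delete.
        if sandbox_id is not None:
            await delete_sandbox(sandbox_id)
```

Keeping the sandbox ID as `None` in agent-server mode means downstream code that already guards on `sandbox_id` needs no extra mode check.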
Env var injection differences:
| Env var | Cloud mode | Agent-server mode |
|---|---|---|
| SANDBOX_ID | From sandbox creation | Not applicable |
| SESSION_API_KEY | From sandbox response | From config |
| OPENHANDS_API_KEY | Per-user key via service key | May use config-level key |
| OPENHANDS_CLOUD_API_URL | Cloud API URL | May not be needed |
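One way to express the per-mode differences listed above is a single builder that omits whatever does not apply. The function name `build_run_env` and its keyword arguments are hypothetical; whether OPENHANDS_API_KEY and OPENHANDS_CLOUD_API_URL are set at all in agent-server mode is still an open question, so the sketch simply treats them as optional.

```python
from typing import Dict, Optional


def build_run_env(
    *,
    session_api_key: str,
    sandbox_id: Optional[str] = None,
    openhands_api_key: Optional[str] = None,
    cloud_api_url: Optional[str] = None,
) -> Dict[str, str]:
    """Build the env vars injected into a run; unset values are omitted."""
    env = {"SESSION_API_KEY": session_api_key}
    optional = {
        "SANDBOX_ID": sandbox_id,  # Cloud mode only
        "OPENHANDS_API_KEY": openhands_api_key,  # may be absent in agent-server mode
        "OPENHANDS_CLOUD_API_URL": cloud_api_url,  # may be absent in agent-server mode
    }
    env.update({name: value for name, value in optional.items() if value})
    return env
```

In agent-server mode the caller would pass only the config-level session key, so the preset script never sees a stale SANDBOX_ID.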
automation/utils/sandbox.py — split
- utils/sandbox.py: keep as-is for Cloud mode
- utils/agent_server.py: new, shared agent-server queries (get_last_bash_command_result, verify_run_status)
automation/dispatcher.py
- Pass execution mode to dispatch_automation()
- In agent-server mode, skip get_api_key_for_automation_run() and use config-level auth
- Store command_id from start_bash_command instead of sandbox_id
automation/watchdog.py
- In agent-server mode, query configured URL directly instead of discovering sandbox
- Skip cleanup_sandbox() in agent-server mode
automation/models.py / schemas.py
- Add nullable command_id column (backward-compatible)
- Add command_id to AutomationRunResponse
automation/router.py
Likely zero changes: the existing guard, if not run.keep_alive and run.sandbox_id, already skips cleanup when sandbox_id is None, which is always the case in agent-server mode.
Preset scripts (sdk_main.py)
Detect AGENT_SERVER_URL env var → use RemoteWorkspace; otherwise use OpenHandsCloudWorkspace.
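The detection step might be as small as the sketch below. It deliberately returns only the name of the workspace class and the URL rather than constructing RemoteWorkspace or OpenHandsCloudWorkspace, because their constructor arguments are not specified in this issue; `WorkspaceChoice` and `select_workspace` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class WorkspaceChoice:
    kind: str  # "RemoteWorkspace" or "OpenHandsCloudWorkspace"
    agent_server_url: Optional[str]


def select_workspace(env: Mapping[str, str] = os.environ) -> WorkspaceChoice:
    url = env.get("AGENT_SERVER_URL", "").strip()
    if url:
        # Agent-server mode: talk to the persistent server directly.
        return WorkspaceChoice("RemoteWorkspace", url)
    # Default: existing Cloud behavior, unchanged.
    return WorkspaceChoice("OpenHandsCloudWorkspace", None)
```

Because the Cloud branch is the fallback, existing Cloud runs that never set AGENT_SERVER_URL keep their current behavior.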
Unchanged files
scheduler.py, event_router.py, webhook_router.py, trigger_matcher.py, filter_eval.py, event_schemas/, preset_router.py, uploads.py, storage/, db.py, logger.py, auth.py, utils/cron.py, utils/tarball_validation.py, utils/time.py, utils/run.py, exceptions.py
Gap 2: Local Database (SQLite)
The codebase requires PostgreSQL exclusively — it uses JSONB columns, FOR UPDATE SKIP LOCKED row locking, advisory locks in migrations, and hardcoded asyncpg/pg8000 drivers.
For self-hosted deployments, support SQLite as a lightweight local alternative:
- New AUTOMATION_DB_URL config setting accepting sqlite+aiosqlite:/// URLs
- Use SQLAlchemy's generic JSON type instead of PostgreSQL-specific JSONB
- Skip FOR UPDATE SKIP LOCKED on SQLite (not needed for single-process)
- Auto-create tables on startup for SQLite (bypassing PG-specific Alembic migrations)
SQLite is appropriate for local dev and small-scale deployments. PostgreSQL remains the default for production.
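The dialect-dependent decisions above could be centralized in one place keyed off the database URL. The sketch below uses only the standard library and invents the names `DbCapabilities` and `db_capabilities`; the real code would feed these flags into its SQLAlchemy queries and startup logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbCapabilities:
    dialect: str
    use_skip_locked: bool  # emit FOR UPDATE SKIP LOCKED when claiming rows
    auto_create_tables: bool  # create tables on startup instead of running Alembic


def db_capabilities(db_url: str) -> DbCapabilities:
    # "sqlite+aiosqlite:///./automation.db" -> scheme "sqlite+aiosqlite" -> dialect "sqlite"
    scheme = db_url.partition(":")[0]
    dialect = scheme.split("+", 1)[0].lower()
    if dialect == "sqlite":
        # Single-process deployment: no row locking, no PG-specific migrations.
        return DbCapabilities("sqlite", use_skip_locked=False, auto_create_tables=True)
    if dialect == "postgresql":
        return DbCapabilities("postgresql", use_skip_locked=True, auto_create_tables=False)
    raise ValueError(f"unsupported database dialect: {dialect!r}")
```

For example, `db_capabilities("sqlite+aiosqlite:///./automation.db").use_skip_locked` is false, while a `postgresql+asyncpg://` URL keeps the current locking and migration behavior.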
Implementation Plan
- Extract shared agent-server module: move shared functions out of utils/sandbox.py
- Add config settings: agent_server_url, agent_server_api_key, db_url
- Branch execution.py: Cloud vs. agent-server mode in dispatch/run functions
- Branch watchdog & dispatcher: mode-aware verification and cleanup
- Preset script dual-mode: RemoteWorkspace vs. OpenHandsCloudWorkspace
- SQLite backend: generic JSON types, conditional row locking, auto-create tables
- DB migration + tests: add command_id column, test fixtures for both modes
Open Questions
- Working directory isolation: in agent-server mode, multiple runs share the filesystem. Use per-run dirs?
- Concurrency: does the agent-server support multiple concurrent start_bash_command calls?
- LLM/Secrets/MCP config: without Cloud API, how do preset scripts get LLM config? Env vars, config file, or hybrid mode?
- Cleanup between runs: who removes temp files? Per-run working directories?
- Auth model: does the persistent agent-server use X-Session-API-Key? Needs to be configurable.
- Hybrid mode: agent-server execution + Cloud API for get_llm() / get_secrets()?
This issue was created by an AI assistant (OpenHands) on behalf of @xingyaoww, based on a discussion about making the automation engine usable by open source self-hosted deployments alongside the existing Cloud sandbox mode.