branch: project-sandbox-deps
Spec: Project-Level Sandbox Dependencies
Overview
Allow projects to declare additional packages or custom Dockerfile layers that get installed into the sandbox image. Dependencies are declared in a .agent-loop/ directory in the project root (not the worktree), so they are shared and cached across all worktrees for the same project.
Two modes are supported, with a clear priority:
1. Dockerfile.sandbox (full control): takes precedence if present
2. dependencies (simple package list): auto-generates a Dockerfile layer
Both use ARG BASE_IMAGE / FROM ${BASE_IMAGE} so they are agent-agnostic — ralph injects the correct base image via --build-arg at build time. The same project config works regardless of which agent is used.
Project-level images are tagged as agent-loop-sandbox-{agent}-{project}:v{hash} where {project} is derived from the repo directory name and {hash} is an 8-character content hash of the base image tag + project file content.
Additionally, the existing base image content hash is shortened from 12 to 8 characters for readability.
Architecture
Image layering:
docker/sandbox-templates:claude-code (upstream base)
│
▼
docker/agent-loop/claude/Dockerfile (agent base — build-essential, jq, etc.)
│
▼ (only if .agent-loop/ config exists in project)
.agent-loop/Dockerfile.sandbox (custom project layer)
OR
.agent-loop/dependencies (auto-generated project layer)
│
▼
agent-loop-sandbox-claude-myproject:v{hash} (final image used by sandbox)
Lookup flow in ensure_sandbox():
1. Build/cache base image (existing logic, unchanged)
2. Resolve project_dir = repo root (git rev-parse --show-toplevel)
3. Check {project_dir}/.agent-loop/Dockerfile.sandbox → if exists, use it
4. Else check {project_dir}/.agent-loop/dependencies → generate Dockerfile
5. Else → use base image directly (no project layer)
6. Content hash = SHA256(base_tag + project_file_content)[:8]
7. If image tag exists → reuse cached; else build with --build-arg BASE_IMAGE=<base_tag>
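Steps 3 to 5 of the lookup can be sketched as a small helper. This is a minimal sketch, not the actual ralph code: the name mirrors the find_project_config method planned in Step 4, and returning None when no config exists is an assumption about the return shape.

```python
import os


def find_project_config(project_dir):
    """Return (kind, path) for the project config, or None if there is none."""
    config_dir = os.path.join(project_dir, ".agent-loop")
    # Order matters: Dockerfile.sandbox takes precedence over dependencies.
    candidates = [
        ("dockerfile", "Dockerfile.sandbox"),
        ("dependencies", "dependencies"),
    ]
    for kind, name in candidates:
        path = os.path.join(config_dir, name)
        if os.path.isfile(path):
            return kind, path
    # Neither file exists: the caller falls back to the base image.
    return None
```

Keeping the candidates in an ordered list puts the precedence rule in one place.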
File locations:
<project-root>/
  .agent-loop/
    dependencies         # Option A: one apt package per line; # comments allowed
    Dockerfile.sandbox   # Option B: full Dockerfile (takes precedence)
1. Content Hash Length
Shorten the content hash from 12 to 8 characters throughout. Applies to both base and project image tags.
Current: agent-loop-sandbox-claude:v3a8f2b1c9d0e
New: agent-loop-sandbox-claude:v3a8f2b1c
2. Dependencies File Format
Plain text list of apt packages:
# .agent-loop/dependencies
# One package per line. Comments and blank lines are ignored.
openjdk-21-jdk
python3-venv
nodejs
Parsing rules:
- Strip # comments (including inline # comment after a package name)
- Strip leading/trailing whitespace
- Skip empty lines
- Remaining lines are package names passed to apt-get install -y --no-install-recommends
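The parsing rules above could be implemented roughly as follows. This is a minimal sketch written as a plain function; the spec plans it as a static method on Sandbox, and the real implementation may differ in detail.

```python
def parse_dependencies(content):
    """Parse a dependencies file into a list of apt package names."""
    packages = []
    for line in content.splitlines():
        # Drop everything from the first '#' onward, then trim whitespace.
        name = line.split("#", 1)[0].strip()
        if name:  # skips blank lines and comment-only lines
            packages.append(name)
    return packages
```

For example, parse_dependencies("nodejs  # runtime\n\n# tools\njq\n") returns ["nodejs", "jq"].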
3. Dockerfile.sandbox Format
A standard Dockerfile using ARG BASE_IMAGE:
ARG BASE_IMAGE
FROM ${BASE_IMAGE}
USER root
RUN apt-get update && apt-get install -y --no-install-recommends openjdk-21-jdk \
&& rm -rf /var/lib/apt/lists/*
USER agent
Ralph builds with: docker build --build-arg BASE_IMAGE=<base_tag> -t <project_tag> -f Dockerfile.sandbox <context_dir>
The build context is the .agent-loop/ directory, so files within it can be COPY'd into the image if needed.
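As an illustration, the build command above can be assembled as an argument list for subprocess.run. The helper name project_build_command is hypothetical. The sketch passes the full path to -f, on the assumption that the command should not depend on the current working directory (Docker resolves a relative -f path against the working directory, not the build context).

```python
import os


def project_build_command(base_tag, project_tag, project_dir):
    """Build the docker argv for a custom Dockerfile.sandbox (hypothetical helper)."""
    context_dir = os.path.join(project_dir, ".agent-loop")
    return [
        "docker", "build",
        "--build-arg", f"BASE_IMAGE={base_tag}",
        "-t", project_tag,
        # Full path so the command works from any working directory.
        "-f", os.path.join(context_dir, "Dockerfile.sandbox"),
        # Build context: files in .agent-loop/ are available to COPY.
        context_dir,
    ]
```

The resulting list can be passed directly to subprocess.run(cmd, check=True), which avoids shell quoting issues.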
4. Generated Dockerfile (from dependencies)
When dependencies exists but Dockerfile.sandbox does not, ralph generates a Dockerfile in memory:
ARG BASE_IMAGE
FROM ${BASE_IMAGE}
USER root
RUN apt-get update && apt-get install -y --no-install-recommends \
pkg1 pkg2 pkg3 \
&& rm -rf /var/lib/apt/lists/*
USER agent
The generated Dockerfile is written to a temp file only for the duration of docker build, then cleaned up.
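One way to produce the template above from a package list is shown below. This is a sketch under the assumption that the output should match the template line for line; the exact whitespace in ralph's output may differ.

```python
def generate_project_dockerfile(packages):
    """Return Dockerfile text that installs the given apt packages."""
    package_line = " ".join(packages)
    return (
        "ARG BASE_IMAGE\n"
        "FROM ${BASE_IMAGE}\n"
        "USER root\n"
        "RUN apt-get update && apt-get install -y --no-install-recommends \\\n"
        f"    {package_line} \\\n"
        "    && rm -rf /var/lib/apt/lists/*\n"
        "USER agent\n"
    )
```

Because the output is a pure function of the package list, the same dependencies file always yields the same Dockerfile text, which keeps the content hash stable.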
5. Project Image Tagging
Tag format: agent-loop-sandbox-{agent}-{project}:v{hash}
- {agent}: agent name (e.g., claude)
- {project}: os.path.basename(project_dir) (e.g., elasticsearch)
- {hash}: SHA256(base_tag + project_dockerfile_content)[:8]
The hash incorporates the base image tag (not just digest), so a base image rebuild cascades to project image rebuilds.
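The tag computation can be sketched with hashlib. This is a minimal sketch assuming the inputs are concatenated as UTF-8 text and the hex digest is truncated; the spec does not state the encoding.

```python
import hashlib


def project_image_tag(agent, project_name, base_tag, config_content):
    """Compute the content-addressed tag for a project image."""
    # Assumption: hash the UTF-8 bytes of base_tag followed by the config content.
    digest = hashlib.sha256((base_tag + config_content).encode("utf-8")).hexdigest()
    return f"agent-loop-sandbox-{agent}-{project_name}:v{digest[:8]}"
```

Docker repository names must be lowercase, so a project directory with uppercase letters would need normalizing before use; this sketch does not handle that.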
6. --rebuild Flag
--rebuild forces everything: re-pull upstream base, rebuild agent base image, rebuild project layer. No separate flag needed — content-addressed caching handles dependency file changes automatically.
7. Selftest Enhancement
When ralph selftest is run from a directory containing .agent-loop/dependencies or .agent-loop/Dockerfile.sandbox, add a check that verifies the project image builds successfully. Report as "build project image" with the project tag.
Implementation Plan
Each step follows this structure:
- Implement — Write the code
- Test — Write pytest tests
- Verify — Run tests, fix failures until all pass
- Review — Code review for bugs, edge cases, and conventions
- Address feedback — Fix review findings, re-run tests, re-review until clean
- Update spec — Mark the step [done] and record any decisions or deviations
Spec maintenance rules
- Mark each step [done] when complete.
- Record design decisions that emerged during implementation as notes under the step.
- Minor deviations (e.g. method name changes, reordered logic) should be noted and the spec updated to match.
- Significant design changes (e.g. new subcommands, changed architecture, removed features) require pausing for user review before proceeding.
Step 1: Shorten content hash to 8 characters [done]
Files:
scripts/ralph — Sandbox.content_hash() method
tests/test_ralph.py — TestSandboxContentHash, TestSandboxEnsureImage, and any other tests asserting tag format
Implement:
- In Sandbox.content_hash(), change [:12] to [:8]
Test:
- Update TestSandboxContentHash assertions for 8-char hashes
- Update any TestSandboxEnsureImage assertions that check tag length
- Search for any other tests asserting 12-char hashes and update them
Verify: Run pytest tests/test_ralph.py -v. Fix any failures and re-run until all pass.
Review: Ensure no hardcoded 12-char hash assumptions remain anywhere.
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 2: Add dependencies file parsing [done]
Files:
scripts/ralph — New static method Sandbox.parse_dependencies(content)
tests/test_ralph.py — New TestSandboxParseDependencies class
Implement:
- Add Sandbox.parse_dependencies(content) static method that:
- Splits content into lines
- Strips inline # comments (everything from # to end of line)
- Strips whitespace
- Skips empty lines
- Returns a list of package names
Test:
- Basic package list parsing
- Comment-only lines skipped
- Inline comments stripped (pkg # comment → pkg)
- Blank lines skipped
- Whitespace handling (leading/trailing)
- Empty content returns empty list
Verify: Run pytest tests/test_ralph.py::TestSandboxParseDependencies -v.
Review: Edge cases in comment parsing.
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 3: Add project Dockerfile generation [done]
Files:
scripts/ralph — New static method Sandbox.generate_project_dockerfile(packages)
tests/test_ralph.py — New TestSandboxGenerateProjectDockerfile class
Implement:
- Add Sandbox.generate_project_dockerfile(packages) static method that:
- Takes a list of package names
- Returns a Dockerfile string with ARG BASE_IMAGE, FROM ${BASE_IMAGE}, USER root, apt-get update && install, USER agent
- Joins packages with spaces in a single apt-get install line
Test:
- Single package generates correct Dockerfile
- Multiple packages joined correctly
- Output contains ARG BASE_IMAGE and FROM ${BASE_IMAGE}
- Output switches to USER root then back to USER agent
- Output includes --no-install-recommends and rm -rf /var/lib/apt/lists/*
Verify: Run pytest tests/test_ralph.py::TestSandboxGenerateProjectDockerfile -v.
Review: Dockerfile best practices (layer ordering, cleanup).
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 4: Add project image detection and building [done]
Files:
scripts/ralph — New methods: Sandbox.find_project_config(project_dir), Sandbox.project_image_tag(agent, project_name, base_tag, config_content), Sandbox.ensure_project_image(agent, base_tag, project_dir, force_rebuild)
tests/test_ralph.py — New test classes for each method
Implement:
Sandbox.find_project_config(project_dir): static method that checks for .agent-loop/Dockerfile.sandbox, then .agent-loop/dependencies. Returns a tuple (type, path) where type is "dockerfile" or "dependencies", or returns None if neither file exists.
Sandbox.project_image_tag(agent, project_name, base_tag, config_content) — static method that computes content hash from base_tag + config_content and returns agent-loop-sandbox-{agent}-{project_name}:v{hash}.
Sandbox.ensure_project_image(agent, base_tag, project_dir, force_rebuild=False):
- Calls find_project_config(project_dir); if None, returns base_tag
- Reads the config file content
- If type is "dependencies", parses packages and generates Dockerfile content
- If type is "dockerfile", reads the file content directly
- Computes project tag via project_image_tag()
- If tag exists locally and not force_rebuild, return cached tag
- Otherwise: for "dockerfile", build with .agent-loop/ as build context and -f Dockerfile.sandbox; for "dependencies", write generated Dockerfile to a temp file and build with temp dir as context
- Always pass --build-arg BASE_IMAGE=<base_tag>
- Return the project tag
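The cache decision in this flow can be sketched as below. The helper names image_exists and needs_build are hypothetical, and checking for a local tag with docker image inspect is an assumption; the spec does not say how ralph detects an existing tag. The run parameter exists only so tests can substitute a fake for subprocess.run.

```python
import subprocess


def image_exists(tag, run=subprocess.run):
    """Return True if `docker image inspect` finds the tag locally."""
    result = run(
        ["docker", "image", "inspect", tag],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # docker exits non-zero when the image is not present.
    return result.returncode == 0


def needs_build(tag, force_rebuild=False, run=subprocess.run):
    """Build when forced, or when the content-addressed tag is missing."""
    return force_rebuild or not image_exists(tag, run=run)
```

With force_rebuild set, the existence check is skipped entirely, which matches the --rebuild behavior described earlier.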
Test:
find_project_config: prefers Dockerfile.sandbox over dependencies, returns None when neither exists, returns correct type and path for each
project_image_tag: includes project name and agent in tag, hash is 8 chars, different content produces different hash, same content produces same hash
ensure_project_image: returns base tag when no project config, builds when tag missing, skips build when tag exists, force_rebuild forces build, passes --build-arg BASE_IMAGE correctly, uses .agent-loop/ as build context for Dockerfile.sandbox, uses -f Dockerfile.sandbox for custom Dockerfiles
Verify: Run pytest tests/test_ralph.py -v -k "project".
Review: Temp file cleanup, build context correctness, content hash determinism.
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 5: Wire project images into ensure_sandbox and process_issue [done]
Files:
scripts/ralph — Sandbox.ensure_sandbox(), process_issue(), main()
tests/test_ralph.py — Update TestSandboxEnsureSandbox, TestProcessIssueSandbox, TestMainSandboxFlags
Implement:
- Update Sandbox.ensure_sandbox() signature to accept project_dir=None and force_rebuild=False
- Inside ensure_sandbox(): after ensure_image(agent) returns base_tag, call ensure_project_image(agent, base_tag, project_dir, force_rebuild) to get final_tag. Use final_tag for sandbox creation.
- In process_issue(), resolve repo_root via git.output("rev-parse", "--show-toplevel") and pass it to ensure_sandbox() as project_dir. Pass force_rebuild through from caller.
- Remove the early sandbox.ensure_image(agent, force_rebuild=rebuild) call from main() (line 1721) since image building now happens inside ensure_sandbox() which has the project context.
- Pass force_rebuild=rebuild through process_issue() and poll_loop() to ensure_sandbox().
Test:
TestSandboxEnsureSandbox: verify ensure_project_image is called with correct args, final tag used for sandbox creation
TestProcessIssueSandbox: verify repo_root resolved and passed through
TestMainSandboxFlags: verify --rebuild still forces rebuild of both layers
Verify: Run pytest tests/test_ralph.py -v. Fix any failures.
Review: Ensure no code path bypasses project image resolution. Verify --rebuild cascades correctly through both layers.
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 6: Enhance selftest for project images [done]
Files:
scripts/ralph — selftest() function
tests/test_ralph.py — TestSelftest class
Implement:
- After the existing "build image" check in selftest(), detect if the current directory has .agent-loop/ config via Sandbox.find_project_config(os.getcwd())
- If project config exists, add a check that builds the project image: call ensure_project_image(agent, base_tag, os.getcwd())
- Report as "build project image" with the project tag on success
Test:
- selftest reports project image check when .agent-loop/dependencies exists in cwd
- selftest reports project image check when .agent-loop/Dockerfile.sandbox exists in cwd
- selftest skips project image check when no .agent-loop/ directory in cwd
- selftest reports failure when project image build fails
Verify: Run pytest tests/test_ralph.py::TestSelftest -v.
Review: Ensure selftest cleanup handles project images correctly.
Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.
Step 7: Run all checks [done]
Implement:
- Run the full test suite: pytest tests/test_ralph.py -v
- Run python3 -c "import py_compile; py_compile.compile('scripts/ralph', doraise=True)" to verify syntax
- Fix any failures
Verify: All checks pass clean.
Step 8: Create commit
Implement:
- Stage all changes and create a commit with a descriptive message summarizing the feature.
Verify: git log -1 shows the commit.
Conventions
- Language: Python 3 (stdlib only, no third-party dependencies)
- Tests: pytest with unittest.mock.patch, tmp_path fixture for filesystem isolation
- Error messages: Prefix with ralph:
- Exit codes: 0=success, 1=runtime error
- Imports: Use from conftest import import_script in tests to import the extensionless ralph script
- Docker: Content-addressed image tags, subprocess.run for all Docker CLI calls