246 changes: 242 additions & 4 deletions docs/mkdocs/en/code_executor.md
@@ -6,7 +6,7 @@ When this feature is enabled, if the LLM returns text containing code snippets,

## Code Executor Types

Two types of code executors are currently available:
Three types of code executors are currently available:

### UnsafeLocalCodeExecutor

@@ -34,6 +34,21 @@ Two types of code executors are currently available:
- Scenarios requiring execution of untrusted code
- Scenarios requiring environment isolation

### CubeCodeExecutor

**Features:**
- Agent dispatches code snippets to a remote Cube/E2B sandbox for execution; supports `Python/Bash`
- Strongly isolated sandbox running on a remote host, suitable for executing untrusted code at scale
- Decoupled lifecycle: the same sandbox can be re-attached across processes via `sandbox_id` (`create` / `attach` / `create_or_recreate` factories)
- Ships an optional `CubeWorkspaceRuntime` that adds per-execution workspace directories, file upload/download (single files or whole directories via tar), and structured program runs — useful for the Skill subsystem
- Requires the optional `[cube]` extra (`pip install 'trpc-agent-py[cube]'`, which installs `e2b-code-interpreter`) and access to a Cube/E2B-compatible gateway

**Use Cases:**
- Production environments where Docker is not available on the agent host
- Scenarios requiring strong remote isolation for untrusted code
- Long-lived skill/code execution that needs a persistent workspace surviving across multiple `execute_code` calls
- Multi-tenant agent platforms that share a remote sandbox fleet

## Usage Examples

When creating an LlmAgent, build a CodeExecutor and configure the `code_executor` parameter to enable code execution functionality.
@@ -48,15 +63,19 @@ from trpc_agent_sdk.models import OpenAIModel
from trpc_agent_sdk.code_executors import BaseCodeExecutor
from trpc_agent_sdk.code_executors import UnsafeLocalCodeExecutor
from trpc_agent_sdk.code_executors import ContainerCodeExecutor
# Cube is an optional extra (`pip install 'trpc-agent-py[cube]'`)
from trpc_agent_sdk.code_executors.cube import CubeCodeExecutor
from trpc_agent_sdk.code_executors.cube import CubeCodeExecutorConfig
from trpc_agent_sdk.log import logger

def _create_code_executor(code_executor_type: str = "unsafe_local") -> BaseCodeExecutor:
async def _create_code_executor(code_executor_type: str = "unsafe_local") -> BaseCodeExecutor:
    """Create a code executor.

    Args:
        code_executor_type: Type of code executor to use. Options:
            - "unsafe_local": Use UnsafeLocalCodeExecutor (default, no Docker required)
            - "container": Use ContainerCodeExecutor (requires Docker)
            - "cube": Use CubeCodeExecutor (requires the [cube] extra and a Cube/E2B gateway)
            - None: Auto-detect from environment variable CODE_EXECUTOR_TYPE,
              or default to "unsafe_local"

@@ -76,9 +95,18 @@ def _create_code_executor(code_executor_type: str = "unsafe_local") -> BaseCodeE
        executor = ContainerCodeExecutor(image="python:3-slim", error_retry_attempts=1)
        logger.info("ContainerCodeExecutor initialized successfully")
        return executor
    elif code_executor_type == "cube":
        # CubeCodeExecutor reads E2B_API_URL / E2B_API_KEY / CUBE_TEMPLATE_ID
        # from the environment when the corresponding cfg fields are unset.
        # `create()` opens a fresh remote sandbox; pass `sandbox_id=...` in
        # the cfg to attach to an existing one instead.
        cfg = CubeCodeExecutorConfig(execute_timeout=30.0, idle_timeout=600)
        executor = await CubeCodeExecutor.create(cfg)
        logger.info("CubeCodeExecutor initialized: sandbox_id=%s", executor.sandbox_id)
        return executor
    else:
        raise ValueError(f"Invalid code executor type: {code_executor_type}. "
                         "Valid options are: 'unsafe_local', 'container'")
                         "Valid options are: 'unsafe_local', 'container', 'cube'")

```

@@ -154,6 +182,68 @@ def create_agent() -> LlmAgent:
![ContainerCodeExecutor Execution Result](../assets/imgs/container0.png)
![ContainerCodeExecutor Execution Result 1](../assets/imgs/container1.png)

### Using CubeCodeExecutor

```python
# ...
async def create_agent() -> LlmAgent:
    """Create an agent backed by a remote Cube/E2B sandbox.

    Required environment (read by CubeCodeExecutorConfig.resolve_*):
        - E2B_API_URL: Cube/E2B-compatible gateway URL
        - E2B_API_KEY: API key for the gateway
        - CUBE_TEMPLATE_ID: Cube template id (e.g. `std-XXXXXXXX`)

    Note: `_create_code_executor` is async because `CubeCodeExecutor.create`
    opens the remote sandbox over the network. The executor owns the
    sandbox; call `await executor.destroy()` when the agent shuts down to
    free the remote resource. `executor.close()` only drops the local
    handle and lets the sandbox idle out on its own.
    """
    # Select cube
    executor = await _create_code_executor(code_executor_type="cube")
    agent = LlmAgent(
        name="code_assistant",
        description="Code execution assistant",
        model=_create_model(),  # You can change this to your preferred model
        instruction=INSTRUCTION,
        code_executor=executor,  # Enables code execution functionality
    )
    return agent

# Install the optional extra before use:
#   pip install 'trpc-agent-py[cube]'
# And export the gateway credentials:
#   export E2B_API_URL=...
#   export E2B_API_KEY=...
#   export CUBE_TEMPLATE_ID=...
```

#### Attaching to an existing sandbox

`CubeCodeExecutor` exposes three async factories so callers can choose the
lifecycle policy explicitly. All three read the bound sandbox id from
`cfg.sandbox_id` so it is the single source of truth:

```python
# 1. Strict create-or-attach: when cfg.sandbox_id is set, attach and assert
#    the sandbox is RUNNING; otherwise create a fresh one.
executor = await CubeCodeExecutor.create(cfg)

# 2. Attach-only: requires cfg.sandbox_id to be set; never creates fresh.
executor = await CubeCodeExecutor.attach(cfg)

# 3. Attach-or-recreate: invokes `on_recreate` when the sandbox is gone,
#    then transparently provisions a new one. Useful for long-lived agents
#    whose external locator state must be cleared on recreate.
executor = await CubeCodeExecutor.create_or_recreate(
    cfg, on_recreate=lambda old_id: clear_locator(old_id),
)
```

`close()` is a no-op for the remote sandbox (it just drops the local
handle); `destroy()` explicitly kills the remote sandbox.
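
One way to make sure the remote sandbox is always released is to wrap the executor in an async context manager. The sketch below is only an illustration under assumptions: `_StubExecutor` is a hypothetical stand-in that mimics nothing but the `destroy()` method described above, and `sandbox_session` is a helper invented for this example, not part of the SDK.

```python
import asyncio
from contextlib import asynccontextmanager


class _StubExecutor:
    """Hypothetical stand-in for an executor; only mimics `destroy()`."""

    def __init__(self) -> None:
        self.destroyed = False

    async def destroy(self) -> None:
        # A real executor would kill the remote sandbox here.
        self.destroyed = True


@asynccontextmanager
async def sandbox_session(executor):
    """Yield the executor and always destroy its sandbox on exit."""
    try:
        yield executor
    finally:
        await executor.destroy()


async def _demo() -> bool:
    stub = _StubExecutor()
    try:
        async with sandbox_session(stub):
            raise RuntimeError("agent crashed")  # simulate a failure mid-session
    except RuntimeError:
        pass
    return stub.destroyed


destroyed_after_crash = asyncio.run(_demo())
```

With a real executor, the same wrapper would presumably be used around the agent's lifetime so that a crash does not leave the sandbox waiting for its idle timeout.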

## Configuration Parameters

### UnsafeLocalCodeExecutor Parameters
@@ -208,6 +298,117 @@ code_executor = ContainerCodeExecutor(
)
```

### CubeCodeExecutor Parameters

`CubeCodeExecutor` is configured via two dataclasses, split following the
Interface Segregation Principle (ISP): `CubeCodeExecutorConfig` carries only
sandbox-lifecycle and command-execution settings, and
`CubeWorkspaceRuntimeConfig` carries only workspace settings (see the next
section).

```python
from trpc_agent_sdk.code_executors.cube import (
    CubeCodeExecutor,
    CubeCodeExecutorConfig,
)

cfg = CubeCodeExecutorConfig(
    # Cube template id for new sandboxes; falls back to env CUBE_TEMPLATE_ID.
    template=None,

    # E2B-compatible Cube API URL; falls back to env E2B_API_URL.
    api_url=None,

    # E2B API key; falls back to env E2B_API_KEY.
    api_key=None,

    # Existing remote sandbox id. When set, factories attach instead of
    # creating a fresh sandbox.
    sandbox_id=None,

    # Default per-command timeout in seconds (float). Shared by the bare
    # executor and the workspace runtime. Default: 60.0.
    execute_timeout=60.0,

    # Sandbox idle lifetime in seconds (int >= 1); renewed on every
    # command. Default: 3600 (1 hour). The underlying e2b API takes
    # integer seconds; sub-second values are rejected at construction.
    idle_timeout=3600,
)

executor = await CubeCodeExecutor.create(cfg)
```
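
The constraints on the two timeout fields can be pictured with a small validating dataclass. This is a minimal sketch, not the SDK's implementation: `SandboxTimeouts` is a hypothetical class, the exception types are assumptions, and the positivity check on `execute_timeout` is an extra assumption not stated in the docs.

```python
from dataclasses import dataclass


@dataclass
class SandboxTimeouts:
    """Hypothetical mirror of the two timeout fields; not the SDK class."""

    execute_timeout: float = 60.0  # per-command timeout, seconds
    idle_timeout: int = 3600       # sandbox idle lifetime, whole seconds

    def __post_init__(self) -> None:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(self.idle_timeout, bool) or not isinstance(self.idle_timeout, int):
            raise TypeError("idle_timeout must be an integer number of seconds")
        if self.idle_timeout < 1:
            raise ValueError("idle_timeout must be >= 1")
        # Assumption: a non-positive per-command timeout makes no sense.
        if self.execute_timeout <= 0:
            raise ValueError("execute_timeout must be positive")


ok = SandboxTimeouts(execute_timeout=30.0, idle_timeout=600)
```

Rejecting bad values at construction time, as the comments above describe, surfaces configuration mistakes before any network call is made.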

`CubeCodeExecutor` accepts the same `code_block_delimiters` as the other
executors; by default it adds a `bash` delimiter on top of the default
`python` and `tool_code` delimiters, so plain fenced `bash` blocks are
also picked up.

## CubeWorkspaceRuntime

For skill execution and other use cases that need a per-execution
workspace (input staging, structured program runs, output collection),
the Cube package additionally ships `CubeWorkspaceRuntime`. It composes
`CubeWorkspaceManager` (workspace directory lifecycle), `CubeWorkspaceFS`
(file/directory upload, download and glob-based collection), and
`CubeProgramRunner` (structured `cmd` + `args` execution) on top of the
same `CubeSandboxClient`.

```python
from trpc_agent_sdk.code_executors._types import (
    WorkspaceOutputSpec,
    WorkspacePutFileInfo,
    WorkspaceRunProgramSpec,
)
from trpc_agent_sdk.code_executors.cube import (
    CubeCodeExecutor,
    CubeCodeExecutorConfig,
    CubeWorkspaceRuntimeConfig,
    create_cube_workspace_runtime,
)

executor = await CubeCodeExecutor.create(CubeCodeExecutorConfig())

# `workspace_cfg` is optional. When omitted the runtime uses
# DEFAULT_REMOTE_WORKSPACE = "/workspace/cube_agent" as the root.
runtime = create_cube_workspace_runtime(
    executor,
    workspace_cfg=CubeWorkspaceRuntimeConfig(
        # Remote root under which the manager creates per-execution
        # `ws_<exec_id>_<suffix>` subtrees.
        remote_workspace="/workspace/cube_agent",
    ),
)

manager = runtime.manager()
fs = runtime.fs()
runner = runtime.runner()

ws = await manager.create_workspace("demo-1")  # /workspace/cube_agent/ws_demo-1_<ts>

await fs.put_files(ws, [
    WorkspacePutFileInfo(path="work/script.py",
                         content=b"print('script ran')\n"),
])

run_result = await runner.run_program(
    ws,
    WorkspaceRunProgramSpec(cmd="python3", args=["work/script.py"], timeout=15.0),
)
print(run_result.exit_code, run_result.stdout)

outputs = await fs.collect_outputs(
    ws, WorkspaceOutputSpec(globs=["work/*.py"], inline=True),
)
for ref in outputs.files:
    print(ref.name, len(ref.content))

await manager.cleanup("demo-1")
```

The runtime plugs straight into the Skill subsystem — pass it as
`workspace_runtime` when constructing a skill repository (see
[skill.md](skill.md) for details).

## Code Block Format

The Agent automatically identifies and executes code blocks in LLM responses. Supported code block formats:
@@ -245,6 +446,10 @@ After code execution, the results are returned to the LLM in the following forma
- Python (`python`, `py`, `python3`, empty string defaults to Python)
- Bash (`bash`, `sh`)

### CubeCodeExecutor
- Python (`python`, `py`, `python3`, empty string defaults to Python)
- Bash (`bash`, `sh`)
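
The alias lists above amount to a small lookup from fence tag to language. The following is a rough sketch of that mapping, not the framework's own code: `normalize_language` and `_LANGUAGE_ALIASES` are hypothetical names, and the case-insensitive, whitespace-tolerant matching is an assumption.

```python
from typing import Optional

# Fence tags taken from the lists above; an empty tag defaults to Python.
_LANGUAGE_ALIASES = {
    "": "python",
    "python": "python",
    "py": "python",
    "python3": "python",
    "bash": "bash",
    "sh": "bash",
}


def normalize_language(tag: str) -> Optional[str]:
    """Map a code-fence tag to a supported language, or None if unsupported."""
    # Assumption: tags are matched case-insensitively after trimming.
    return _LANGUAGE_ALIASES.get(tag.strip().lower())
```

A caller could skip execution, or report an error to the LLM, whenever the function returns `None`.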

## Workflow

1. **User Query** → Agent receives the user query
@@ -291,10 +496,42 @@ code_executor = UnsafeLocalCodeExecutor(timeout=30) # 30-second timeout
- Review the log output; the framework logs detailed error information
- For ContainerCodeExecutor, check the container logs

### 4. CubeCodeExecutor Cannot Connect / Authenticates as Wrong Tenant

**Problem:** `CubeCodeExecutor.create` raises with messages like
`Cube sandbox requires \`api_url\` or E2B_API_URL env`, `... api_key ...`,
or `... template ... CUBE_TEMPLATE_ID ...`.

**Solution:**
- Install the optional extra: `pip install 'trpc-agent-py[cube]'`
- Export the three required env vars (or pass them on
`CubeCodeExecutorConfig`): `E2B_API_URL`, `E2B_API_KEY`, `CUBE_TEMPLATE_ID`
- For multi-tenant deployments, prefer setting the cfg fields explicitly so
each agent instance uses its own credentials instead of falling back to
the process-wide environment
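
The "explicit field first, environment second" fallback described above can be sketched as follows. This is an illustrative helper under assumptions: `resolve_setting` is a hypothetical function, and its error message is not the SDK's actual wording.

```python
import os
from typing import Optional


def resolve_setting(explicit: Optional[str], env_name: str) -> str:
    """Return the explicit value, else the environment variable, else fail."""
    value = explicit or os.environ.get(env_name)
    if not value:
        raise ValueError(f"missing setting: pass it explicitly or set {env_name}")
    return value


# An explicit per-agent value wins over the process-wide environment.
os.environ["E2B_API_URL"] = "https://env.example"
from_env = resolve_setting(None, "E2B_API_URL")
from_cfg = resolve_setting("https://tenant-a.example", "E2B_API_URL")
```

In a multi-tenant process, always passing the explicit value avoids one tenant silently picking up another tenant's environment-level credentials.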

### 5. CubeCodeExecutor Sandbox Disappears Between Calls

**Problem:** A sandbox attached via `cfg.sandbox_id` raises
`SandboxNotFoundException` (the sandbox is gone) or `SandboxException`
(the sandbox is PAUSED) on the next command.

**Solution:**
- For long-lived agents, use `CubeCodeExecutor.create_or_recreate(cfg, on_recreate=...)`
so the executor transparently provisions a new sandbox and notifies the
caller to clear any external locator state
- Tune `idle_timeout` (default 3600s) upward if you legitimately need a
longer idle window between commands; every command renews the lease
- Use `CubeWorkspaceManager.cleanup(exec_id)` instead of `executor.destroy()`
if you only want to drop one workspace while keeping the sandbox alive
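
The attach-or-recreate policy can be illustrated with an in-memory model. Everything in this sketch is a hypothetical stand-in (`FakeFleet`, `SandboxGone`, `attach_or_recreate`, the `sbx-N` ids); it only models the control flow described above and does not call the SDK.

```python
import asyncio
from typing import Callable, Dict, Optional, Set, Tuple


class SandboxGone(Exception):
    """Hypothetical error for a sandbox that no longer exists."""


class FakeFleet:
    """In-memory stand-in for a remote sandbox service (not the SDK)."""

    def __init__(self) -> None:
        self.alive: Set[str] = set()
        self._counter = 0

    async def create(self) -> str:
        self._counter += 1
        sandbox_id = f"sbx-{self._counter}"
        self.alive.add(sandbox_id)
        return sandbox_id

    async def attach(self, sandbox_id: str) -> str:
        if sandbox_id not in self.alive:
            raise SandboxGone(sandbox_id)
        return sandbox_id


async def attach_or_recreate(
    fleet: FakeFleet,
    sandbox_id: Optional[str],
    on_recreate: Callable[[str], None],
) -> str:
    """Attach when possible; otherwise notify the caller and create anew."""
    if sandbox_id is None:
        return await fleet.create()
    try:
        return await fleet.attach(sandbox_id)
    except SandboxGone:
        on_recreate(sandbox_id)  # let the caller drop stale locator state
        return await fleet.create()


async def _demo() -> Tuple[str, Dict[str, str]]:
    fleet = FakeFleet()
    locator = {"session-42": "sbx-stale"}  # external id bookkeeping

    def clear_locator(old_id: str) -> None:
        for key in [k for k, v in locator.items() if v == old_id]:
            del locator[key]

    new_id = await attach_or_recreate(fleet, "sbx-stale", clear_locator)
    return new_id, locator


new_id, locator_after = asyncio.run(_demo())
```

The callback runs before the replacement sandbox is created, so stale ids are cleared even if provisioning the new sandbox later fails.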

## Complete Example

See the complete example code: [examples/code_executors/agent/agent.py](../../../examples/code_executors/agent/agent.py)

End-to-end Cube example (executor + workspace runtime):
[examples/code_executors/cube_demo.py](../../../examples/code_executors/cube_demo.py)

## Security Recommendations

1. **Production Environment**: It is strongly recommended to use `ContainerCodeExecutor` for sandbox isolation
@@ -307,4 +544,5 @@ See the complete example code: [examples/code_executors/agent/agent.py](../../..

- **UnsafeLocalCodeExecutor**: Fast execution speed, suitable for rapid iteration
- **ContainerCodeExecutor**: The initial startup requires pulling the image; subsequent executions are relatively fast
- It is recommended to use ContainerCodeExecutor in production environments and UnsafeLocalCodeExecutor in development environments
- **CubeCodeExecutor**: Adds network round-trips to a remote sandbox per command, but amortizes well for long-lived sessions because the sandbox is reused across calls (and across processes via `sandbox_id`); workspace file transfers use a tar-based protocol so directory uploads/downloads stay a single round-trip
- It is recommended to use ContainerCodeExecutor or CubeCodeExecutor in production environments and UnsafeLocalCodeExecutor in development environments
21 changes: 18 additions & 3 deletions docs/mkdocs/en/skill.md
@@ -102,10 +102,16 @@ from trpc_agent_sdk.skills import SkillToolSet
from trpc_agent_sdk.skills import create_default_skill_repository
from trpc_agent_sdk.code_executors import create_local_workspace_runtime
from trpc_agent_sdk.code_executors import create_container_workspace_runtime
# Cube is an optional extra (`pip install 'trpc-agent-py[cube]'`); import lazily.
# from trpc_agent_sdk.code_executors.cube import CubeCodeExecutor, CubeCodeExecutorConfig
# from trpc_agent_sdk.code_executors.cube import create_cube_workspace_runtime

# Create workspace runtime (local or container)
# Create workspace runtime (local, container, or cube)
workspace_runtime = create_local_workspace_runtime()
# Or use container: workspace_runtime = create_container_workspace_runtime()
# Or use a remote Cube/E2B sandbox:
# executor = await CubeCodeExecutor.create(CubeCodeExecutorConfig())
# workspace_runtime = create_cube_workspace_runtime(executor)

# Create skill repository
repository = create_default_skill_repository("./skills", workspace_runtime=workspace_runtime)
@@ -218,7 +224,7 @@ python3 run_agent.py
```

Example skill (excerpt):
[examples/skills/skills/python_math/SKILL.md](../../../examples/skills/skills/python_math/SKILL.md)
[examples/skills/skills/python-math/SKILL.md](../../../examples/skills/skills/python-math/SKILL.md)

Tips:
- Describe the task you want to accomplish; the model will decide whether a skill is needed based on the overview.
@@ -1073,11 +1079,20 @@ LLM calls skill_run(skill="python-math", command="python3 scripts/fib.py 10")
- Executes commands directly on the local system, suitable for development and testing
- **Container executor** (Docker): [trpc_agent_sdk/code_executors/container/_container_ws_runtime.py](../../../trpc_agent_sdk/code_executors/container/_container_ws_runtime.py)
- Executes in Docker containers, providing better isolation
- **Cube executor** (remote E2B sandbox): [trpc_agent_sdk/code_executors/cube/_runtime.py](../../../trpc_agent_sdk/code_executors/cube/_runtime.py)
- Executes inside a remote Cube/E2B sandbox; suitable for environments without local Docker, or when strong remote isolation is required
- Construct via `create_cube_workspace_runtime(executor, workspace_cfg=...)`; see [code_executor.md](code_executor.md#cubeworkspaceruntime) for details
- Requires the optional `[cube]` extra (`pip install 'trpc-agent-py[cube]'`) and the `E2B_API_URL` / `E2B_API_KEY` / `CUBE_TEMPLATE_ID` environment variables (or equivalent cfg fields)

**Container executor notes**:
- The run base directory is writable; when `$SKILLS_ROOT` is set, it is mounted in read-only mode
- Network access is disabled by default for reproducibility and security

**Cube executor notes**:
- File and directory transfers use a tar-based protocol so directory upload/download stays a single round-trip and preserves symlinks/permissions
- The remote workspace root defaults to `/workspace/cube_agent`; per-execution subtrees follow the `ws_<exec_id>_<suffix>` naming convention and are recreated lazily on every `create_workspace` call (so external sandbox cleanup heals transparently)
- The same Cube sandbox can back both the bare `CubeCodeExecutor` and the workspace runtime; commands share `execute_timeout` from `CubeCodeExecutorConfig`
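
To illustrate why a tar archive keeps a directory transfer to a single payload, here is a minimal sketch using only the standard library's `tarfile` module. It is not the Cube runtime's wire protocol; `pack_directory` and `unpack_archive` are hypothetical helpers, and the gzip compression is an assumption.

```python
import io
import os
import tarfile
import tempfile


def pack_directory(src_dir: str) -> bytes:
    """Pack a whole directory tree into one in-memory tar.gz payload."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        # arcname="." stores paths relative to the directory root.
        tar.add(src_dir, arcname=".")
    return buf.getvalue()


def unpack_archive(payload: bytes, dest_dir: str) -> None:
    """Unpack a payload produced by pack_directory into dest_dir."""
    with tarfile.open(fileobj=io.BytesIO(payload), mode="r:gz") as tar:
        tar.extractall(dest_dir)


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    os.makedirs(os.path.join(src, "work"))
    with open(os.path.join(src, "work", "script.py"), "w") as fh:
        fh.write("print('script ran')\n")

    payload = pack_directory(src)  # one blob, however many files
    unpack_archive(payload, dst)

    with open(os.path.join(dst, "work", "script.py")) as fh:
        restored = fh.read()
```

Because tar entries also record file modes and symlink targets, a single archive can carry that metadata along with the contents.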

**Security and resource limits**:
- **Workspace isolation**: All read/write operations are confined within the workspace
- **Risk control**: Reduces security risks through timeout mechanisms and read-only skill trees
@@ -2135,6 +2150,6 @@ The **Dynamic Tool Selection** mechanism has been fully implemented and verified
- Dynamic tool selection full example: [examples/skills_with_dynamic_tools/run_agent.py](../../../examples/skills_with_dynamic_tools/run_agent.py)
- Example structure guide: [examples/skills/README.md](../../../examples/skills/README.md)
- Example skills:
- [examples/skills/skills/python_math/SKILL.md](../../../examples/skills/skills/python_math/SKILL.md)
- [examples/skills/skills/python-math/SKILL.md](../../../examples/skills/skills/python-math/SKILL.md)
- [examples/skills/skills/file_tools/SKILL.md](../../../examples/skills/skills/file_tools/SKILL.md)
- [examples/skills/skills/user_file_ops/SKILL.md](../../../examples/skills/skills/user_file_ops/SKILL.md)