docs(integrations): add CrewAI integration guide#632
|
|
||
| ```bash | ||
| cubemastercli tpl create-from-image \ | ||
| --image cube-sandbox-cn.tencentcloudcr.com/cube-sandbox/sandbox-code:latest \ |
Missing international registry note. Readers outside China should use cube-sandbox-int.tencentcloudcr.com/... instead. Every other English doc in the repo (e.g., quickstart.md, bare-metal-deploy.md) adds a note like:
Use cube-sandbox-int.tencentcloudcr.com/cube-sandbox/sandbox-code:latest (recommended for international access). If you are in mainland China, use cube-sandbox-cn.tencentcloudcr.com/cube-sandbox/sandbox-code:latest instead.
Please add the same note here for consistency.
| 4. Configure CubeAPI and your LLM: | ||
|
|
||
| ```bash | ||
| export E2B_API_URL="http://<cube-api-host>:3000" |
http:// transmits the API key in plaintext. Using http:// (not HTTPS) means the E2B_API_KEY credential is sent unencrypted on every API call. Please add a security note that http:// is only acceptable for local development on a trusted machine. For production deployments, either configure TLS on CubeAPI and use https://, or use http://127.0.0.1:3000 (loopback) to limit network exposure.
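One way the demo could enforce this is a startup check that accepts https:// anywhere but plain http:// only for loopback hosts. This is a minimal sketch using the standard library; the helper names `is_loopback_host` and `check_api_url`, and the choice to raise rather than warn, are assumptions and not part of the PR.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_host(host: str) -> bool:
    """Return True for 'localhost' or any loopback IP literal."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name): treat as non-loopback.
        return False


def check_api_url(url: str) -> str:
    """Allow https:// anywhere and http:// only for loopback hosts."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.scheme == "https":
        return url
    if parts.scheme == "http" and is_loopback_host(host):
        return url
    raise ValueError(
        f"Refusing {url!r}: plain http:// would send E2B_API_KEY unencrypted. "
        "Use https:// or a loopback address."
    )
```

Calling `check_api_url(os.environ["E2B_API_URL"])` before creating any sandbox would fail fast on a misconfigured URL instead of leaking the key on the first request.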
| ), | ||
| timeout=30, | ||
| ) | ||
| print(result) |
Smoke test exits 0 even on failure. If cube_python.run() returns an error object instead of raising, this script prints it and exits code 0 — a CI run would pass even though Cube connectivity failed. Consider asserting on the expected output (e.g., assert "runtime" in str(result)) so this works as a real smoke test in automated environments.
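A rough sketch of how the script could turn an unexpected result into a non-zero exit code. The functions `check_smoke_result` and `main` are hypothetical, and `result` stands in for whatever `cube_python.run()` returns; the marker string follows the `"runtime"` example above.

```python
import sys


def check_smoke_result(result: object, expected: str = "runtime") -> None:
    """Raise if the expected marker is absent from the tool result."""
    text = str(result)
    if expected not in text:
        raise RuntimeError(f"Unexpected Cube smoke test result: {text}")


def main(result: object) -> int:
    """Return a process exit code: 0 on success, 1 on failure."""
    try:
        check_smoke_result(result)
    except RuntimeError as exc:
        # Report on stderr so CI logs separate failures from normal output.
        print(exc, file=sys.stderr)
        return 1
    print(result)
    return 0
```

The script would then end with `sys.exit(main(result))`, so a CI job fails whenever the sandbox output does not contain the marker.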
| tools=[cube_python], | ||
| llm=create_llm(), | ||
| verbose=True, | ||
| ) |
verbose=True may log the LLM API key. CrewAI's verbose mode can serialize the LLM configuration into stdout/logs, which includes api_key. Since this is a copy-paste reference example, consider making verbose opt-in via an environment variable (e.g., verbose=os.getenv("CREWAI_VERBOSE", "").lower() == "true"), or at minimum add a comment warning users to check their logs for credentials before sharing output.
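The opt-in could live in a small helper so that every agent and crew in the demo reads the same flag. A minimal sketch: `verbose_enabled` is a hypothetical name, and accepting `1`, `yes`, and `on` in addition to `true` is an assumption beyond the comment above.

```python
import os


def verbose_enabled(var: str = "CREWAI_VERBOSE") -> bool:
    """Return True only when the variable is set to a truthy string."""
    return os.getenv(var, "").strip().lower() in {"1", "true", "yes", "on"}
```

The agent would then be built with `verbose=verbose_enabled()`, which keeps the default quiet for anyone who copies the example unchanged.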
Document the E2B-compatible CrewAI integration in English and Chinese, with a runnable agent demo and Cube connectivity smoke test.
Closes TencentCloud#244
Autonomously-by: Codex:GPT-5
Signed-off-by: ruirui6946 <2733936092@qq.com>
b4afed0 to
1d9a27f
Compare
PR Review: CrewAI Integration Guide (#632)
Overview: Bilingual CrewAI guide and runnable demo.
High
Medium
Well Done
|
| ) | ||
| result_text = str(result) | ||
| if not all(fragment in result_text for fragment in ("runtime", "cube", "45")): | ||
| raise RuntimeError(f"Unexpected Cube smoke test result: {result_text}") |
False positive risk from substring matching (severity: high)
The assertion all(fragment in result_text for fragment in ("runtime", "cube", "45")) uses loose substring matching that can produce false positives. If the sandbox returns an error traceback that incidentally contains these words (e.g., a NameError mentioning "runtime" in the frame, a "cube" path component, and line "45"), the test would pass despite execution failure.
The e2b_code_interpreter SDK's Execution object has separate stdout, stderr, and error fields — flattening everything into str() loses this distinction.
Suggestion: Parse the output as JSON and validate specific key-value pairs:
import json
parsed = json.loads(result_text)
assert parsed.get("runtime") == "cube"
assert parsed.get("sum") == 45
| def create_llm() -> LLM: | ||
| """Create a CrewAI LLM from OpenAI or an OpenAI-compatible endpoint.""" | ||
| options: dict[str, Any] = { | ||
| "model": os.getenv("MODEL", "openai/gpt-4o-mini"), | ||
| "api_key": os.environ["OPENAI_API_KEY"], | ||
| } |
create_llm() has a fragile dependency on require_environment() execution order (severity: high)
Line 27 uses os.environ["OPENAI_API_KEY"] which raises an opaque KeyError if this function is called before require_environment() — or standalone in a test/refactor. The function doesn't validate the key's presence itself.
Suggestion: Either (a) accept api_key: str as a parameter, (b) use os.getenv() and raise a descriptive error if missing, or (c) document the ordering dependency in the docstring.
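Option (b) might look like the sketch below. It builds only the options dictionary, so it runs without CrewAI installed; `require_env` and `llm_options` are hypothetical names, and the wording of the error message is an assumption.

```python
import os


def require_env(name: str) -> str:
    """Return the variable's value or raise a descriptive error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set. Export it before running the demo.")
    return value


def llm_options() -> dict[str, str]:
    """Build the keyword arguments the demo passes to its LLM constructor."""
    return {
        "model": os.getenv("MODEL", "openai/gpt-4o-mini"),
        "api_key": require_env("OPENAI_API_KEY"),
    }
```

With this shape, `create_llm()` no longer depends on `require_environment()` having run first, and a missing key produces a message that names the variable instead of a bare `KeyError`.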
| template=os.environ["CUBE_TEMPLATE_ID"], | ||
| persistent=False, | ||
| sandbox_timeout=120, | ||
| ) | ||
|
|
sandbox_timeout has no effect when persistent=False (severity: medium)
In ephemeral mode (persistent=False), a fresh MicroVM is created and destroyed for each tool call, so the idle-timeout parameter sandbox_timeout=120 is never exercised. This is dead configuration that may mislead users who copy this pattern — when they later switch to persistent=True, they may not realize a short timeout was set.
Suggestion: Either omit sandbox_timeout when persistent=False, or add a code comment explaining it only applies to persistent sandboxes.
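One possible shape for the first option is to build the tool arguments conditionally. This is a sketch under assumptions: `tool_kwargs` is a hypothetical helper, and the keyword names `template`, `persistent`, and `sandbox_timeout` are taken from the diff above rather than verified against the tool's signature.

```python
from typing import Any


def tool_kwargs(template: str, persistent: bool, idle_timeout: int = 120) -> dict[str, Any]:
    """Build tool arguments, adding sandbox_timeout only for persistent sandboxes."""
    kwargs: dict[str, Any] = {"template": template, "persistent": persistent}
    if persistent:
        # Per the comment above, the idle timeout only matters when the
        # sandbox outlives a single tool call.
        kwargs["sandbox_timeout"] = idle_timeout
    return kwargs
```

Switching to `persistent=True` then makes the timeout an explicit, visible choice at the call site rather than a leftover value.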
| ) | ||
|
|
||
| analyst = Agent( | ||
| role="Sandboxed data analyst", |
No error handling around Crew.kickoff() (severity: medium)
Crew(...).kickoff() is not wrapped in any try/except block. Since this file serves as a reference integration that users will follow, adding error handling would help them debug LLM connectivity issues, API key problems, or sandbox timeouts.
Suggestion: Wrap with a try/except that catches common exceptions and prints a clear diagnostic message before re-raising.
| ### Mount host data | ||
|
|
||
| Host mounts are a Cube-specific extension encoded in sandbox metadata: | ||
|
|
||
| ```python | ||
| import json | ||
|
|
||
| mounts = json.dumps([ | ||
| { | ||
| "hostPath": "/srv/agent-input", | ||
| "mountPath": "/mnt/input", | ||
| "readOnly": True, | ||
| } | ||
| ]) | ||
|
|
||
| with Sandbox.create( | ||
| template=os.environ["CUBE_TEMPLATE_ID"], | ||
| metadata={"host-mount": mounts}, | ||
| ) as sandbox: | ||
| execution = sandbox.run_code( | ||
| "from pathlib import Path; print(list(Path('/mnt/input').iterdir()))" | ||
| ) | ||
| ``` | ||
|
|
||
| The host path must already exist on the Cubelet node. Prefer read-only mounts for agent inputs. |
Host mount documentation lacks security guidance on arbitrary host filesystem access (severity: medium)
The guide shows how to mount arbitrary host paths via hostPath but provides minimal security guidance. There is no warning that:
- hostPath values should be validated and allowlisted before being passed to Sandbox.create()
- Host mounts bypass the MicroVM isolation boundary for the mounted paths
- An attacker who controls sandbox metadata (e.g., through prompt injection against an agent) could specify arbitrary hostPath values like /etc/kubernetes/, /var/lib/kubelet/pki/, or /root/.ssh/
- Read-write mounts allow modifying host filesystem state from within a sandbox
Suggestion: Add a ::: warning admonition block similar to the TLS warning, covering allowlisting, read-only preference, and the isolation-bypass risk.
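Alongside the warning, the guide could show what allowlisting looks like in practice. The sketch below is illustrative only: `ALLOWED_HOST_ROOTS`, `validate_host_path`, and `build_mounts` are hypothetical names, and the check is purely lexical, so it does not resolve symlinks on the Cubelet node.

```python
import json
from pathlib import PurePosixPath

# Hypothetical allowlist: only these host directories (and their children) may be mounted.
ALLOWED_HOST_ROOTS = (PurePosixPath("/srv/agent-input"),)


def validate_host_path(host_path: str) -> str:
    """Return host_path if it is absolute, has no '..', and sits under an allowed root."""
    path = PurePosixPath(host_path)
    if not path.is_absolute() or ".." in path.parts:
        raise ValueError(f"Rejected host path: {host_path!r}")
    if not any(path == root or root in path.parents for root in ALLOWED_HOST_ROOTS):
        raise ValueError(f"Host path not allowlisted: {host_path!r}")
    return host_path


def build_mounts(host_path: str, mount_path: str) -> str:
    """Serialize a single read-only mount in the metadata format shown in the guide."""
    mount = {
        "hostPath": validate_host_path(host_path),
        "mountPath": mount_path,
        "readOnly": True,  # read-only by default, as the guide recommends
    }
    return json.dumps([mount])
```

The returned string would be passed as `metadata={"host-mount": ...}` exactly as in the guide's snippet, with the difference that a path derived from agent or user input never reaches `Sandbox.create()` unchecked.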
| ## References | ||
|
|
||
| - [CrewAI E2B Sandbox Tools](https://docs.crewai.com/en/tools/ai-ml/e2bsandboxtools) | ||
| - [CrewAI custom tools](https://docs.crewai.com/en/learn/create-custom-tools) | ||
| - [Cube Sandbox Python examples](https://github.com/TencentCloud/CubeSandbox/tree/master/examples/code-sandbox-quickstart) |
CrewAI reference URLs may need updating (severity: medium)
The URLs https://docs.crewai.com/en/tools/ai-ml/e2bsandboxtools and https://docs.crewai.com/en/learn/create-custom-tools use the /en/ language prefix. CrewAI's documentation site was restructured — the current live site likely serves these pages without the /en/ prefix.
Suggestion: Verify these URLs resolve correctly and update to the current path structure. Also applies to the Chinese guide at docs/zh/guide/integrations/crewai.md lines 263-264.
| ) | ||
| result_text = str(result) | ||
| if not all(fragment in result_text for fragment in ("runtime", "cube", "45")): | ||
| raise RuntimeError(f"Unexpected Cube smoke test result: {result_text}") |
False positive risk from substring matching (severity: high)
The assertion all(fragment in result_text for fragment in ("runtime", "cube", "45")) uses loose substring matching that can produce false positives. If the sandbox returns an error traceback or metadata that incidentally contains these three substrings (e.g., a NameError mentioning a frame variable "runtime", a path component "cube", and line number "45"), the test would pass despite execution failure.
The e2b_code_interpreter SDK's Execution object has separate fields for stdout, stderr, and error — by flattening everything into str() and doing substring matching, the test cannot distinguish between successful output and error output.
Suggestion: Parse the output as JSON and validate specific key-value pairs. This would also catch malformed output:
import json
parsed = json.loads(result_text)
assert parsed.get("runtime") == "cube"
assert parsed.get("sum") == 45
| def create_llm() -> LLM: | ||
| """Create a CrewAI LLM from OpenAI or an OpenAI-compatible endpoint.""" | ||
| options: dict[str, Any] = { | ||
| "model": os.getenv("MODEL", "openai/gpt-4o-mini"), | ||
| "api_key": os.environ["OPENAI_API_KEY"], | ||
| } |
create_llm() has a fragile dependency on require_environment() execution order (severity: high)
Line 27 uses os.environ["OPENAI_API_KEY"] which raises an opaque KeyError if this function is called before require_environment() — or standalone in a test/refactor. The function doesn't validate the key's presence itself.
Suggestion: Either (a) accept api_key: str as a parameter, (b) use os.getenv("OPENAI_API_KEY") and raise a descriptive error if missing, or (c) document the ordering dependency in the docstring.
| template=os.environ["CUBE_TEMPLATE_ID"], | ||
| persistent=False, | ||
| sandbox_timeout=120, | ||
| ) | ||
|
|
sandbox_timeout has no effect when persistent=False (severity: medium)
In ephemeral mode (persistent=False), a fresh MicroVM is created and destroyed for each tool call, so the idle-timeout parameter (sandbox_timeout=120) is never exercised. This is dead configuration that may mislead users who copy this pattern — when they later switch to persistent=True, they may not realize a short timeout was set.
Suggestion: Either omit sandbox_timeout when persistent=False, or add a code comment explaining it only applies to persistent sandboxes. This also applies in smoke_test.py and the guide's code snippets.
| ) | ||
|
|
||
| analyst = Agent( | ||
| role="Sandboxed data analyst", |
No error handling around Crew.kickoff() (severity: medium)
Crew(...).kickoff() on lines 43-44 is not wrapped in any try/except block. Since this file serves as a reference integration, adding error handling would help users debug LLM connectivity issues, API key problems, or sandbox timeouts.
Suggestion: Wrap with a try/except that catches common exceptions and prints a clear diagnostic message. E.g.:
try:
    result = Crew(...).kickoff()
except Exception as exc:
    raise RuntimeError(f"Crew execution failed: {exc}") from exc
| ### Mount host data | ||
|
|
||
| Host mounts are a Cube-specific extension encoded in sandbox metadata: | ||
|
|
||
| ```python | ||
| import json | ||
|
|
||
| mounts = json.dumps([ | ||
| { | ||
| "hostPath": "/srv/agent-input", | ||
| "mountPath": "/mnt/input", | ||
| "readOnly": True, | ||
| } | ||
| ]) | ||
|
|
||
| with Sandbox.create( | ||
| template=os.environ["CUBE_TEMPLATE_ID"], | ||
| metadata={"host-mount": mounts}, | ||
| ) as sandbox: | ||
| execution = sandbox.run_code( | ||
| "from pathlib import Path; print(list(Path('/mnt/input').iterdir()))" | ||
| ) | ||
| ``` | ||
|
|
||
| The host path must already exist on the Cubelet node. Prefer read-only mounts for agent inputs. |
Host mount documentation lacks security guidance on arbitrary host filesystem access (severity: medium)
The guide shows how to mount arbitrary host paths but provides minimal security guidance. There is no warning that:
- hostPath values should be validated and allowlisted before being passed to Sandbox.create()
- Host mounts bypass the MicroVM isolation boundary for the mounted paths
- An attacker who controls sandbox metadata (e.g., through prompt injection against an agent that creates sandboxes) could specify arbitrary hostPath values like /etc/kubernetes/, /var/lib/kubelet/pki/, or /root/.ssh/
- Read-write mounts allow modifying host filesystem state from within a sandbox
Given the CrewAI context where an LLM agent might construct sandbox parameters based on user prompts, this is a real attack surface.
Suggestion: Add a warning block similar to the TLS warning above, covering allowlisting, read-only preference, and the isolation-bypass risk.
| ## References | ||
|
|
||
| - [CrewAI E2B Sandbox Tools](https://docs.crewai.com/en/tools/ai-ml/e2bsandboxtools) | ||
| - [CrewAI custom tools](https://docs.crewai.com/en/learn/create-custom-tools) | ||
| - [Cube Sandbox Python examples](https://github.com/TencentCloud/CubeSandbox/tree/master/examples/code-sandbox-quickstart) |
CrewAI reference URLs may need updating (severity: medium)
The reference URLs https://docs.crewai.com/en/tools/ai-ml/e2bsandboxtools and https://docs.crewai.com/en/learn/create-custom-tools use the /en/ language prefix. CrewAI's documentation site was restructured, and the current live site likely serves these pages without the /en/ prefix (e.g., https://docs.crewai.com/tools/e2bsandboxtools).
Also applies to docs/zh/guide/integrations/crewai.md at lines 263-264.
Suggestion: Verify these URLs resolve correctly and update to the current path structure.
Summary
- examples/crewai-integration demo with CrewAI agent wiring and a Cube connectivity smoke test.

Validation
- python -m compileall examples\crewai-integration
- D:\agent\.venv-crewai\Scripts\python.exe -m pip check
- E2BPythonTool sandbox construction/run/cleanup through examples/crewai-integration/smoke_test.py
- npm run docs:build
- git diff --check

Refs #244
Autonomously-by: Codex:GPT-5