Feat/webCaseCenter by liciazhu · Pull Request #615 · TencentCloud/CubeSandbox

liciazhu · 2026-06-22T09:03:56Z

add cube web case center and mod web sandbox detail

cubesandboxbot · 2026-06-25T02:04:47Z

+            .env("CUBE_API_ENVD_AUTH", &self.envd_auth)
+            .current_dir(&work_dir);
+
+        if std::env::var("CUBE_API_KEY").is_err() {


Security: Hardcoded default CUBE_API_KEY

When CUBE_API_KEY is not set in the parent process environment, the example subprocess receives a well-known default key (cube_0000000000...). Any user who triggers an example run can extract this key from the process environment and use it to authenticate arbitrary API calls.

Either remove the fallback entirely (scripts should handle a missing key gracefully) or use an obviously non-functional sentinel (e.g., empty string) that scripts can detect and refuse to use.

cubesandboxbot · 2026-06-25T02:04:48Z

+        }
+
+        // 3. Any healthy/ready template
+        if let Ok(tpls) = templates.list_templates().await {


Performance: Template validation iterates healthy candidates twice + makes 2N gRPC calls

Line 448 chains tpls.iter() (the full set) after the filtered subset, so every template is visited at least once and healthy/ready templates are validated twice. The inner templates.get_template() call is a gRPC round-trip to CubeMaster — in the worst case this makes 2N sequential gRPC calls.

The fix: replace .chain(tpls.iter()) with .chain(tpls.iter().filter(|t| t.status != "healthy" && t.status != "ready")), or better, inline the status check from list_templates and skip get_template entirely for templates already confirmed as healthy/ready.

cubesandboxbot · 2026-06-25T02:04:49Z

+        }
+    };
+
+    let fingerprint = {


DefaultHasher uses random seeds per-process — cache misses after every restart

DefaultHasher::new() uses RandomState with per-process seeds, so the same requirements.txt content produces a different hash after every server restart. The fingerprint stamp file written by a previous process lifetime will never match, causing pip3 install to rerun unconditionally on the first example run after each restart.

Replace with a deterministic hash (e.g., SipHasher::new_with_keys(0, 0)) or simply compare the file content directly for these trivially small requirements files.

cubesandboxbot · 2026-06-25T02:04:51Z

+            .map_err(anyhow::Error::from)
+    }
+
+    #[allow(dead_code)]


Dead code with explicit #[allow(dead_code)] — will drift from schema

This method is never called (explicitly acknowledged by the attribute). DB access methods that go unused will silently drift as columns are renamed/added/removed, then fail when someone finally calls them. Either wire it into a real caller or remove it.

cubesandboxbot · 2026-06-25T02:04:54Z

+fn default_sandbox_proxy_url() -> String {
+    std::env::var("AGENTHUB_SANDBOX_PROXY_URL").unwrap_or_else(|_| "http://127.0.0.1".to_string())
+}
+fn default_envd_auth() -> String {


Security: Default envd_auth is a well-known base64 value

The default Basic cm9vdDo= decodes to root: with an empty password — the envd built-in default. This shared credential is sent to every sandbox's Jupyter/envd endpoint. Anyone who learns it (including by running code in their own sandbox and observing request headers) can access the in-sandbox services of other sandboxes. Use per-sandbox credentials (the sandbox detail already includes an envd_access_token field) instead of a cluster-wide shared secret.

cubesandboxbot · 2026-06-25T02:04:55Z

-    // Pull all images in parallel, then inspect each.
-    let handles: Vec<_> = STORE_IMAGES
+/// Pulls all store images from the registry then returns the updated metadata.
+pub async fn refresh_store_meta(State(state): State<AppState>) -> impl IntoResponse {


Security & performance: Concurrent Docker pulls with no rate limiting

This spawns one docker pull --quiet per catalog image concurrently via spawn_blocking. With 6+ images (some multi-GB), this can saturate Docker daemon capacity and network bandwidth. Also, POST /store/refresh should use with_auth_and_rate_limit in the router (like the node isolation endpoint) rather than plain with_auth. Consider adding a semaphore to cap concurrent pulls (e.g., 2-3 at a time).

cubesandboxbot · 2026-06-25T02:06:19Z

+            .env("CUBE_API_ENVD_AUTH", &self.envd_auth)
+            .current_dir(&work_dir);
+
+        if std::env::var("CUBE_API_KEY").is_err() {


Security: Hardcoded default CUBE_API_KEY

When CUBE_API_KEY is not set in the parent process environment, the example subprocess receives a well-known default key (cube_0000000000...). Any user who triggers an example run can extract this key from the process environment and use it to authenticate arbitrary API calls.

Either remove the fallback entirely (scripts should handle a missing key gracefully) or use an obviously non-functional sentinel (e.g., empty string) that scripts can detect and refuse to use.

cubesandboxbot · 2026-06-25T02:06:20Z

+        }
+
+        // 3. Any healthy/ready template
+        if let Ok(tpls) = templates.list_templates().await {


Performance: Template validation iterates healthy candidates twice + makes 2N gRPC calls

Line 448 chains tpls.iter() (the full set) after the filtered subset, so every template is visited at least once and healthy/ready templates are validated twice. The inner templates.get_template() call is a gRPC round-trip to CubeMaster — in the worst case this makes 2N sequential gRPC calls.

The fix: replace .chain(tpls.iter()) with .chain(tpls.iter().filter(|t| t.status != "healthy" && t.status != "ready")), or better, inline the status check from list_templates and skip get_template entirely for templates already confirmed as healthy/ready.

cubesandboxbot · 2026-06-25T02:06:20Z

+        }
+    };
+
+    let fingerprint = {


DefaultHasher uses random seeds per-process → cache misses after every restart

DefaultHasher::new() uses RandomState with per-process seeds, so the same requirements.txt content produces a different hash after every server restart. The fingerprint stamp file written by a previous process lifetime will never match, causing pip3 install to rerun unconditionally on the first example run after each restart.

Replace with a deterministic hash (e.g., SipHasher::new_with_keys(0, 0)) or simply compare the file content directly for these trivially small requirements files.

fslongjin · 2026-06-25T06:35:55Z

+/// calls return the same `&'static` slice, so the front-end gets a stable
+/// ordering without the borrow checker complaining about temporaries.
+pub fn scenario_registry() -> &'static [ScenarioSpec] {
+    Box::leak(Box::new(vec![


scenario_registry() currently allocates and leaks a new Vec<ScenarioSpec> on every call, even though the comment says the registry is materialized once. Endpoints such as list, source, and run can call this repeatedly, so a long-lived API process will grow memory over time. Please store the vector behind OnceLock/LazyLock and return the initialized slice, ideally with a regression test that repeated calls reuse the same registry.

fslongjin · 2026-06-25T06:35:56Z

+        let start = Instant::now();
+
+        let output = match req.language.as_str() {
+            "python" => self.exec_python(sandbox_id, domain, &req.code).await?,


timeout_secs is only used after the Jupyter/envd request has already returned; the service still awaits exec_python/exec_bash without a real deadline. A hung or slow sandbox request can keep the handler occupied, and larger user timeouts will still be cut off by the outer 30s route timeout. Please wrap the actual execution future in tokio::time::timeout and align the route timeout with the allowed request budget.

fslongjin · 2026-06-25T06:35:57Z

+
+        let start = Instant::now();
+        let max_secs = sc.timeout_secs.unwrap_or(120);
+        let run_result = timeout(Duration::from_secs(max_secs), cmd.output()).await;


Timing out cmd.output() drops the future but does not explicitly kill the spawned process or its process tree, so long-running Python/Go/Chromium children can continue after the API has returned. This path also installs requirements.txt during a request, mutating the server's Python environment and using a source-tree stamp file without concurrency control. Please spawn and kill/wait the child on timeout, and move dependency installation to build/deploy time or an isolated, locked cache that fails closed.

fslongjin · 2026-06-25T06:35:59Z

+fn build_examples_run_routes(state: &AppState, auth_configured: bool) -> Router<AppState> {
+    let routes = Router::new().route("/examples/run", post(examples::run_example));
+    if auth_configured {
+        routes.layer(middleware::from_fn_with_state(state.clone(), unified_auth))


POST /examples/run is moved to a long-timeout router with auth only, but it no longer gets the rate limiting used by other resource-creating or execution endpoints. Because this route starts local processes and can create sandboxes or browser sessions for up to several minutes, missing rate limits increases abuse and resource-exhaustion risk. Please apply the existing auth+rate-limit layer here, or add a dedicated concurrency/rate budget for example runs.

fslongjin · 2026-06-25T06:36:00Z

+// `hidden: true` filter so even an attacker who knows the IDs cannot reach
+// these scenarios through the HTTP API.
+
+export const EXAMPLE_SCENARIOS: ExampleScenario[] = [


This adds a second hand-maintained example registry in the frontend while the backend registry also defines scenario IDs, files, categories, store items, visibility, and run metadata. The two sources can drift, producing UI entries that cannot run or backend examples with stale labels/topology in the UI. Please make the backend /examples response include the UI metadata, or generate both Rust and TS registries from a shared data source.

fslongjin · 2026-06-25T06:36:02Z

+pub async fn exec_code(
+    State(state): State<AppState>,
+    Path(sandbox_id): Path<String>,
+    Json(body): Json<ExecCodeRequest>,


ExecCodeRequest derives Validate, but this handler only extracts Json(body) and never calls body.validate() before executing the request. As a result, the documented code length constraint is not enforced and the API can accept payloads larger than the model contract suggests. Please validate here and map failures to a 400 response, or use a shared validated-JSON extractor.

fslongjin · 2026-06-25T06:36:03Z

This PR adds the compiled examples/cube-bench/cube-bench binary alongside the Go source. Committing the build artifact substantially increases PR size, is platform/architecture-specific, and makes supply-chain review noisier. Please remove the binary and add an ignore rule for this executable or for example build outputs.

fslongjin · 2026-06-25T06:36:04Z

+                    exit_code,
+                    success,
+                    elapsed_ms,
+                    steps: Vec::new(),


The run response now exposes steps, and the frontend adds a StepTimeline, but the service always returns steps: Vec::new() and the page does not render the timeline from backend data. This leaves a partially implemented API/UI surface that users cannot observe and future maintainers may assume is complete. Please either generate and render real step logs, or remove the unused field/component/comment until the feature is wired through.

fslongjin · 2026-06-25T06:36:06Z

 """

 with Sandbox.create(template=template_id) as sandbox:
+


This added blank line contains trailing whitespace, so git diff --check still fails for the PR. Please remove the stray spaces (and any other diff-check formatting issues) to avoid CI or review noise.

fslongjin · 2026-06-25T06:36:33Z

+    </div>
+  );
+}
+


git diff --check also reports this as a new blank line at EOF. Please remove the extra trailing blank line so the PR is clean for whitespace checks.

Signed-off-by: liciazhu <liciazhu@tencent.com>

…SandboxCases Signed-off-by: liciazhu <liciazhu@tencent.com>

Signed-off-by: liciazhu <liciazhu@tencent.com>

… prefix Signed-off-by: liciazhu <liciazhu@tencent.com>

…put, centralise credentials Signed-off-by: liciazhu <liciazhu@tencent.com>

…stry Signed-off-by: liciazhu <liciazhu@tencent.com>

Signed-off-by: liciazhu <liciazhu@tencent.com>

…licy Signed-off-by: liciazhu <liciazhu@tencent.com>

POST /store/refresh previously spawned an unbounded number of concurrent blocking tasks — one docker-pull per catalog entry — which could saturate Docker daemon and network bandwidth when the catalog is large. Introduce a tokio Semaphore (MAX_CONCURRENT_PULLS = 3) so that at most 3 docker pulls run simultaneously. Each catalog image acquires a permit before entering the spawn_blocking call and releases it automatically on drop. Assisted-by: CodeBuddy:claude-sonnet-4-6 Signed-off-by: liciazhu <liciazhu@tencent.com>

…ement Improvement in scenario_registry();Consistent Auth + Rate Limiting on /examples/run;Remove Committed Binary from Repository;Whitespace Formatting Fixes Signed-off-by: liciazhu <liciazhu@tencent.com>

…xec-code Signed-off-by: liciazhu <liciazhu@tencent.com>

Signed-off-by: liciazhu <liciazhu@tencent.com>

cubesandboxbot · 2026-06-25T09:45:35Z

+            .env("CUBE_TEMPLATE_ID", &template_id)
+            .env("SSL_CERT_FILE", ssl_cert)
+            .env("AGENTHUB_SANDBOX_PROXY_URL", &self.sandbox_proxy_url)
+            .env("CUBE_API_ENVD_AUTH", &self.envd_auth)


Security: envd_auth credential leaked into example subprocess environment

The full envd_auth header value (credential for the sandbox proxy) is injected as the CUBE_API_ENVD_AUTH environment variable into every example subprocess. Example scripts (including user-edited code in req.code) can read os.environ["CUBE_API_ENVD_AUTH"] and authenticate as root to the sandbox proxy, bypassing API-level authorization.

Impact: Escalation from "can run an example" to "can execute arbitrary code inside any sandbox."

Suggestion: Do not pass envd_auth to subprocesses. If example scripts need proxy access, create a scoped time-limited token instead.

cubesandboxbot · 2026-06-25T09:45:36Z

+                if let Ok(tpls) = templates.list_templates().await {
+                    let matched = tpls.iter().find(|t| {
+                        (t.status == "healthy" || t.status == "ready")
+                            && t.image_info.as_deref() == Some(image_ref.as_str())
+                    });
+                    if let Some(t) = matched {
+                        tracing::info!(
+                            store_item_id = %sid,
+                            image = %image_ref,
+                            template_id = %t.template_id,
+                            "matched template via store_item_id"
+                        );
+                        return Ok(t.template_id.clone());
+                    }
+                }
+            }
+        }
+
+        // 3. Any healthy/ready template
+        if let Ok(tpls) = templates.list_templates().await {


Performance: N+1 HTTP cascade in template resolution

resolve_template_id performs up to ~15 sequential HTTP round-trips to CubeMaster per example run: up to 3 individual get_template calls for candidates (line ~405), then a list_templates call (line ~429), then another list_templates call (line ~448, duplicating the result from line ~429), then iterates all returned templates calling get_template individually for each.

Impact: Latency floor of O(N) sequential network hops to CubeMaster on every example run.

Suggestion:

Call list_templates once and reuse the result

Check the list's own status field rather than re-verifying each candidate with get_template (the list already returns status)

Use tokio::try_join_all for candidate probing in the first step

cubesandboxbot · 2026-06-25T09:45:37Z

+    std::env::var("CUBE_API_DEFAULT_KEY")
+        .ok()
+        .filter(|s| !s.is_empty())
+        .or_else(|| Some("cube_0000000000000000000000000000000000000000".to_string()))


Security: Hardcoded fallback API key

When CUBE_API_DEFAULT_KEY is not set, this fallback value cube_0000000000000000000000000000000000000000 is used as an API key and injected into every example subprocess. This is a known static key shared across all deployments that don't explicitly configure it — effectively a backdoor credential.

Suggestion: Remove the hardcoded fallback and require explicit configuration. Log a loud warning at startup if the env var is unset.

cubesandboxbot · 2026-06-25T09:45:38Z

+        className="jupyter-svg-output max-h-[300px] overflow-auto rounded-md bg-white p-2 ring-1 ring-border/60"
+        dangerouslySetInnerHTML={{
+          __html: DOMPurify.sanitize(result.svg, {
+            USE_PROFILES: { svg: true, svgFilters: true },


Security: SVG XSS via DOMPurify's SVG profile

Jupyter result.svg payloads are sanitized with DOMPurify.sanitize(result.svg, { USE_PROFILES: { svg: true, svgFilters: true } }). The SVG profile allows <script>, event handlers (onload, etc.), and <use> with external references. An SVG like <svg><script>fetch(document.cookie)</script></svg> would execute in the WebUI origin.

Impact: Stored XSS — any user who can produce a Jupyter SVG output can run JS in administrators' browsers.

Suggestion: Either strip SVG output entirely (render as text), or use a stricter sanitizer that removes scripts, event handlers, and external references from SVGs.

cubesandboxbot · 2026-06-25T09:45:39Z

+  const out: Node<TopologyNodeData>[] = [];
+  for (let col = 0; col < layers.length; col++) {
+    const layer = layers[col];
+    const controlInLayer = layer.filter((id) => nodes.find((n) => n.id === id)?.plane === 'control');


Performance: O(N²) scan in topology layout

nodes.find((n) => n.id === id) is called in the innermost loop of layoutNodes, scanning the entire nodes array for each ID in each layer. For a topology with ~20 nodes this is fine, but the pattern repeats 3× per layer (control filter, data filter, node lookup).

Suggestion: Build a Map<string, LocalizedNode> outside the loop and use .get(id) — reduces from O(N²) to O(N) and is a one-line change.

cubesandboxbot · 2026-06-25T09:45:40Z

+            if program == "go" {
+                let dir_name = format!(".tmp_run_{}", Uuid::new_v4());
+                let dir = base_dir.join(&dir_name);
+                std::fs::create_dir_all(&dir).map_err(|e| {


Performance: Blocking std::fs calls on async runtime

Filesystem operations (create_dir_all, write, read_to_string, remove_dir_all, remove_file, copy) run directly on the tokio async thread. Under high concurrency, each blocks a thread that could otherwise process requests.

Suggestion: Use tokio::fs::* equivalents or wrap in tokio::task::spawn_blocking.

cubesandboxbot · 2026-06-25T09:45:41Z

+/// - `stderr` → `{ "type": "stderr", "text": "..." }`
+/// - `result` → `{ "type": "result", "text": "...", ... }`
+/// - `error`  → `{ "type": "error", "name": "...", "value": "...", "traceback": [...] }`
+fn parse_jupyter_ndjson(bytes: &[u8]) -> AppResult<JupyterOutput> {


Test Coverage: Untested pure parsing functions

This file contains several pure, deterministic parsing functions — parse_jupyter_ndjson (line 295), parse_connect_stream (line 374), build_response (line 264), connect_envelope (line 366), decode_b64_lossy (line 427), and parse_exit_status (line 434) — with zero test coverage. These parse raw binary streams from envd/Jupyter; malformed input would crash the handler or produce wrong results.

The rest of the crate has an established #[cfg(test)] pattern (12+ files use it). These parsing functions are ideal candidates for unit tests as they require no network, subprocess, or database.

liciazhu requested review from chenhengqi, fslongjin, ls-ggg, tinklone and up2wing as code owners June 22, 2026 09:03

liciazhu force-pushed the feat/webCaseCenterV5 branch 3 times, most recently from b9cb05d to 321e017 Compare June 25, 2026 01:52

cubesandboxbot Bot reviewed Jun 25, 2026

View reviewed changes

fslongjin reviewed Jun 25, 2026

View reviewed changes

liciazhu force-pushed the feat/webCaseCenterV5 branch from 6cfb21c to 488bea9 Compare June 25, 2026 07:33

liciazhu added 2 commits June 25, 2026 17:11

add example center v0.1

24d14e1

Signed-off-by: liciazhu <liciazhu@tencent.com>

mod example-center copy func

8a3bef6

Signed-off-by: liciazhu <liciazhu@tencent.com>

liciazhu added 21 commits June 25, 2026 17:11

add storage for settings in demo case

704fcbb

Signed-off-by: liciazhu <liciazhu@tencent.com>

change place for sandboxdetail terminal

40fafb7

Signed-off-by: liciazhu <liciazhu@tencent.com>

mod sandboxdetail style

1c688df

Signed-off-by: liciazhu <liciazhu@tencent.com>

mod web dashboard style

1ed0e99

Signed-off-by: liciazhu <liciazhu@tencent.com>

chore: apply code formatting

84b6dee

Signed-off-by: liciazhu <liciazhu@tencent.com>

add download for sandboxdetail

6cfb029

Signed-off-by: liciazhu <liciazhu@tencent.com>

mod menu style

ba12bd1

Signed-off-by: liciazhu <liciazhu@tencent.com>

add sandboxdetail event download with txt and csv

91f132c

Signed-off-by: liciazhu <liciazhu@tencent.com>

fix(web): resolve tsc errors and remove dead code in TemplateStore & …

6479c85

…SandboxCases Signed-off-by: liciazhu <liciazhu@tencent.com>

feat(templatestore):rely on cubemaster migration 0010

2b94519

Signed-off-by: liciazhu <liciazhu@tencent.com>

chore: apply cargo fmt formatting

8d5b108

Signed-off-by: liciazhu <liciazhu@tencent.com>

mod(templatestore)add agent template store sql

80a7987

Signed-off-by: liciazhu <liciazhu@tencent.com>

refactor(cubeapi): move exec-code endpoint from e2b surface to /cube/…

49bfaf0

… prefix Signed-off-by: liciazhu <liciazhu@tencent.com>

refactor(CubeAPI): extract exec-code service layer, sanitize rich out…

f561113

…put, centralise credentials Signed-off-by: liciazhu <liciazhu@tencent.com>

refactor(cubeapi): extract ExampleService and modularise example regi…

17d2b89

…stry Signed-off-by: liciazhu <liciazhu@tencent.com>

chore: apply cargo fmt formatting

2c3a792

Signed-off-by: liciazhu <liciazhu@tencent.com>

chore(web): untrack auto-generated tsconfig.tsbuildinfo

4162c0b

Signed-off-by: liciazhu <liciazhu@tencent.com>

fix(migrate): rename 0011 to UTC timestamp per immutable migration po…

1ea9ad9

…licy Signed-off-by: liciazhu <liciazhu@tencent.com>

mod cubeapi/web :Remove Incomplete Step Timeline Feature;Memory Manag…

aaa6436

…ement Improvement in scenario_registry();Consistent Auth + Rate Limiting on /examples/run;Remove Committed Binary from Repository;Whitespace Formatting Fixes Signed-off-by: liciazhu <liciazhu@tencent.com>

feat(cubeapi): add per-request timeout for POST /cube/sandboxes/:id/e…

2f8f47f

…xec-code Signed-off-by: liciazhu <liciazhu@tencent.com>

liciazhu force-pushed the feat/webCaseCenterV5 branch from 488bea9 to 2f8f47f Compare June 25, 2026 09:12

chore: apply code formatting

db07d7d

Signed-off-by: liciazhu <liciazhu@tencent.com>

cubesandboxbot Bot reviewed Jun 25, 2026

View reviewed changes

Uh oh!

Conversation

liciazhu commented Jun 22, 2026

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

cubesandboxbot Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants