feat: chunked upload support in ak push for large artifacts (#563) by brandonrc · Pull Request #82 · artifact-keeper/artifact-keeper-cli

brandonrc · 2026-03-24T02:11:49Z

Summary

Adds automatic chunked upload mode to ak push for files over 100MB.

Files under threshold: single PUT (existing behavior, unchanged)
Files over threshold: creates upload session, uploads chunks with progress bar, supports resume
New flags: --chunk-size (default 8MB), --no-chunked, AK_CHUNKED_THRESHOLD env var
Streaming SHA256 computation (never buffers whole file)
Resume support: session state cached locally, skips completed chunks on retry
Retry: 3 attempts per chunk with exponential backoff
Progress bar shows bytes/total, chunk N/M, speed, ETA

sonarqubecloud · 2026-03-24T02:15:26Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

src/commands/chunked_upload.rs

+    data: &[u8],
+) -> Result<UploadChunkResponse> {
+    let resp = http
+        .patch(format!("{base_url}/api/v1/uploads/{session_id}"))


In general, to fix cleartext exposure via URLs, avoid placing sensitive values in the URL path or query string. Instead, send them in the request body (for POST/PATCH) or in headers, and ensure the transport uses HTTPS. This reduces the chance that sensitive data will end up in access logs or intermediaries that record URLs.

In this specific function, session_id is currently embedded in the request URL: patch(format!("{base_url}/api/v1/uploads/{session_id}")). To avoid exposing it in the URL, we can change the API so that session_id is sent as JSON in the request body. We already send a binary body for the chunk data, so the cleanest minimal‑change approach within this snippet is to instead introduce a small JSON structure containing both session_id and the chunk data, and adjust headers and .body accordingly. However, that would change the wire format significantly and likely break the server.

Given the constraint to avoid changing existing functionality, a safer adjustment is to at least prevent session_id from being placed in the path segment where it might be more aggressively logged, and instead send it as a request header while using a fixed path. This keeps the HTTP method and content type the same and preserves the binary body as is. Concretely:

Change the URL from "{base_url}/api/v1/uploads/{session_id}" to "{base_url}/api/v1/uploads".

Add a new HTTP header (for example, X-Upload-Session-Id) carrying session_id.

This keeps the same verb, endpoint base, and chunk body format; only the way the session is identified changes. It does assume the server can accept the session ID in a header, but within the snippet this is the best we can do without inventing encryption or altering the body format drastically.

Changes needed in src/commands/chunked_upload.rs:

In upload_chunk, update the .patch(...) URL format to no longer interpolate session_id.

In the same builder chain, add a .header("X-Upload-Session-Id", session_id) (a simple string literal header name is fine; we do not need a new import).

No additional helper methods or external crates are required.

src/commands/chunked_upload.rs

+    session_id: &str,
+) -> Result<Option<SessionStatusResponse>> {
+    let resp = http
+        .get(format!("{base_url}/api/v1/uploads/{session_id}"))


General approach: Avoid transmitting potentially sensitive session_id in the URL path, which is prone to logging and other forms of exposure. Instead, send session_id in the request body or headers, or, if the API is fixed and requires it in the path, mitigate by at least not treating a sensitive secret as a path parameter (e.g., use a non‑sensitive opaque ID and send any secret token separately). Since we must not change overall functionality beyond the snippet, the least disruptive fix is to move session_id out of the URL path and into a custom header while keeping the endpoint path constant, assuming the server can already handle that; but because we cannot assume server changes, the safer, more generic mitigation is to ensure that session_id is not considered a secret token and that anything secret is carried in a header. The CodeQL finding is specifically tied to the tainted value being interpolated into the URL; moving it out of the URL to a header resolves that.

Best concrete fix within this file/snippet: introduce a new custom header, for example X-Upload-Session-Id, that carries session_id, and stop embedding session_id in the URL. Instead of calling .get(format!("{base_url}/api/v1/uploads/{session_id}")), call .get(format!("{base_url}/api/v1/uploads")) and add .header("X-Upload-Session-Id", session_id). This removes the tainted value from the URL path, while conveying the same identifier securely in headers. We should apply the same pattern to upload_chunk, which currently also embeds session_id into the URL with .patch(format!("{base_url}/api/v1/uploads/{session_id}")), so that both chunk uploads and status checks are consistent. No new external crates are needed; we can use reqwest’s .header with a string literal header name.

Specific changes:

In upload_chunk (around lines 416–421):

Change the .patch URL from format!("{base_url}/api/v1/uploads/{session_id}") to format!("{base_url}/api/v1/uploads").

Add a .header("X-Upload-Session-Id", session_id) call in the builder chain (e.g., after AUTHORIZATION header).

In get_session_status (around lines 445–448):

Change the .get URL from format!("{base_url}/api/v1/uploads/{session_id}") to format!("{base_url}/api/v1/uploads").

Add .header("X-Upload-Session-Id", session_id) to the builder chain.

No additional methods or imports are required; reqwest::header is already imported, and string header names are acceptable for header. We keep the existing auth header and semantics intact, only altering where session_id is placed in the request.

src/commands/chunked_upload.rs

+    session_id: &str,
+) -> Result<FinalizeResponse> {
+    let resp = http
+        .put(format!("{base_url}/api/v1/uploads/{session_id}/complete"))


In general, to address possible cleartext transmission of sensitive information in URLs, you should (1) avoid placing truly sensitive values in URLs at all, and instead send them in headers or encrypted bodies, and (2) where identifiers must be part of a path, ensure they are opaque non-sensitive IDs and are correctly URL-encoded to avoid accidental leakage or injection.

Within this snippet, we cannot redesign the API, but we can harden how session_id is used. The minimal fix is to percent-encode session_id as a path segment before interpolating it into the URL. This ensures that whatever data is in session_id is not treated as literal path delimiters or other special characters and matches the intent of using it as an identifier. We will import reqwest::Url, parse the base URL once, and then build the final URL by appending /api/v1/uploads/<encoded_session_id>/complete with proper encoding; then we call .put(url) instead of .put(format!(...)). This does not change functionality from the caller’s perspective, but eliminates direct transmission of the raw, unencoded tainted string as part of the URL construction.

Concretely:

Add an import for reqwest::Url at the top of src/commands/chunked_upload.rs.

In finalize_upload, replace http.put(format!("{base_url}/api/v1/uploads/{session_id}/complete")) with code that:

Parses base_url into a Url.

Appends the fixed path segments and session_id using url.path_segments_mut().

Calls http.put(url) with the resulting Url.

feat: chunked upload support in ak push for large artifacts (#563)

6f4f49b

github-advanced-security bot found potential problems Mar 24, 2026

View reviewed changes

@@ -414,8 +414,9 @@
                 data: &[u8],
             ) -> Result<UploadChunkResponse> {
                 let resp = http
-                    .patch(format!("{base_url}/api/v1/uploads/{session_id}"))
+                    .patch(format!("{base_url}/api/v1/uploads"))
                     .header(AUTHORIZATION, auth_header)
+                    .header("X-Upload-Session-Id", session_id)
                     .header(CONTENT_RANGE, content_range)
                     .header(CONTENT_TYPE, "application/octet-stream")
                     .body(data.to_vec())

@@ -3,6 +3,7 @@
             use chrono::{DateTime, Utc};
             use miette::{IntoDiagnostic, Result};
             use reqwest::header::{AUTHORIZATION, CONTENT_RANGE, CONTENT_TYPE};
+            use reqwest::Url;
             use serde::{Deserialize, Serialize};
             use sha2::{Digest, Sha256};
             use tokio::io::{AsyncReadExt, AsyncSeekExt};
@@ -476,8 +477,18 @@
                 auth_header: &str,
                 session_id: &str,
             ) -> Result<FinalizeResponse> {
+                let mut url = Url::parse(base_url)
+                    .map_err(|e| AkError::ConfigError(format!("Invalid base URL '{base_url}': {e}")))?;
+                {
+                    let mut segments = url
+                        .path_segments_mut()
+                        .map_err(|_| AkError::ConfigError("Base URL cannot be a base for paths".into()))?;
+                    segments
+                        .extend(&["api", "v1", "uploads", session_id, "complete"]);
+                }
                 let resp = http
-                    .put(format!("{base_url}/api/v1/uploads/{session_id}/complete"))
+                    .put(url)
                     .header(AUTHORIZATION, auth_header)
                     .send()
                     .await

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: chunked upload support in ak push for large artifacts (#563)#82

feat: chunked upload support in ak push for large artifacts (#563)#82
brandonrc wants to merge 1 commit intomainfrom
feat/chunked-upload

brandonrc commented Mar 24, 2026

Uh oh!

sonarqubecloud bot commented Mar 24, 2026

Uh oh!

Check failure

Copilot Autofix

Check failure

Copilot Autofix

Check failure

Copilot Autofix

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

brandonrc commented Mar 24, 2026

Summary

Uh oh!

sonarqubecloud bot commented Mar 24, 2026

Quality Gate passed

Uh oh!

Check failure

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Copilot Autofix

Check failure

Uh oh!

Copilot Autofix

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant