Skip to content

feat: faster document creation, create returns metadata#4020

Open
404Wolf wants to merge 4 commits into
mainfrom
wolf/even-faster-creation
Open

feat: faster document creation, create returns metadata#4020
404Wolf wants to merge 4 commits into
mainfrom
wolf/even-faster-creation

Conversation

@404Wolf

@404Wolf 404Wolf commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Before
2d1085e8-04a8-463f-bfa1-841f0d8dbc91

After
6ba82657-990b-43ab-97fe-dc56f565153d

Not the best screenshots, but we no longer make a sequential metadata/location/token request on the hot path to creating a document

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bf6c5452-7de0-4480-a2b8-50523d844118

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR implements a comprehensive cached snapshot system for Loro documents. On the frontend, it introduces an in-memory cache for markdown document creation data and refactors block synchronization logic into two distinct async paths (optimistic and durable-object). The backend extends document creation endpoints to generate JWT permission tokens and return enriched responses with metadata. New snapshot storage endpoints enable retrieval and upload of cached Loro snapshots via S3, while the sync service periodically publishes shallow snapshots back to the document storage service. Supporting changes expand fetch utilities to handle binary responses, serve the markdown golden snapshot from a stable public URL, and configure environment variables for DSS connectivity across deployment environments.

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title follows conventional commits format (feat:) and is 55 characters, well under the 72-character limit.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description clearly relates to the changeset by showing performance improvements and indicating that the create operation now returns metadata.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

const entry = cache.get(documentId);
cache.delete(documentId);
return entry;
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a hack for sure. Since we rely on the metadata signal in too many places I can't really break it out easily, but this lets us get the data for our loader "ahead of time"

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (3)
rust/cloud-storage/documents/src/inbound/axum_router/create_task.rs (1)

94-114: ⚡ Quick win

Inconsistent token generation: use encode_permission_token helper.

create_markdown.rs uses the encode_permission_token helper function, but this handler manually constructs the JWT. This duplicates the token encoding logic and risks drift if the token format changes.

♻️ Suggested refactor
+use crate::domain::permission_token::encode_permission_token;
-use model::document::DocumentPermissionsToken;
-use std::time::{SystemTime, UNIX_EPOCH};
 
 // ...
 
-    let now = SystemTime::now()
-        .duration_since(UNIX_EPOCH)
-        .unwrap()
-        .as_secs() as usize;
-    let token = jsonwebtoken::encode(
-        &jsonwebtoken::Header::new(jsonwebtoken::Algorithm::HS256),
-        &DocumentPermissionsToken {
-            user_id: Some(user_context.macro_user_id.as_ref().to_string()),
-            document_id: document_id.clone(),
-            access_level: AccessLevel::Edit,
-            exp: now + 3600,
-            iss: "document_storage_service".to_string(),
-        },
-        &jsonwebtoken::EncodingKey::from_secret(
-            state.document_permission_jwt_secret.as_bytes(),
-        ),
-    )
-    .map_err(|e| {
-        tracing::error!(error=?e, "failed to encode permission token");
-        DocumentError::Internal(e.into())
-    })?;
+    let token = encode_permission_token(
+        Some(user_context.macro_user_id.as_ref().to_string()),
+        document_id.clone(),
+        AccessLevel::Edit,
+        &state.document_permission_jwt_secret,
+    )
+    .map_err(|e| {
+        tracing::error!(error=?e, "failed to encode permission token");
+        DocumentError::Internal(e.into())
+    })?;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cloud-storage/documents/src/inbound/axum_router/create_task.rs` around
lines 94 - 114, The JWT is being built inline in create_task.rs (using
jsonwebtoken::encode and DocumentPermissionsToken) which duplicates logic;
replace that block with a call to the existing encode_permission_token helper
(the same function used by create_markdown.rs), passing the same inputs
(user_context.macro_user_id, document_id, AccessLevel::Edit, expiry ~ now + 3600
and the state.document_permission_jwt_secret) and propagate/map errors the same
way; remove the manual jsonwebtoken::encode usage and ensure the returned
token/error handling uses encode_permission_token instead.
rust/cloud-storage/document_storage_service/src/service/s3/internal_presigned_helpers.rs (1)

21-23: 💤 Low value

Consider a more explicit local-mode placeholder.

Returning the literal string "fake" could silently break if a caller attempts to parse or validate the URL structure without checking is_local_aws() first. A more self-documenting sentinel (e.g., "local://fake-presigned-url") or an early anyhow::bail! would make the local-mode contract clearer.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@rust/cloud-storage/document_storage_service/src/service/s3/internal_presigned_helpers.rs`
around lines 21 - 23, The current local-mode branch returns the literal "fake"
which is ambiguous; update the branch in internal_presigned_helpers.rs (the code
that currently does return Ok("fake".to_string())) to return a clear sentinel
like "local://fake-presigned-url" (so callers that parse URLs will see a valid
scheme) or, if you prefer failing fast, replace it with an anyhow::bail! with a
descriptive message indicating presigned URLs are unavailable in local mode;
ensure the change is applied where the local-mode return is implemented so
callers relying on URL parsing/validation behave correctly.
rust/cloud-storage/s3_key/src/document_key.rs (1)

9-11: 💤 Low value

Snapshot keys are not covered by the DocumentKey enum.

The new snapshot key format {SYNC_SERVICE_SNAPSHOT_PREFIX}/{document_id} lives in the same S3 bucket as other document keys, but DocumentKey::from_s3_key (line 64) cannot parse it. A 2-segment key starting with "sync_service_snapshot" would fall through to the error case (line 104) since it doesn't match TEMP_FILE_PREFIX.

Current usage never parses snapshot keys back from S3 (they're only constructed via the snapshot_key helper in the adapter), so this doesn't break anything today. However, future code that enumerates bucket keys and expects to parse them all via DocumentKey will silently fail on snapshots.

🔧 Optional: Extend `DocumentKey` to handle snapshots

Add a Snapshot variant to the enum:

 pub enum DocumentKey {
     // ... existing variants ...
+    /// A sync-service CRDT snapshot: `sync_service_snapshot/{document_id}`
+    Snapshot {
+        /// The document ID.
+        document_id: String,
+    },
 }

And update from_s3_key to recognize it:

         match split.len() {
-            2 if split[0] == TEMP_FILE_PREFIX => {
+            2 if split[0] == TEMP_FILE_PREFIX => { /* existing temp docx logic */ }
+            2 if split[0] == SYNC_SERVICE_SNAPSHOT_PREFIX => {
+                Ok(Self::Snapshot { document_id: split[1].to_string() })
+            }
             // ...
         }

Then update document_id(), to_key(), and add an is_snapshot() predicate.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cloud-storage/s3_key/src/document_key.rs` around lines 9 - 11, Add
support for snapshot keys by extending the DocumentKey enum with a Snapshot
variant and updating parsing/serialization helpers: modify DocumentKey to
include Snapshot(document_id: String), update DocumentKey::from_s3_key to
recognize a 2-segment key whose first segment equals
SYNC_SERVICE_SNAPSHOT_PREFIX and return Snapshot(id), update DocumentKey::to_key
(or wherever keys are serialized) to emit
"{SYNC_SERVICE_SNAPSHOT_PREFIX}/{document_id}" for Snapshot, ensure
DocumentKey::document_id() returns the inner id for Snapshot, and add an
is_snapshot() predicate; reference SYNC_SERVICE_SNAPSHOT_PREFIX, DocumentKey,
DocumentKey::from_s3_key, DocumentKey::to_key, and DocumentKey::document_id().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@js/app/packages/block-md/component/Block.tsx`:
- Around line 84-122: The race condition occurs because gotDoSnapshot is set to
true only after calling loroManager.initializeFromSnapshot for the DO snapshot,
allowing the concurrent storageServiceClient.fetchCachedSnapshot path to also
call initializeFromSnapshot; to fix, set gotDoSnapshot = true immediately before
calling loroManager.initializeFromSnapshot in the DO sync path (i.e., set the
flag right after successful syncResult and before invoking
loroManager.initializeFromSnapshot), so the IIFE fetching cached snapshots will
bail out and not attempt initialization; keep the rest of the error handling and
returns unchanged.

In `@js/app/packages/service-clients/service-storage/client.ts`:
- Around line 1111-1115: The dssFetch generic is currently constrained to
Record<string, any>, which prevents calls like fetchCachedSnapshot using
dssFetch<Uint8Array>; update dssFetch's type signature and overloads to allow
binary responses (Uint8Array) similar to safeFetch/fetchWithToken: detect
response.content-type octet-stream and return new Uint8Array(await
response.arrayBuffer()) when T = Uint8Array, or otherwise keep the existing
JSON/Record behavior; adjust dssFetch's generic constraint to accept Uint8Array
| Record<string, any> (or make it unconstrained) and mirror the same runtime
branch used in safeFetch/fetchWithToken so fetchCachedSnapshot can call
dssFetch<Uint8Array>(...) without a type error.

In `@rust/cloud-storage/documents/src/domain/permission_token.rs`:
- Around line 18-21: The code in encode_permission_token uses
duration_since(UNIX_EPOCH).unwrap(), which can panic if system time is before
the epoch; replace the unwrap with proper error handling and propagate a Result
instead of panicking. Change the calculation of now to use
duration_since(UNIX_EPOCH).map_err(|e| /* map to your function's error type */)?
and then .as_secs() as usize (or match on the Result and return Err on failure)
so encode_permission_token returns an error instead of crashing; update the
function signature to return Result if it does not already and map the
SystemTimeError to the module's error type.

In `@rust/cloud-storage/documents/src/outbound/s3_upload_url.rs`:
- Around line 121-123: The early return in upload_snapshot (the is_local_aws()
guard in s3_upload_url.rs) causes local runs to report success without storing
bytes; remove that short-circuit and implement a local write path so snapshots
are actually persisted in non-AWS mode. Concretely, modify the upload_snapshot
function to no longer return Ok(()) when macro_aws_config::is_local_aws() is
true; instead branch to the same local storage implementation you use for tests
or write the snapshot bytes to the configured local storage location (or call
the existing local_store/local_write helper if one exists) so callers receive a
true write result and snapshot retrieval works in local mode. Ensure the change
touches the macro_aws_config::is_local_aws() check and the upload_snapshot
function logic only.

In `@rust/cloud-storage/github/src/domain/service/sync/test.rs`:
- Around line 261-267: The test stub implementations for get_snapshot and
upload_snapshot currently call unimplemented!() which can cause panics; change
get_snapshot in the test stub to return an inert Ok(None) and change
upload_snapshot to return Ok(()) so the trait methods are satisfied without
panicking (update the async fn get_snapshot and async fn upload_snapshot
implementations in the test DocumentService stub).

In `@rust/sync-service/wrangler.toml`:
- Around line 35-36: The DSS_INTERNAL_AUTH_KEY is hardcoded to "local" in the
default [vars] and is inherited by all envs; remove the hardcoded secret from
the default [vars] and instead add environment-specific overrides (e.g., in
[env.playground.vars], [env.dev.vars], [env.prod.vars]) to read the secret from
Wrangler/GCP/AWS secrets or from environment variables, ensuring the key names
match the DOCUMENT_STORAGE_SERVICE_AUTH_KEY used by DSS; update
DSS_INTERNAL_AUTH_KEY in each env block to reference the appropriate secret
rather than a literal "local".
- Line 59: The DSS_INTERNAL_AUTH_KEY is set to "local" globally in wrangler.toml
but must match document-storage’s DOCUMENT_STORAGE_SERVICE_AUTH_KEY so internal
requests (sent with header x-document-storage-service-auth-key in
rust/sync-service/src/dss_internal.rs) are accepted; update wrangler.toml to
provide DSS_INTERNAL_AUTH_KEY per environment (env.playground, env.dev,
env.prod) or change your deployment wiring to inject the same secret used for
document_storage_service_auth_key (the secret mapped to
DOCUMENT_STORAGE_SERVICE_AUTH_KEY in document_storage_service's
internal_access.rs) so the values match across services.

---

Nitpick comments:
In
`@rust/cloud-storage/document_storage_service/src/service/s3/internal_presigned_helpers.rs`:
- Around line 21-23: The current local-mode branch returns the literal "fake"
which is ambiguous; update the branch in internal_presigned_helpers.rs (the code
that currently does return Ok("fake".to_string())) to return a clear sentinel
like "local://fake-presigned-url" (so callers that parse URLs will see a valid
scheme) or, if you prefer failing fast, replace it with an anyhow::bail! with a
descriptive message indicating presigned URLs are unavailable in local mode;
ensure the change is applied where the local-mode return is implemented so
callers relying on URL parsing/validation behave correctly.

In `@rust/cloud-storage/documents/src/inbound/axum_router/create_task.rs`:
- Around line 94-114: The JWT is being built inline in create_task.rs (using
jsonwebtoken::encode and DocumentPermissionsToken) which duplicates logic;
replace that block with a call to the existing encode_permission_token helper
(the same function used by create_markdown.rs), passing the same inputs
(user_context.macro_user_id, document_id, AccessLevel::Edit, expiry ~ now + 3600
and the state.document_permission_jwt_secret) and propagate/map errors the same
way; remove the manual jsonwebtoken::encode usage and ensure the returned
token/error handling uses encode_permission_token instead.

In `@rust/cloud-storage/s3_key/src/document_key.rs`:
- Around line 9-11: Add support for snapshot keys by extending the DocumentKey
enum with a Snapshot variant and updating parsing/serialization helpers: modify
DocumentKey to include Snapshot(document_id: String), update
DocumentKey::from_s3_key to recognize a 2-segment key whose first segment equals
SYNC_SERVICE_SNAPSHOT_PREFIX and return Snapshot(id), update DocumentKey::to_key
(or wherever keys are serialized) to emit
"{SYNC_SERVICE_SNAPSHOT_PREFIX}/{document_id}" for Snapshot, ensure
DocumentKey::document_id() returns the inner id for Snapshot, and add an
is_snapshot() predicate; reference SYNC_SERVICE_SNAPSHOT_PREFIX, DocumentKey,
DocumentKey::from_s3_key, DocumentKey::to_key, and DocumentKey::document_id().
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5dcf7fb0-dd63-492c-ad08-e7f733a63681

📥 Commits

Reviewing files that changed from the base of the PR and between 210482e and 66ad298.

⛔ Files ignored due to path filters (7)
  • js/app/packages/app/public/markdown-golden.1.bin is excluded by !**/*.bin
  • js/app/packages/service-clients/service-storage/generated/schemas/createMarkdownDocumentResponse.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/schemas/createMarkdownHandler200.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/schemas/createTaskHandler200.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/schemas/createTaskResponse.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/zod.ts is excluded by !**/generated/**
  • rust/cloud-storage/Cargo.lock is excluded by !**/*.lock, !**/Cargo.lock
📒 Files selected for processing (41)
  • docker-compose.yml
  • js/app/packages/app/index.html
  • js/app/packages/block-md/component/Block.tsx
  • js/app/packages/block-md/definition.ts
  • js/app/packages/block-md/util/mdCreateCache.ts
  • js/app/packages/core/component/LexicalMarkdown/collaboration/undo.ts
  • js/app/packages/core/util/create.ts
  • js/app/packages/core/util/fetchWithToken.ts
  • js/app/packages/core/util/safeFetch.ts
  • js/app/packages/service-clients/service-storage/client.ts
  • js/app/packages/service-clients/service-storage/openapi.json
  • js/lexical-core/markdown-golden.ts
  • js/lexical-core/markdown-loro-schema.ts
  • js/lexical-core/scripts/generate-golden.ts
  • rust/cloud-storage/document_storage_service/src/api/documents/mod.rs
  • rust/cloud-storage/document_storage_service/src/api/internal/mod.rs
  • rust/cloud-storage/document_storage_service/src/main.rs
  • rust/cloud-storage/document_storage_service/src/service/s3/internal_presigned_helpers.rs
  • rust/cloud-storage/document_storage_service/src/service/s3/mod.rs
  • rust/cloud-storage/documents/Cargo.toml
  • rust/cloud-storage/documents/src/domain.rs
  • rust/cloud-storage/documents/src/domain/models.rs
  • rust/cloud-storage/documents/src/domain/permission_token.rs
  • rust/cloud-storage/documents/src/domain/ports/mod.rs
  • rust/cloud-storage/documents/src/domain/service.rs
  • rust/cloud-storage/documents/src/domain/service/tests.rs
  • rust/cloud-storage/documents/src/inbound/axum_router.rs
  • rust/cloud-storage/documents/src/inbound/axum_router/create_markdown.rs
  • rust/cloud-storage/documents/src/inbound/axum_router/create_task.rs
  • rust/cloud-storage/documents/src/inbound/axum_router/get_cached_snapshot_url.rs
  • rust/cloud-storage/documents/src/inbound/axum_router/put_snapshot.rs
  • rust/cloud-storage/documents/src/outbound/s3_upload_url.rs
  • rust/cloud-storage/github/src/domain/service/sync/test.rs
  • rust/cloud-storage/s3_key/src/document_key.rs
  • rust/cloud-storage/s3_key/src/lib.rs
  • rust/sync-service/src/constants.rs
  • rust/sync-service/src/dss_internal.rs
  • rust/sync-service/src/durable_object.rs
  • rust/sync-service/src/lib.rs
  • rust/sync-service/wrangler.docker.toml
  • rust/sync-service/wrangler.toml

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 7

🧹 Nitpick comments (3)
rust/cloud-storage/documents/src/inbound/axum_router/create_task.rs (1)

94-114: ⚡ Quick win

Inconsistent token generation: use encode_permission_token helper.

create_markdown.rs uses the encode_permission_token helper function, but this handler manually constructs the JWT. This duplicates the token encoding logic and risks drift if the token format changes.

♻️ Suggested refactor
+use crate::domain::permission_token::encode_permission_token;
-use model::document::DocumentPermissionsToken;
-use std::time::{SystemTime, UNIX_EPOCH};
 
 // ...
 
-    let now = SystemTime::now()
-        .duration_since(UNIX_EPOCH)
-        .unwrap()
-        .as_secs() as usize;
-    let token = jsonwebtoken::encode(
-        &jsonwebtoken::Header::new(jsonwebtoken::Algorithm::HS256),
-        &DocumentPermissionsToken {
-            user_id: Some(user_context.macro_user_id.as_ref().to_string()),
-            document_id: document_id.clone(),
-            access_level: AccessLevel::Edit,
-            exp: now + 3600,
-            iss: "document_storage_service".to_string(),
-        },
-        &jsonwebtoken::EncodingKey::from_secret(
-            state.document_permission_jwt_secret.as_bytes(),
-        ),
-    )
-    .map_err(|e| {
-        tracing::error!(error=?e, "failed to encode permission token");
-        DocumentError::Internal(e.into())
-    })?;
+    let token = encode_permission_token(
+        Some(user_context.macro_user_id.as_ref().to_string()),
+        document_id.clone(),
+        AccessLevel::Edit,
+        &state.document_permission_jwt_secret,
+    )
+    .map_err(|e| {
+        tracing::error!(error=?e, "failed to encode permission token");
+        DocumentError::Internal(e.into())
+    })?;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cloud-storage/documents/src/inbound/axum_router/create_task.rs` around
lines 94 - 114, The JWT is being built inline in create_task.rs (using
jsonwebtoken::encode and DocumentPermissionsToken) which duplicates logic;
replace that block with a call to the existing encode_permission_token helper
(the same function used by create_markdown.rs), passing the same inputs
(user_context.macro_user_id, document_id, AccessLevel::Edit, expiry ~ now + 3600
and the state.document_permission_jwt_secret) and propagate/map errors the same
way; remove the manual jsonwebtoken::encode usage and ensure the returned
token/error handling uses encode_permission_token instead.
rust/cloud-storage/document_storage_service/src/service/s3/internal_presigned_helpers.rs (1)

21-23: 💤 Low value

Consider a more explicit local-mode placeholder.

Returning the literal string "fake" could silently break if a caller attempts to parse or validate the URL structure without checking is_local_aws() first. A more self-documenting sentinel (e.g., "local://fake-presigned-url") or an early anyhow::bail! would make the local-mode contract clearer.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@rust/cloud-storage/document_storage_service/src/service/s3/internal_presigned_helpers.rs`
around lines 21 - 23, The current local-mode branch returns the literal "fake"
which is ambiguous; update the branch in internal_presigned_helpers.rs (the code
that currently does return Ok("fake".to_string())) to return a clear sentinel
like "local://fake-presigned-url" (so callers that parse URLs will see a valid
scheme) or, if you prefer failing fast, replace it with an anyhow::bail! with a
descriptive message indicating presigned URLs are unavailable in local mode;
ensure the change is applied where the local-mode return is implemented so
callers relying on URL parsing/validation behave correctly.
rust/cloud-storage/s3_key/src/document_key.rs (1)

9-11: 💤 Low value

Snapshot keys are not covered by the DocumentKey enum.

The new snapshot key format {SYNC_SERVICE_SNAPSHOT_PREFIX}/{document_id} lives in the same S3 bucket as other document keys, but DocumentKey::from_s3_key (line 64) cannot parse it. A 2-segment key starting with "sync_service_snapshot" would fall through to the error case (line 104) since it doesn't match TEMP_FILE_PREFIX.

Current usage never parses snapshot keys back from S3 (they're only constructed via the snapshot_key helper in the adapter), so this doesn't break anything today. However, future code that enumerates bucket keys and expects to parse them all via DocumentKey will silently fail on snapshots.

🔧 Optional: Extend `DocumentKey` to handle snapshots

Add a Snapshot variant to the enum:

 pub enum DocumentKey {
     // ... existing variants ...
+    /// A sync-service CRDT snapshot: `sync_service_snapshot/{document_id}`
+    Snapshot {
+        /// The document ID.
+        document_id: String,
+    },
 }

And update from_s3_key to recognize it:

         match split.len() {
-            2 if split[0] == TEMP_FILE_PREFIX => {
+            2 if split[0] == TEMP_FILE_PREFIX => { /* existing temp docx logic */ }
+            2 if split[0] == SYNC_SERVICE_SNAPSHOT_PREFIX => {
+                Ok(Self::Snapshot { document_id: split[1].to_string() })
+            }
             // ...
         }

Then update document_id(), to_key(), and add an is_snapshot() predicate.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cloud-storage/s3_key/src/document_key.rs` around lines 9 - 11, Add
support for snapshot keys by extending the DocumentKey enum with a Snapshot
variant and updating parsing/serialization helpers: modify DocumentKey to
include Snapshot(document_id: String), update DocumentKey::from_s3_key to
recognize a 2-segment key whose first segment equals
SYNC_SERVICE_SNAPSHOT_PREFIX and return Snapshot(id), update DocumentKey::to_key
(or wherever keys are serialized) to emit
"{SYNC_SERVICE_SNAPSHOT_PREFIX}/{document_id}" for Snapshot, ensure
DocumentKey::document_id() returns the inner id for Snapshot, and add an
is_snapshot() predicate; reference SYNC_SERVICE_SNAPSHOT_PREFIX, DocumentKey,
DocumentKey::from_s3_key, DocumentKey::to_key, and DocumentKey::document_id().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@js/app/packages/block-md/component/Block.tsx`:
- Around line 84-122: The race condition occurs because gotDoSnapshot is set to
true only after calling loroManager.initializeFromSnapshot for the DO snapshot,
allowing the concurrent storageServiceClient.fetchCachedSnapshot path to also
call initializeFromSnapshot; to fix, set gotDoSnapshot = true immediately before
calling loroManager.initializeFromSnapshot in the DO sync path (i.e., set the
flag right after successful syncResult and before invoking
loroManager.initializeFromSnapshot), so the IIFE fetching cached snapshots will
bail out and not attempt initialization; keep the rest of the error handling and
returns unchanged.

In `@js/app/packages/service-clients/service-storage/client.ts`:
- Around line 1111-1115: The dssFetch generic is currently constrained to
Record<string, any>, which prevents calls like fetchCachedSnapshot using
dssFetch<Uint8Array>; update dssFetch's type signature and overloads to allow
binary responses (Uint8Array) similar to safeFetch/fetchWithToken: detect
response.content-type octet-stream and return new Uint8Array(await
response.arrayBuffer()) when T = Uint8Array, or otherwise keep the existing
JSON/Record behavior; adjust dssFetch's generic constraint to accept Uint8Array
| Record<string, any> (or make it unconstrained) and mirror the same runtime
branch used in safeFetch/fetchWithToken so fetchCachedSnapshot can call
dssFetch<Uint8Array>(...) without a type error.

In `@rust/cloud-storage/documents/src/domain/permission_token.rs`:
- Around line 18-21: The code in encode_permission_token uses
duration_since(UNIX_EPOCH).unwrap(), which can panic if system time is before
the epoch; replace the unwrap with proper error handling and propagate a Result
instead of panicking. Change the calculation of now to use
duration_since(UNIX_EPOCH).map_err(|e| /* map to your function's error type */)?
and then .as_secs() as usize (or match on the Result and return Err on failure)
so encode_permission_token returns an error instead of crashing; update the
function signature to return Result if it does not already and map the
SystemTimeError to the module's error type.

In `@rust/cloud-storage/documents/src/outbound/s3_upload_url.rs`:
- Around line 121-123: The early return in upload_snapshot (the is_local_aws()
guard in s3_upload_url.rs) causes local runs to report success without storing
bytes; remove that short-circuit and implement a local write path so snapshots
are actually persisted in non-AWS mode. Concretely, modify the upload_snapshot
function to no longer return Ok(()) when macro_aws_config::is_local_aws() is
true; instead branch to the same local storage implementation you use for tests
or write the snapshot bytes to the configured local storage location (or call
the existing local_store/local_write helper if one exists) so callers receive a
true write result and snapshot retrieval works in local mode. Ensure the change
touches the macro_aws_config::is_local_aws() check and the upload_snapshot
function logic only.

In `@rust/cloud-storage/github/src/domain/service/sync/test.rs`:
- Around line 261-267: The test stub implementations for get_snapshot and
upload_snapshot currently call unimplemented!() which can cause panics; change
get_snapshot in the test stub to return an inert Ok(None) and change
upload_snapshot to return Ok(()) so the trait methods are satisfied without
panicking (update the async fn get_snapshot and async fn upload_snapshot
implementations in the test DocumentService stub).

In `@rust/sync-service/wrangler.toml`:
- Around line 35-36: The DSS_INTERNAL_AUTH_KEY is hardcoded to "local" in the
default [vars] and is inherited by all envs; remove the hardcoded secret from
the default [vars] and instead add environment-specific overrides (e.g., in
[env.playground.vars], [env.dev.vars], [env.prod.vars]) to read the secret from
Wrangler/GCP/AWS secrets or from environment variables, ensuring the key names
match the DOCUMENT_STORAGE_SERVICE_AUTH_KEY used by DSS; update
DSS_INTERNAL_AUTH_KEY in each env block to reference the appropriate secret
rather than a literal "local".
- Line 59: The DSS_INTERNAL_AUTH_KEY is set to "local" globally in wrangler.toml
but must match document-storage’s DOCUMENT_STORAGE_SERVICE_AUTH_KEY so internal
requests (sent with header x-document-storage-service-auth-key in
rust/sync-service/src/dss_internal.rs) are accepted; update wrangler.toml to
provide DSS_INTERNAL_AUTH_KEY per environment (env.playground, env.dev,
env.prod) or change your deployment wiring to inject the same secret used for
document_storage_service_auth_key (the secret mapped to
DOCUMENT_STORAGE_SERVICE_AUTH_KEY in document_storage_service's
internal_access.rs) so the values match across services.

---

Nitpick comments:
In
`@rust/cloud-storage/document_storage_service/src/service/s3/internal_presigned_helpers.rs`:
- Around line 21-23: The current local-mode branch returns the literal "fake"
which is ambiguous; update the branch in internal_presigned_helpers.rs (the code
that currently does return Ok("fake".to_string())) to return a clear sentinel
like "local://fake-presigned-url" (so callers that parse URLs will see a valid
scheme) or, if you prefer failing fast, replace it with an anyhow::bail! with a
descriptive message indicating presigned URLs are unavailable in local mode;
ensure the change is applied where the local-mode return is implemented so
callers relying on URL parsing/validation behave correctly.

In `@rust/cloud-storage/documents/src/inbound/axum_router/create_task.rs`:
- Around line 94-114: The JWT is being built inline in create_task.rs (using
jsonwebtoken::encode and DocumentPermissionsToken) which duplicates logic;
replace that block with a call to the existing encode_permission_token helper
(the same function used by create_markdown.rs), passing the same inputs
(user_context.macro_user_id, document_id, AccessLevel::Edit, expiry ~ now + 3600
and the state.document_permission_jwt_secret) and propagate/map errors the same
way; remove the manual jsonwebtoken::encode usage and ensure the returned
token/error handling uses encode_permission_token instead.

In `@rust/cloud-storage/s3_key/src/document_key.rs`:
- Around line 9-11: Add support for snapshot keys by extending the DocumentKey
enum with a Snapshot variant and updating parsing/serialization helpers: modify
DocumentKey to include Snapshot(document_id: String), update
DocumentKey::from_s3_key to recognize a 2-segment key whose first segment equals
SYNC_SERVICE_SNAPSHOT_PREFIX and return Snapshot(id), update DocumentKey::to_key
(or wherever keys are serialized) to emit
"{SYNC_SERVICE_SNAPSHOT_PREFIX}/{document_id}" for Snapshot, ensure
DocumentKey::document_id() returns the inner id for Snapshot, and add an
is_snapshot() predicate; reference SYNC_SERVICE_SNAPSHOT_PREFIX, DocumentKey,
DocumentKey::from_s3_key, DocumentKey::to_key, and DocumentKey::document_id().
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5dcf7fb0-dd63-492c-ad08-e7f733a63681

📥 Commits

Reviewing files that changed from the base of the PR and between 210482e and 66ad298.

⛔ Files ignored due to path filters (7)
  • js/app/packages/app/public/markdown-golden.1.bin is excluded by !**/*.bin
  • js/app/packages/service-clients/service-storage/generated/schemas/createMarkdownDocumentResponse.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/schemas/createMarkdownHandler200.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/schemas/createTaskHandler200.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/schemas/createTaskResponse.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/zod.ts is excluded by !**/generated/**
  • rust/cloud-storage/Cargo.lock is excluded by !**/*.lock, !**/Cargo.lock
📒 Files selected for processing (41)
  • docker-compose.yml
  • js/app/packages/app/index.html
  • js/app/packages/block-md/component/Block.tsx
  • js/app/packages/block-md/definition.ts
  • js/app/packages/block-md/util/mdCreateCache.ts
  • js/app/packages/core/component/LexicalMarkdown/collaboration/undo.ts
  • js/app/packages/core/util/create.ts
  • js/app/packages/core/util/fetchWithToken.ts
  • js/app/packages/core/util/safeFetch.ts
  • js/app/packages/service-clients/service-storage/client.ts
  • js/app/packages/service-clients/service-storage/openapi.json
  • js/lexical-core/markdown-golden.ts
  • js/lexical-core/markdown-loro-schema.ts
  • js/lexical-core/scripts/generate-golden.ts
  • rust/cloud-storage/document_storage_service/src/api/documents/mod.rs
  • rust/cloud-storage/document_storage_service/src/api/internal/mod.rs
  • rust/cloud-storage/document_storage_service/src/main.rs
  • rust/cloud-storage/document_storage_service/src/service/s3/internal_presigned_helpers.rs
  • rust/cloud-storage/document_storage_service/src/service/s3/mod.rs
  • rust/cloud-storage/documents/Cargo.toml
  • rust/cloud-storage/documents/src/domain.rs
  • rust/cloud-storage/documents/src/domain/models.rs
  • rust/cloud-storage/documents/src/domain/permission_token.rs
  • rust/cloud-storage/documents/src/domain/ports/mod.rs
  • rust/cloud-storage/documents/src/domain/service.rs
  • rust/cloud-storage/documents/src/domain/service/tests.rs
  • rust/cloud-storage/documents/src/inbound/axum_router.rs
  • rust/cloud-storage/documents/src/inbound/axum_router/create_markdown.rs
  • rust/cloud-storage/documents/src/inbound/axum_router/create_task.rs
  • rust/cloud-storage/documents/src/inbound/axum_router/get_cached_snapshot_url.rs
  • rust/cloud-storage/documents/src/inbound/axum_router/put_snapshot.rs
  • rust/cloud-storage/documents/src/outbound/s3_upload_url.rs
  • rust/cloud-storage/github/src/domain/service/sync/test.rs
  • rust/cloud-storage/s3_key/src/document_key.rs
  • rust/cloud-storage/s3_key/src/lib.rs
  • rust/sync-service/src/constants.rs
  • rust/sync-service/src/dss_internal.rs
  • rust/sync-service/src/durable_object.rs
  • rust/sync-service/src/lib.rs
  • rust/sync-service/wrangler.docker.toml
  • rust/sync-service/wrangler.toml
🛑 Comments failed to post (7)
js/app/packages/block-md/component/Block.tsx (1)

84-122: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Race condition: gotDoSnapshot flag is set after initializeFromSnapshot completes, not before.

The flag gotDoSnapshot is set to true at line 113, after initializeFromSnapshot returns. However, the check at line 92 runs in the concurrent IIFE. If the cached snapshot fetch completes while the DO's initializeFromSnapshot (line 109) is in progress but before line 113 executes, both paths may call initializeFromSnapshot concurrently or sequentially, potentially corrupting the Loro doc state.

Move gotDoSnapshot = true before the DO initialization to prevent the cached path from starting once we've committed to using the DO snapshot:

Proposed fix
     if (syncResult.isErr()) {
       console.error('Failed to receive initial sync', syncResult.error);
       return err('INVALID');
     }

+    gotDoSnapshot = true;
+
     const result = await loroManager.initializeFromSnapshot(
       syncResult.value.snapshot
     );

-    gotDoSnapshot = true;
-
     if (result.isErr()) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@js/app/packages/block-md/component/Block.tsx` around lines 84 - 122, The race
condition occurs because gotDoSnapshot is set to true only after calling
loroManager.initializeFromSnapshot for the DO snapshot, allowing the concurrent
storageServiceClient.fetchCachedSnapshot path to also call
initializeFromSnapshot; to fix, set gotDoSnapshot = true immediately before
calling loroManager.initializeFromSnapshot in the DO sync path (i.e., set the
flag right after successful syncResult and before invoking
loroManager.initializeFromSnapshot), so the IIFE fetching cached snapshots will
bail out and not attempt initialization; keep the rest of the error handling and
returns unchanged.
js/app/packages/service-clients/service-storage/client.ts (1)

1111-1115: ⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "export function dssFetch<|fetchCachedSnapshot|dssFetch<Uint8Array>" js/app/packages/service-clients/service-storage/client.ts
bun run check

Repository: macro-inc/macro

Length of output: 371


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="js/app/packages/service-clients/service-storage/client.ts"

# Show relevant context around the call site
echo "---- fetchCachedSnapshot context ----"
sed -n '1090,1140p' "$FILE" | nl -ba -w2 -s':'

# Show relevant context around dssFetch overloads
echo "---- dssFetch context ----"
sed -n '120,190p' "$FILE" | nl -ba -w2 -s':'

# Search for other dssFetch overloads/mentions of binary-like types
echo "---- dssFetch overloads/binary mentions ----"
rg -n "dssFetch\s*<|function\s+dssFetch|Uint8Array|ArrayBuffer|Blob|buffer|cached_snapshot_url" "$FILE"

Repository: macro-inc/macro

Length of output: 158


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="js/app/packages/service-clients/service-storage/client.ts"

echo "---- file stats ----"
wc -l "$FILE" || true

echo "---- locate fetchCachedSnapshot ----"
rg -n "fetchCachedSnapshot\s*\(" "$FILE" || true

echo "---- locate dssFetch overload(s) ----"
rg -n "export function dssFetch" "$FILE" || true

echo "---- print around fetchCachedSnapshot (with enough context) ----"
# If the file is shorter than the ranges, sed just outputs less/no lines; that's okay.
sed -n '1,140p' "$FILE" | head -n 140

# Print windows around the known approximate line from prior rg results
sed -n '1080,1135p' "$FILE" || true
sed -n '120,175p' "$FILE" || true

echo "---- extract full dssFetch signatures block (best-effort) ----"
# Print from first occurrence to shortly after next function export/closing brace
START=$(rg -n "export function dssFetch" "$FILE" | head -n1 | cut -d: -f1)
if [[ -n "${START:-}" ]]; then
  END=$((START+80))
  sed -n "${START},${END}p" "$FILE" || true
fi

Repository: macro-inc/macro

Length of output: 15330


🏁 Script executed:

#!/bin/bash
set -euo pipefail
RG_TARGET="js/app/packages"
rg -n "function fetchWithToken|export function fetchWithToken|fetchWithToken<" "$RG_TARGET" | head -n 50

Repository: macro-inc/macro

Length of output: 2789


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="js/app/packages/core/util/fetchWithToken.ts"

# Show the relevant type signature area
rg -n "export async function fetchWithToken<|export function fetchWithToken<|async function fetchWithToken" "$FILE"
START=$(rg -n "export async function fetchWithToken<" "$FILE" | head -n1 | cut -d: -f1 || true)
if [[ -n "${START:-}" ]]; then
  END=$((START+120))
  sed -n "${START},${END}p" "$FILE" | head -n 160
fi

echo "---- body parsing clues ----"
rg -n "arrayBuffer|blob\\(|json\\(|text\\(|Uint8Array|ArrayBuffer" "$FILE" || true

Repository: macro-inc/macro

Length of output: 1823


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="js/app/packages/core/util/fetchWithToken.ts"

echo "---- locate fetchWithCredentials ----"
rg -n "fetchWithCredentials|fetchWithAuth" "$FILE"

# Print around fetchWithCredentials
START=$(rg -n "function fetchWithCredentials|async function fetchWithCredentials" "$FILE" | head -n1 | cut -d: -f1 || true)
if [[ -n "${START:-}" ]]; then
  END=$((START+160))
  sed -n "${START},${END}p" "$FILE"
fi

echo "---- locate Uint8Array body handling ----"
rg -n "Uint8Array|arrayBuffer|res\\.arrayBuffer|res\\.blob|blob\\(" "$FILE" || true

Repository: macro-inc/macro

Length of output: 4953


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="js/app/packages/core/util/fetchWithToken.ts"

rg -n "ObjectLike" "$FILE"

# Also inspect imports at the top for where ObjectLike is defined
sed -n '1,120p' "$FILE"

Repository: macro-inc/macro

Length of output: 3950


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# ObjectLike definition
rg -n "export type ObjectLike|type ObjectLike" js/app/packages/core/util/result.ts js/app/packages/core/util -S || true

FILE1="js/app/packages/core/util/result.ts"
if [[ -f "$FILE1" ]]; then
  START=$(rg -n "ObjectLike" "$FILE1" | head -n1 | cut -d: -f1)
  START=${START:-1}
  sed -n "$((START-10)),$((START+60))p" "$FILE1" | head -n 120
fi

# safeFetch type + body parsing
FILE2="js/app/packages/core/util/safeFetch.ts"
if [[ -f "$FILE2" ]]; then
  rg -n "export (async )?function safeFetch|function safeFetch|safeFetch<" "$FILE2"
  START=$(rg -n "safeFetch<" "$FILE2" | head -n1 | cut -d: -f1 || true)
  if [[ -z "${START:-}" ]]; then
    START=$(rg -n "function safeFetch" "$FILE2" | head -n1 | cut -d: -f1 || true)
  fi
  if [[ -n "${START:-}" ]]; then
    sed -n "$((START-30)),$((START+170))p" "$FILE2" | head -n 220
  fi

  echo "---- Uint8Array/arrayBuffer handling in safeFetch ----"
  rg -n "Uint8Array|arrayBuffer|res\\.arrayBuffer|application/octet-stream|json\\(|text\\(" "$FILE2" || true
fi

Repository: macro-inc/macro

Length of output: 8975


Fix dssFetch typing to allow Uint8Array responses.

fetchCachedSnapshot calls dssFetch<Uint8Array>(...), but dssFetch overloads currently constrain T to Record<string, any>—so this call is a TypeScript generic constraint mismatch. safeFetch/fetchWithToken already support Uint8Array (octet-stream → new Uint8Array(await response.arrayBuffer())), so dssFetch should mirror that.

Suggested fix
-export function dssFetch<T extends Record<string, any>>(
+export function dssFetch<T extends Record<string, unknown> | Uint8Array>(
   url: string,
   init?: SafeFetchInit
 ): Promise<Result<T, ResultError<FetchWithTokenErrorCode>[]>>;
-export function dssFetch<T extends Record<string, any> = never>(
+export function dssFetch<
+  T extends Record<string, unknown> | Uint8Array = never
+>(
   url: string,
   init?: SafeFetchInit
 ):
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@js/app/packages/service-clients/service-storage/client.ts` around lines 1111
- 1115, The dssFetch generic is currently constrained to Record<string, any>,
which prevents calls like fetchCachedSnapshot using dssFetch<Uint8Array>; update
dssFetch's type signature and overloads to allow binary responses (Uint8Array)
similar to safeFetch/fetchWithToken: detect response.content-type octet-stream
and return new Uint8Array(await response.arrayBuffer()) when T = Uint8Array, or
otherwise keep the existing JSON/Record behavior; adjust dssFetch's generic
constraint to accept Uint8Array | Record<string, any> (or make it unconstrained)
and mirror the same runtime branch used in safeFetch/fetchWithToken so
fetchCachedSnapshot can call dssFetch<Uint8Array>(...) without a type error.
rust/cloud-storage/documents/src/domain/permission_token.rs (1)

18-21: ⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "encode_permission_token|duration_since\\(UNIX_EPOCH\\)|\\.unwrap\\(" rust/cloud-storage/documents/src/domain/permission_token.rs
rg -n "encode_permission_token\\(" rust/cloud-storage/documents/src

Repository: macro-inc/macro

Length of output: 365


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the token encoder implementation and its error types
nl -ba rust/cloud-storage/documents/src/domain/permission_token.rs | sed -n '1,120p'

# Inspect the call site(s) inside the documents service
nl -ba rust/cloud-storage/documents/src/inbound/axum_router/create_markdown.rs | sed -n '40,120p'

# Confirm whether any other `unwrap()` exists in the token generation path
rg -n "encode_permission_token|duration_since\\(UNIX_EPOCH\\)|\\.unwrap\\(" rust/cloud-storage/documents/src/domain/permission_token.rs rust/cloud-storage/documents/src/inbound/axum_router/create_markdown.rs

Repository: macro-inc/macro

Length of output: 102


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="rust/cloud-storage/documents/src/domain/permission_token.rs"
awk '{printf "%6d: %s\n", NR, $0}' "$file" | sed -n '1,160p'

caller="rust/cloud-storage/documents/src/inbound/axum_router/create_markdown.rs"
awk '{printf "%6d: %s\n", NR, $0}' "$caller" | sed -n '1,160p'

# Find all unwraps in the token path area (encoder + its call sites)
rg -n "encode_permission_token|duration_since\\(UNIX_EPOCH\\)|\\.unwrap\\(" rust/cloud-storage/documents/src/domain/permission_token.rs rust/cloud-storage/documents/src/inbound/axum_router/create_markdown.rs

Repository: macro-inc/macro

Length of output: 5834


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "jsonwebtoken::errors::ErrorKind|ErrorKind::InvalidToken|InvalidToken" rust/cloud-storage/documents rust/cloud-storage -S
rg -n "encode_permission_token\\(" rust/cloud-storage/documents/src -S

Repository: macro-inc/macro

Length of output: 1065


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect existing error mapping patterns in JWT decode/validation
sed -n '150,270p' rust/cloud-storage/macro_auth/src/middleware/decode_jwt.rs

sed -n '40,120p' rust/cloud-storage/document_storage_service/src/api/documents/permissions_token/validate_permissions_token.rs

sed -n '70,130p' rust/cloud-storage/gmail_client/src/auth.rs

# Also inspect current permission_token.rs again for context around return type
sed -n '1,120p' rust/cloud-storage/documents/src/domain/permission_token.rs

Repository: macro-inc/macro

Length of output: 8990


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "ErrorKind::InvalidToken|InvalidToken" rust/cloud-storage -S

Repository: macro-inc/macro

Length of output: 41


🌐 Web query:

jsonwebtoken crate jsonwebtoken::errors::ErrorKind variants InvalidToken construction from ErrorKind Error::from

💡 Result:

In the jsonwebtoken crate, ErrorKind is an enum that defines the specific types of errors encountered during JWT operations [1][2]. The InvalidToken variant is explicitly defined as a member of this enum to represent cases where a token does not have a valid JWT structure [1][3][4]. Regarding the relationship between ErrorKind and the Error struct: 1. Construction from ErrorKind: The Error struct implements the From trait for ErrorKind [3][5]. This means you can create an Error instance directly from an ErrorKind variant using the standard From::from or the into method (e.g., Error::from(ErrorKind::InvalidToken) or ErrorKind::InvalidToken.into) [5]. 2. Alternative Constructor: The crate also provides a public function new_error(kind: ErrorKind) -> Error, which is primarily intended for use in custom cryptographic providers, though standard conversion via From is the idiomatic approach [3][2]. 3. Accessing the Kind: If you have an instance of Error, you can retrieve its underlying ErrorKind using the kind method (which returns &ErrorKind) or the into_kind method (which consumes the error and returns the ErrorKind) [5]. Because ErrorKind is marked with #[non_exhaustive], you should avoid exhaustive matching on it in your code to ensure compatibility with future versions of the crate [3][6][4].

Citations:


🌐 Web query:

jsonwebtoken errors ErrorKind variants list

💡 Result:

jsonwebtoken::errors::ErrorKind variants (from the latest docs for the crate): 1) InvalidToken 2) InvalidSignature 3) InvalidEcdsaKey 4) InvalidEddsaKey 5) InvalidRsaKey(String) 6) RsaFailedSigning 7) Signing(String) 8) InvalidAlgorithmName 9) InvalidKeyFormat 10) MissingRequiredClaim(String) 11) InvalidClaimFormat(String) 12) ExpiredSignature 13) InvalidIssuer 14) InvalidAudience 15) InvalidSubject 16) ImmatureSignature 17) InvalidAlgorithm 18) MissingAlgorithm 19) Base64(DecodeError) 20) Json(Arc) 21) Utf8(FromUtf8Error) 22) Provider(String) Note: ErrorKind is marked #[non_exhaustive], so additional variants may be added in future releases. [1][2]

Citations:


Remove panic in encode_permission_token clock calculation

duration_since(UNIX_EPOCH).unwrap() can panic if the system clock is earlier than the Unix epoch, causing the permission-token auth path to crash instead of returning an error.

Suggested fix
     let now = SystemTime::now()
         .duration_since(UNIX_EPOCH)
-        .unwrap()
+        .map_err(|_| {
+            jsonwebtoken::errors::Error::from(jsonwebtoken::errors::ErrorKind::InvalidToken)
+        })?
         .as_secs() as usize;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map_err(|_| {
            jsonwebtoken::errors::Error::from(jsonwebtoken::errors::ErrorKind::InvalidToken)
        })?
        .as_secs() as usize;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cloud-storage/documents/src/domain/permission_token.rs` around lines 18
- 21, The code in encode_permission_token uses
duration_since(UNIX_EPOCH).unwrap(), which can panic if system time is before
the epoch; replace the unwrap with proper error handling and propagate a Result
instead of panicking. Change the calculation of now to use
duration_since(UNIX_EPOCH).map_err(|e| /* map to your function's error type */)?
and then .as_secs() as usize (or match on the Result and return Err on failure)
so encode_permission_token returns an error instead of crashing; update the
function signature to return Result if it does not already and map the
SystemTimeError to the module's error type.
rust/cloud-storage/documents/src/outbound/s3_upload_url.rs (1)

121-123: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the local-env no-op write path for snapshots.

Line 121 short-circuits upload_snapshot in local mode, so callers get success while bytes are never stored. That makes local snapshot retrieval permanently miss and masks integration behavior.

Proposed fix
     #[tracing::instrument(skip(self, bytes), err)]
     async fn upload_snapshot(&self, document_id: &str, bytes: Vec<u8>) -> anyhow::Result<()> {
-        if macro_aws_config::is_local_aws() {
-            return Ok(());
-        }
-
         let key = snapshot_key(document_id);
         self.client
             .put_object()
             .bucket(&self.document_storage_bucket)
             .key(&key)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

     #[tracing::instrument(skip(self, bytes), err)]
     async fn upload_snapshot(&self, document_id: &str, bytes: Vec<u8>) -> anyhow::Result<()> {
         let key = snapshot_key(document_id);
         self.client
             .put_object()
             .bucket(&self.document_storage_bucket)
             .key(&key)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cloud-storage/documents/src/outbound/s3_upload_url.rs` around lines 121
- 123, The early return in upload_snapshot (the is_local_aws() guard in
s3_upload_url.rs) causes local runs to report success without storing bytes;
remove that short-circuit and implement a local write path so snapshots are
actually persisted in non-AWS mode. Concretely, modify the upload_snapshot
function to no longer return Ok(()) when macro_aws_config::is_local_aws() is
true; instead branch to the same local storage implementation you use for tests
or write the snapshot bytes to the configured local storage location (or call
the existing local_store/local_write helper if one exists) so callers receive a
true write result and snapshot retrieval works in local mode. Ensure the change
touches the macro_aws_config::is_local_aws() check and the upload_snapshot
function logic only.
rust/cloud-storage/github/src/domain/service/sync/test.rs (1)

261-267: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid unimplemented!() in shared test stubs for new trait methods.

These methods now satisfy DocumentService, but leaving them as unimplemented!() can panic if any test path starts touching snapshot behavior. Return inert defaults to keep failures signal-based instead of panic-based.

Proposed fix
     async fn get_snapshot(&self, _document_id: &str) -> anyhow::Result<Option<Vec<u8>>> {
-        unimplemented!()
+        Ok(None)
     }

     async fn upload_snapshot(&self, _document_id: &str, _bytes: Vec<u8>) -> anyhow::Result<()> {
-        unimplemented!()
+        Ok(())
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    async fn get_snapshot(&self, _document_id: &str) -> anyhow::Result<Option<Vec<u8>>> {
        Ok(None)
    }

    async fn upload_snapshot(&self, _document_id: &str, _bytes: Vec<u8>) -> anyhow::Result<()> {
        Ok(())
    }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cloud-storage/github/src/domain/service/sync/test.rs` around lines 261 -
267, The test stub implementations for get_snapshot and upload_snapshot
currently call unimplemented!() which can cause panics; change get_snapshot in
the test stub to return an inert Ok(None) and change upload_snapshot to return
Ok(()) so the trait methods are satisfied without panicking (update the async fn
get_snapshot and async fn upload_snapshot implementations in the test
DocumentService stub).
rust/sync-service/wrangler.toml (2)

35-36: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Hardcoded auth key "local" is inherited by all environments.

The DSS_INTERNAL_AUTH_KEY = "local" is set in the default [vars] section and is inherited by playground, dev, and prod environments (lines 59, 97, 135) since they don't override it. Using a hardcoded "local" value in deployed environments creates a security vulnerability.

Each environment should use a proper secret value retrieved from Wrangler secrets or environment-specific configuration.

🔐 Recommended fix to use environment-specific secrets

For the playground environment, add to [env.playground.vars]:

 [env.playground.vars]
 INTERNAL_API_SECRET_KEY = "SYNC_SERVICE_KEY_PLAYGROUND"
 DOCUMENT_PERMISSIONS_SECRET = "local"
 ENVIRONMENT = "development"
 SPS_URL = "https://search-processing-dev.macro.com"
 DSS_URL = "https://document-storage-dev.macro.com"
+DSS_INTERNAL_AUTH_KEY = "DSS_INTERNAL_KEY_PLAYGROUND"

Similarly for env.dev.vars:

 [env.dev.vars]
 INTERNAL_API_SECRET_KEY = "SYNC_SERVICE_KEY_DEV"
 ENVIRONMENT = "development"
 SPS_URL = "https://search-processing-dev.macro.com"
 DSS_URL = "https://document-storage-dev.macro.com"
+DSS_INTERNAL_AUTH_KEY = "DSS_INTERNAL_KEY_DEV"

And for env.prod.vars:

 [env.prod.vars]
 INTERNAL_API_SECRET_KEY = "SYNC_SERVICE_KEY_PROD"
 ENVIRONMENT = "production"
 SPS_URL = "https://search-processing.macro.com"
 DSS_URL = "https://document-storage.macro.com"
+DSS_INTERNAL_AUTH_KEY = "DSS_INTERNAL_KEY_PROD"

Ensure these keys match the corresponding DOCUMENT_STORAGE_SERVICE_AUTH_KEY configured on the DSS side for each environment.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/sync-service/wrangler.toml` around lines 35 - 36, The
DSS_INTERNAL_AUTH_KEY is hardcoded to "local" in the default [vars] and is
inherited by all envs; remove the hardcoded secret from the default [vars] and
instead add environment-specific overrides (e.g., in [env.playground.vars],
[env.dev.vars], [env.prod.vars]) to read the secret from Wrangler/GCP/AWS
secrets or from environment variables, ensuring the key names match the
DOCUMENT_STORAGE_SERVICE_AUTH_KEY used by DSS; update DSS_INTERNAL_AUTH_KEY in
each env block to reference the appropriate secret rather than a literal
"local".

59-59: ⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if DSS_INTERNAL_AUTH_KEY is configured in deployed environments

echo "=== Sync Service Wrangler Config ==="
echo "Playground DSS_INTERNAL_AUTH_KEY:"
grep -A 10 "\[env.playground.vars\]" rust/sync-service/wrangler.toml | grep "DSS_INTERNAL_AUTH_KEY" || echo "  (inherits from default: 'local')"

echo -e "\nDev DSS_INTERNAL_AUTH_KEY:"
grep -A 10 "\[env.dev.vars\]" rust/sync-service/wrangler.toml | grep "DSS_INTERNAL_AUTH_KEY" || echo "  (inherits from default: 'local')"

echo -e "\nProd DSS_INTERNAL_AUTH_KEY:"
grep -A 10 "\[env.prod.vars\]" rust/sync-service/wrangler.toml | grep "DSS_INTERNAL_AUTH_KEY" || echo "  (inherits from default: 'local')"

Repository: macro-inc/macro

Length of output: 292


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="rust/sync-service/wrangler.toml"

echo "---- Occurrences of DSS_INTERNAL_AUTH_KEY ----"
rg -n "DSS_INTERNAL_AUTH_KEY" "$FILE" || true

echo
echo "---- Occurrences of DOCUMENT_STORAGE_SERVICE_AUTH_KEY ----"
rg -n "DOCUMENT_STORAGE_SERVICE_AUTH_KEY" "$FILE" || true

echo
echo "---- Context around DSS_INTERNAL_AUTH_KEY ----"
rg -n --context 3 "DSS_INTERNAL_AUTH_KEY" "$FILE" || true

Repository: macro-inc/macro

Length of output: 484


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="rust/sync-service/wrangler.toml"

echo "---- Occurrences of DSS_URL in wrangler.toml (with context) ----"
rg -n --context 2 "DSS_URL" "$FILE" || true

echo
echo "---- Occurrences of DSS_INTERNAL_AUTH_KEY in repository ----"
rg -n "DSS_INTERNAL_AUTH_KEY" . || true

echo
echo "---- Occurrences of DOCUMENT_STORAGE_SERVICE_AUTH_KEY in repository ----"
rg -n "DOCUMENT_STORAGE_SERVICE_AUTH_KEY" . || true

Repository: macro-inc/macro

Length of output: 2398


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- wrangler.toml: lines 1-170 ----"
nl -ba rust/sync-service/wrangler.toml | sed -n '1,170p'

echo
echo "---- wrangler.docker.toml: lines 1-120 ----"
nl -ba rust/sync-service/wrangler.docker.toml | sed -n '1,120p'

echo
echo "---- Occurrences of DSS_INTERNAL_AUTH_KEY in infra ----"
rg -n "DSS_INTERNAL_AUTH_KEY" infra || true

echo
echo "---- Occurrences of DSS_INTERNAL_AUTH_KEY in rust sync-service infra/deploy scripts ----"
rg -n "DSS_INTERNAL_AUTH_KEY" rust/infra rust sync-service infra/stacks || true

Repository: macro-inc/macro

Length of output: 157


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- wrangler.toml: lines 1-170 ----"
cat -n rust/sync-service/wrangler.toml | sed -n '1,170p'

echo
echo "---- wrangler.docker.toml: lines 1-120 ----"
cat -n rust/sync-service/wrangler.docker.toml | sed -n '1,120p'

echo
echo "---- Occurrences of DSS_INTERNAL_AUTH_KEY in infra ----"
rg -n "DSS_INTERNAL_AUTH_KEY" infra rust/infra rust/sync-service  || true

Repository: macro-inc/macro

Length of output: 9871


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- rust/sync-service/src/dss_internal.rs (0-200) ----"
sed -n '1,200p' rust/sync-service/src/dss_internal.rs

echo
echo "---- rust/cloud-storage/search_upload_handler/src/main.rs (0-200) ----"
sed -n '1,220p' rust/cloud-storage/search_upload_handler/src/main.rs

Repository: macro-inc/macro

Length of output: 3619


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- Header constant definition locations ----"
rg -n "MACRO_DOCUMENT_STORAGE_SERVICE_AUTH_HEADER_KEY" rust || true

echo
echo "---- Search for internal/documents snapshot handler ----"
rg -n "/internal/documents" rust || true

echo
echo "---- Where DOCUMENT_STORAGE_SERVICE_AUTH_KEY is used (besides search_upload_handler) ----"
rg -n "DOCUMENT_STORAGE_SERVICE_AUTH_KEY" rust || true

Repository: macro-inc/macro

Length of output: 4147


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- document_storage_service header constant (api/mod.rs) ----"
sed -n '1,120p' rust/cloud-storage/document_storage_service/src/api/mod.rs

echo
echo "---- internal_access middleware (0-220) ----"
sed -n '1,220p' rust/cloud-storage/document_storage_service/src/api/middleware/internal_access.rs

Repository: macro-inc/macro

Length of output: 7006


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- DocumentStorageServiceAuthKey definitions ----"
rg -n "DocumentStorageServiceAuthKey" rust/cloud-storage/document_storage_service/src || true

echo
echo "---- DOCUMENT_STORAGE_SERVICE_AUTH_KEY in document_storage_service ----"
rg -n "DOCUMENT_STORAGE_SERVICE_AUTH_KEY" rust/cloud-storage/document_storage_service/src || true

Repository: macro-inc/macro

Length of output: 1153


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- document_storage_service main.rs around dss_auth_key (180-260) ----"
sed -n '180,260p' rust/cloud-storage/document_storage_service/src/main.rs

echo
echo "---- document_storage_service api/context.rs around DocumentStorageServiceAuthKey (300-420) ----"
sed -n '300,420p' rust/cloud-storage/document_storage_service/src/api/context.rs

Repository: macro-inc/macro

Length of output: 5729


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- Find env_var! macro definition ----"
rg -n "macro_rules!\s*env_var|env_var!\s*\{" -S rust | head -n 50

echo
echo "---- Find DocumentStorageServiceAuthKey::new implementation/usages ----"
rg -n "DocumentStorageServiceAuthKey::new" -S rust/cloud-storage/document_storage_service/src rust | true

echo
echo "---- Search for document_storage_service auth key env var names near DocumentStorageServiceAuthKey ----"
rg -n "DOCUMENT_STORAGE_SERVICE.*AUTH|DSS.*AUTH" rust/cloud-storage/document_storage_service/src || true

Repository: macro-inc/macro

Length of output: 5439


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- DocumentStorageServiceAuthKey env_var definition (api/context.rs) ----"
sed -n '320,380p' rust/cloud-storage/document_storage_service/src/api/context.rs

echo
echo "---- macro_env_var::env_var macro implementation (macro_env_var/src/lib.rs) ----"
sed -n '1,220p' rust/cloud-storage/macro_env_var/src/lib.rs

echo
echo "---- Search for DocumentStorageServiceAuthKey in macro_env_var/lib.rs ----"
rg -n "DocumentStorageServiceAuthKey" rust/cloud-storage/macro_env_var/src/lib.rs || true

echo
echo "---- Search in document_storage_service/src/config.rs for auth key ----"
rg -n "auth.*key|AuthKey|SERVICE_AUTH" rust/cloud-storage/document_storage_service/src/config.rs || true

Repository: macro-inc/macro

Length of output: 9857


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- infra references to DOCUMENT_STORAGE_SERVICE_AUTH_KEY ----"
rg -n "DOCUMENT_STORAGE_SERVICE_AUTH_KEY" infra rust | head -n 100 || true

echo
echo "---- infra references to DSS_INTERNAL_AUTH_KEY ----"
rg -n "DSS_INTERNAL_AUTH_KEY" infra rust | head -n 100 || true

Repository: macro-inc/macro

Length of output: 1580


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- Search infra for sync-service stack/config ----"
rg -n "sync-service|sync_service2|wrangler.toml" infra/stacks infra/packages -S || true

echo
echo "---- Inspect infra/stacks/cloud-storage-service/index.ts around DOCUMENT_STORAGE_SERVICE_AUTH_KEY ----"
sed -n '1,120p' infra/stacks/cloud-storage-service/index.ts
echo "..."
sed -n '480,560p' infra/stacks/cloud-storage-service/index.ts

echo
echo "---- Inspect infra/stacks/search-upload/index.ts around DOCUMENT_STORAGE_SERVICE_AUTH_KEY ----"
sed -n '1,120p' infra/stacks/search-upload/index.ts
echo "..."
sed -n '1,80p' infra/stacks/search-upload/search-upload-lambda.ts

Repository: macro-inc/macro

Length of output: 12205


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- Occurrences of DSS_INTERNAL_AUTH_KEY across repo ----"
rg -n "DSS_INTERNAL_AUTH_KEY" . || true

echo
echo "---- Search for sync-service deployments in infra/stacks (quick scan) ----"
rg -n "sync-service|sync_service" infra/stacks infra/packages/shared/src/service_urls.ts infra/packages/shared/src/ai_tools.ts || true

Repository: macro-inc/macro

Length of output: 2030


Ensure sync-service DSS_INTERNAL_AUTH_KEY matches document-storage’s DOCUMENT_STORAGE_SERVICE_AUTH_KEY (internal snapshot auth).

  • sync-service sends x-document-storage-service-auth-key using DSS_INTERNAL_AUTH_KEY (rust/sync-service/src/dss_internal.rs).
  • document-storage rejects internal requests unless that header value equals DocumentStorageServiceAuthKey, which is sourced from DOCUMENT_STORAGE_SERVICE_AUTH_KEY (rust/cloud-storage/document_storage_service/src/api/middleware/internal_access.rs + DocumentStorageServiceAuthKey env_var mapping).
  • rust/sync-service/wrangler.toml sets DSS_INTERNAL_AUTH_KEY = "local" only in global [vars]; env.playground, env.dev, and env.prod override DSS_URL but do not override DSS_INTERNAL_AUTH_KEY, so all envs use "local".
  • infra injects DOCUMENT_STORAGE_SERVICE_AUTH_KEY into document-storage-service from secrets manager per stack (infra/stacks/cloud-storage-service/index.ts), so it will likely not be "local".

Set DSS_INTERNAL_AUTH_KEY per environment (or via deployment wiring) to the same secret value as document_storage_service_auth_key used for DOCUMENT_STORAGE_SERVICE_AUTH_KEY in document-storage-service.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/sync-service/wrangler.toml` at line 59, The DSS_INTERNAL_AUTH_KEY is set
to "local" globally in wrangler.toml but must match document-storage’s
DOCUMENT_STORAGE_SERVICE_AUTH_KEY so internal requests (sent with header
x-document-storage-service-auth-key in rust/sync-service/src/dss_internal.rs)
are accepted; update wrangler.toml to provide DSS_INTERNAL_AUTH_KEY per
environment (env.playground, env.dev, env.prod) or change your deployment wiring
to inject the same secret used for document_storage_service_auth_key (the secret
mapped to DOCUMENT_STORAGE_SERVICE_AUTH_KEY in document_storage_service's
internal_access.rs) so the values match across services.

Comment on lines +1 to +34
//! Helpers for issuing document permission tokens (signed JWTs) used by the
//! sync service to authorize document access.

use model::document::DocumentPermissionsToken;
use models_permissions::share_permission::access_level::AccessLevel;
use std::time::{SystemTime, UNIX_EPOCH};

/// Token lifetime in seconds (1 hour).
const TOKEN_TTL_SECS: usize = 3600;

/// Sign a document permission token for the given user and document.
pub fn encode_permission_token(
user_id: Option<String>,
document_id: String,
access_level: AccessLevel,
jwt_secret: &str,
) -> Result<String, jsonwebtoken::errors::Error> {
let now = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_secs() as usize;

jsonwebtoken::encode(
&jsonwebtoken::Header::new(jsonwebtoken::Algorithm::HS256),
&DocumentPermissionsToken {
user_id,
document_id,
access_level,
exp: now + TOKEN_TTL_SECS,
iss: "document_storage_service".to_string(),
},
&jsonwebtoken::EncodingKey::from_secret(jwt_secret.as_bytes()),
)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic must exist somewhere for the existing document permissions call. Either use the existing method or pull them out into a util or something such that they can be shared.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was inline previously so I pulled it out here

Comment on lines +43 to +96
let token: string;
let documentMetadata: DocumentMetadata;
let userAccessLevel: AccessLevel;

if (pre) {
token = pre.token;
documentMetadata = pre.documentMetadata;
userAccessLevel = pre.userAccessLevel;
} else {
const [maybeDocument, maybeLocation, maybeToken] = await Promise.all([
loadResult(storageServiceClient.getDocumentMetadata({ documentId })),
loadResult(storageServiceClient.getDocumentLocation({ documentId })),
storageServiceClient.permissionsTokens.createPermissionToken({
document_id: documentId,
}),
]);

const { token } = maybeToken.value;
if (maybeToken.isErr()) return LoadErrors.UNAUTHORIZED;
if (maybeDocument.isErr()) return err(maybeDocument.error);
if (maybeLocation.isErr()) return err(maybeLocation.error);

if (maybeDocument.isErr()) return err(maybeDocument.error);
if (maybeLocation.isErr()) return err(maybeLocation.error);
token = maybeToken.value.token;
const documentResult = maybeDocument.value;
documentMetadata = documentResult.documentMetadata;
userAccessLevel = documentResult.userAccessLevel;

const documentResult = maybeDocument.value;
const { documentMetadata, userAccessLevel } = documentResult;
let { data: location } = maybeLocation.value;
let { data: location } = maybeLocation.value;
if (
location.type === 'presignedUrl' &&
location.content.state === 'pending'
) {
location = await waitForDocumentSyncServiceReady({
documentId,
}).catch((error) => {
console.error(
'Failed waiting for markdown sync-service location',
error
);
return location;
});
}

if (
location.type === 'presignedUrl' &&
location.content.state === 'pending'
) {
location = await waitForDocumentSyncServiceReady({
documentId,
}).catch((error) => {
// Markdown initialization and lifecycle persistence are backend-owned.
// If a markdown document still resolves to object storage here, opening
// it would require a backend repair/backfill path rather than a frontend
// sync-service mutation that leaves DB content metadata inconsistent.
if (location.type !== 'syncServiceContent') {
console.error(
'Failed waiting for markdown sync-service location',
error
'Markdown document is not available in sync-service',
documentId,
location.content
);
return location;
});
}

// Markdown initialization and lifecycle persistence are backend-owned.
// If a markdown document still resolves to object storage here, opening
// it would require a backend repair/backfill path rather than a frontend
// sync-service mutation that leaves DB content metadata inconsistent.
if (location.type !== 'syncServiceContent') {
console.error(
'Markdown document is not available in sync-service',
documentId,
location.content
);
return LoadErrors.INVALID;
return LoadErrors.INVALID;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can pull all of this out into a tanstack query, and expose a method to seed the cache. This way you don't need the hacky cache, you can let tanstack manage caching the result for this.

import { storageServiceClient } from '@service-storage/client';
import type { AccessLevel } from '@service-storage/generated/schemas/accessLevel';
import type { DocumentMetadata } from '@service-storage/generated/schemas/documentMetadata';
import { queryClient } from '../client';
import { documentLoadKeys } from './keys';

export type DocumentLoadBundle = {
  documentMetadata: DocumentMetadata;
  userAccessLevel: AccessLevel;
  token: string;
};

const STALE_TIME = 60 * 1000;
const GC_TIME = 10 * 60 * 1000;

async function fetchDocumentLoadBundle(
  documentId: string
): Promise<DocumentLoadBundle> {
  const [maybeDocument, maybeToken] = await Promise.all([
    storageServiceClient.getDocumentMetadata({ documentId }),
    storageServiceClient.permissionsTokens.createPermissionToken({
      document_id: documentId,
    }),
  ]);

  if (maybeToken.isErr()) throw new Error('UNAUTHORIZED');
  if (maybeDocument.isErr()) throw new Error('Failed to fetch document metadata');

  return {
    documentMetadata: maybeDocument.value.documentMetadata,
    userAccessLevel: maybeDocument.value.userAccessLevel,
    token: maybeToken.value.token,
  };
}

export function documentLoadQueryOptions(documentId: string) {
  return {
    queryKey: documentLoadKeys.bundle(documentId).queryKey,
    queryFn: () => fetchDocumentLoadBundle(documentId),
    staleTime: STALE_TIME,
    gcTime: GC_TIME,
  };
}

/** Seed the bundle from a create response so the loader serves it from cache. */
export function seedDocumentLoadBundle(
  documentId: string,
  bundle: DocumentLoadBundle
) {
  queryClient.setQueryData(documentLoadKeys.bundle(documentId).queryKey, bundle);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only caveat, is you might want to prevent caching when just opening a document. Perhaps by immediately invalidating it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants