feat: add embedded ms3t S3 listener backed by Forge #34

Draft
frrist wants to merge 3 commits into main from frrist/ms3t

Conversation

Member

@frrist frrist commented Apr 29, 2026

To review, start here: https://github.com/storacha/sprue/pull/34/changes#diff-c1e3f8006e0cc6f137969167b1d125433db5f4941d0975f8a7b53abdef81f954

Adds an S3-compatible HTTP listener that runs inside sprue, gated by config.MS3T.Enabled. When enabled, sprue exposes a path-style S3 API on a separate port; PUT/GET/HEAD/DELETE/LIST translate into mutations on a Merkle Search Tree whose blocks ship to piri via sprue's existing piriclient/routing/indexerclient (no UCAN-over-HTTP loopback).

In ms3t.forge.no_cache mode (the smelt-deployed shape):

  • All block reads go through indexer queries + UCAN-authorized ranged retrieves on piri
  • Writes are synchronous to Forge — three round trips per S3 PUT
  • Local state is the registry SQLite (bucket → root CID) and a generated space keypair; ms3t is its own UCAN root authority

When ms3t.forge.enabled is false, ms3t falls back to a local-disk uploader, intended for development without Forge connectivity.

See pkg/ms3t/architectural.md for prototype-level design notes, the choice points, and open questions for the team.

Wired into the fx graph via internal/fx/ms3t.go; configuration lives under the new ms3t: block in config.example.yaml.

@Peeja Peeja self-requested a review April 29, 2026 14:02
Comment thread pkg/ms3t/architectural.md Outdated
Comment on lines +279 to +282
- **Why it's awkward**: `aws s3 sync` of many small files is slow.
An MST traversal during a PUT pays N network round trips for N
existing nodes on the path, even though those nodes are
deterministic.
Member

question: Can we pipeline these requests? Or rather, a) can we support pipelining, and b) do S3 clients typically support it? It wouldn't be slow to hold the PUT open until completion if the next PUT could start before the previous one closed.

Comment thread pkg/ms3t/architectural.md Outdated
Comment on lines +294 to +297
- **Why we picked this**: zero out-of-band provisioning. The first
time sprue starts with `forge.enabled`, ms3t writes a key and
uses it. No "go ask the delegator for a delegation, paste it
here."
Member

thought: I think the whole provisioning story is pretty undefined right now. Personally, I'd be comfortable with (and recommend) leaving that to a separate decision process (which I think is more or less the idea as written). I think we have room in the options discussed here for whatever that outcome is. But that investigation is going to bring up all sorts of questions of identity and authorization. If we bridge S3 auth to UCAN, what S3 auth are we even bridging? We have product questions here to resolve as much as technical ones.

I'm going to flag that we need that conversation as well, just to make sure that happens.

Member Author

Full agreement here, this was another quick and dirty decision targeting an MVP. I know @alanshaw has some ideas on placing bucket metadata (the MST) in its own space and such.

Comment thread pkg/ms3t/architectural.md Outdated
Comment on lines +308 to +310
Body chunks ride in the same CAR as the structural blocks. The
indexer maps inner CIDs to byte ranges within the outer CAR. One
data-CAR upload + one index-blob upload per PUT.
Member

question/thought: This is addressed by fil-one/RFC#2, correct? Specifically, under that proposal, this would still be two uploads per PUT, but the data upload would be a raw chunk of data, and the index CAR would contain the UnixFS metadata nodes over it. That's nearly the best of both worlds, although we still can't quite do direct passthrough. But we can do relaying by chunks with a fixed buffer size, which is nearly as good.

Comment thread pkg/ms3t/architectural.md Outdated
Comment on lines +330 to +335
- **Why it's awkward**: the operator running sprue + piri pays
bandwidth twice (client→sprue, sprue→piri) when conceptually
the bytes only need to move once. In a federated model where
piri storage is run by different operators, this becomes
structurally wrong (sprue's operator pays to deposit bytes onto
someone else's hardware).
Member

thought: There's another reason: trust. Under direct passthrough, we trust the Piri operator to hash and store the data correctly. Under the system here, that trust is placed in the facade only, and the use of Piri remains trustless. I think that's the correct alignment. The S3 facade, like the HTTP gateway, requires trust as it bridges from IPLD and UCAN to the outside world. So that layer should ideally be what the customer has to trust, and ideally everything behind it remains just as trustless as before.

Comment thread pkg/ms3t/architectural.md Outdated
Comment on lines +342 to +344
- **Why we picked this**: zero auth coordination — ms3t is sprue,
it has all sprue's identities and clients in-process. One binary
to ship, one config file.
Member

issue: I'm not a fan of this identity conflation. I think it makes sense for the moment to put them in the same process/deployment for convenience, but using the same identity smells wrong to me. But this will likely/hopefully be driven out by the full auth story.

Member

Although, reading the architecture closer, it looks like they're not conflated after all, so maybe I misunderstood what this choice was about?

Comment thread pkg/ms3t/architectural.md Outdated
Comment on lines +355 to +357
The current code assumes a single ms3t instance per bucket, via the
in-process `sync.Mutex` per-bucket lock. There is no cross-instance
coordination.
Member

question: What are multiple "instances"? Multiple processes? Does that mean multiple instances of Sprue as well?

Comment thread pkg/ms3t/architectural.md Outdated
type Body struct {
Size int64
ChunkSize int64
Chunks []cid.Cid
Member

thought: Under fil-one/RFC#2, I think we'll have a single root we can store here instead of individual chunks.

Member Author

Yup! Next steps here would be adopting fil-one/RFC#2 once it's settled on. This was just my quick and dirty "make this work for an MVP" version.
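
As a rough illustration of how the current `Body` layout can serve a ranged read, the sketch below maps a byte range to chunk indices. It assumes fixed-size chunking, where every chunk except possibly the last holds exactly `ChunkSize` bytes; the PR excerpt does not confirm that. `chunkRange` is a hypothetical helper, and `Chunks` uses `string` as a stand-in for `cid.Cid` to keep the example free of external imports.

```go
package main

import "fmt"

// Body mirrors the fields quoted in the review excerpt. Chunks uses string
// as a stand-in for cid.Cid so the sketch has no external dependencies.
type Body struct {
	Size      int64
	ChunkSize int64
	Chunks    []string
}

// chunkRange (hypothetical helper) returns the inclusive indices of the first
// and last chunk that cover bytes [off, off+length). It assumes every chunk
// except possibly the last holds exactly ChunkSize bytes.
func (b Body) chunkRange(off, length int64) (first, last int, err error) {
	if b.ChunkSize <= 0 {
		return 0, 0, fmt.Errorf("invalid chunk size %d", b.ChunkSize)
	}
	if off < 0 || length <= 0 || off+length > b.Size {
		return 0, 0, fmt.Errorf("range [%d, %d) outside body of size %d", off, off+length, b.Size)
	}
	first = int(off / b.ChunkSize)
	last = int((off + length - 1) / b.ChunkSize)
	if last >= len(b.Chunks) {
		return 0, 0, fmt.Errorf("chunk list too short: need index %d, have %d chunks", last, len(b.Chunks))
	}
	return first, last, nil
}

func main() {
	// A 10-byte body split into 4-byte chunks: sizes 4, 4, 2.
	b := Body{Size: 10, ChunkSize: 4, Chunks: []string{"c0", "c1", "c2"}}
	first, last, err := b.chunkRange(3, 5) // bytes 3..7
	fmt.Println(first, last, err)
}
```

With a single root in place of `Chunks`, as suggested above, this lookup would presumably move into traversal of the structure under that root rather than index arithmetic over a flat list.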

Development

Successfully merging this pull request may close these issues.

Bootstrap S3 facade: HTTP listener + MST + Forge integration

2 participants