RFC: New forge deployment architecture proposal#6
Conversation
| door to its Piri." Bytes never cross regions; only UCAN control plane | ||
| traffic does. |
Bytes never cross regions; only UCAN control plane traffic does.
🤩
| | **S3 facade + Guppy** | Provider, regional | Receives client bytes; data prep next to where bytes will land | | ||
| | **Piri** | Provider, regional | Storage | | ||
| | **MST content catalog** | Provider, regional | Per-bucket — single-writer-per-bucket falls out naturally | | ||
| | **Upload Service (sprue)** | Fil One, central | Sees every blob/add, index/add, upload/add — full control plane | |
Do we have any concerns about the central upload service becoming a single point of failure?
| | **MST content catalog** | Provider, regional | Per-bucket — single-writer-per-bucket falls out naturally | | ||
| | **Upload Service (sprue)** | Fil One, central | Sees every blob/add, index/add, upload/add — full control plane | | ||
| | **Indexing Service** | TBD: central at Fil One, OR regional with IPNI as the only global indexer | See "Where the Indexing Service lives" below | | ||
| | **KMS, Delegator, Etracker, Signing Svc** | Fil One, central | Same as #2 | |
KMS
Aurora flagged that it isn't feasible to store millions of DEKs in a KMS like HashiCorp Vault. Their solution is to store the MEK (per-tenant or per-bucket, I'm not sure which) in the KMS, and store DEKs locally, encrypted with the MEK.
I think that's compatible with what this RFC describes:
- The MEK is stored in FilOne's KMS
- DEKs are stored next to Piri/Guppy on the Provider side, encrypted with the MEK
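For reference, a minimal sketch of that envelope-encryption split, assuming AES-256-GCM for the wrapping layer. `wrapDEK`/`unwrapDEK` are illustrative names, not actual Piri or Guppy APIs:

```go
package envelope

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
)

// wrapDEK encrypts a data-encryption key (DEK) with a master key (MEK)
// obtained from FilOne's KMS. Only the wrapped DEK (plus nonce) is stored
// at the provider; plaintext DEKs never need to live in the central KMS.
// Illustrative only.
func wrapDEK(mek, dek []byte) (nonce, wrapped []byte, err error) {
	block, err := aes.NewCipher(mek) // MEK must be 16/24/32 bytes
	if err != nil {
		return nil, nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, nil, err
	}
	nonce = make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, nil, err
	}
	return nonce, gcm.Seal(nil, nonce, dek, nil), nil
}

// unwrapDEK recovers the plaintext DEK so Guppy/Piri can decrypt object
// bytes locally without a per-object KMS round trip.
func unwrapDEK(mek, nonce, wrapped []byte) ([]byte, error) {
	block, err := aes.NewCipher(mek)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return gcm.Open(nil, nonce, wrapped, nil)
}
```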
| 1. Client GET hits provider's Guppy. | ||
| 2. Guppy resolves S3 key → root CID via the local MST. | ||
| 3. Guppy queries local Piri directly for the blob (and byte ranges if | ||
| known from the local index). |
This list is missing the auth check step. Guppy(?) must verify that the requester is authorised to access the S3 object, using the AWS SigV4 scheme.
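For reference, the signature check itself is deterministic HMAC math over the canonical request, so Guppy could perform it locally once it holds the access key's secret. A rough sketch (canonical-request construction omitted; none of this is actual Guppy code):

```go
package sigv4

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
)

func hmacSHA256(key, data []byte) []byte {
	h := hmac.New(sha256.New, key)
	h.Write(data)
	return h.Sum(nil)
}

// signature derives the SigV4 signing key and signs the string-to-sign.
// date is YYYYMMDD; stringToSign is built from the canonical request per
// the AWS spec (elided here). Guppy would compare the result against the
// signature the client sent in the Authorization header.
func signature(secret, date, region, service, stringToSign string) string {
	kDate := hmacSHA256([]byte("AWS4"+secret), []byte(date))
	kRegion := hmacSHA256(kDate, []byte(region))
	kService := hmacSHA256(kRegion, []byte(service))
	kSigning := hmacSHA256(kService, []byte("aws4_request"))
	return hex.EncodeToString(hmacSHA256(kSigning, []byte(stringToSign)))
}

// verify compares the two signatures in constant time.
func verify(expected, presented string) bool {
	return hmac.Equal([]byte(expected), []byte(presented))
}
```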
| known from the local index). | ||
| 4. Streams back to client. | ||
| No Indexing Service hop, no cross-region RTT, no dependency on the |
no cross-region RTT
This assumes Guppy can verify SigV4 auth locally.
I think we will want to manage S3 Access Keys globally on the FilOne side, in which case the retrieval request will need one cross-region RTT to verify the client's authorisation.
| I'm leaning **(b)**: the cross-region consistency problem was the only thing | ||
| that had been stopping regional indexers, and proposal #3's bucket-affinity | ||
| model resolves it for free. Fil One can run a small Indexing-Service-style | ||
| cache for observability/dashboards, but it isn't on a hot path. Wants team | ||
| input before locking in. |
With my limited understanding, I prefer option (a).
- It's a simpler deployment. Less work for OCs, fewer things outside of FilOne's control that can bring the system down.
- I am sceptical about the real customer demand for the indexing service. If it's not something that most of our early adopters need, then let's take the easiest path to tick this checkbox and defer more complex options until we validate the demand.
- This seems like a two-way decision to me - if the central indexing service is no longer sufficient, we can decentralise later.
A few questions that can help us decide better:
- What's the cost of the incremental approach: start with a central indexing service (a) now, move to a per-provider indexer service (b) later?
  - The cost of building parts of (a) that will be thrown away when we move to (b).
  - The cost of upgrading existing OCs to add an indexing service to their deployments.
- How can we validate and quantify demand for the indexing service? Is it something we must ship as part of the initial Forge region launch?
| | Component | Runs at | Why there | | ||
| |---|---|---| | ||
| | **S3 facade + Guppy** | Provider, regional | Receives client bytes; data prep next to where bytes will land | |
One thing I am missing here: how can FilOne get visibility into the telemetry metrics of the Provider's S3 Facade?
We would like to get at least the following metrics:
- TTFB (GetObject method only)
- Response times (with the method name as a dimension)
- Total request count, 4xx/5xx error count (with the method name as a dimension)
- Egress (overall rate)
- Ingress (overall rate)
Right now, we are scraping Aurora's MinIO metrics exposed via the Prometheus endpoint. I think it should be feasible to do the same with Forge.
Most of this exists already. Piri and Guppy ship instrumented with OpenTelemetry — traces and metrics emit by default, with an opt-out config flag for providers who object on principle. At Storacha we ran an OTel collector and pointed it at Grafana for dashboards and alerts; same pattern fits here. Fil One stands up a collector, providers' services emit to it, and the metrics you listed (TTFB, response times by method, request/error counts, ingress/egress) fall out of the existing instrumentation without further work.
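To make that concrete, here is roughly what recording the metrics you listed looks like with the OTel Go SDK. The instrument names are placeholders, not what Piri/Guppy actually export:

```go
package facade

import (
	"context"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// recordGet records one GetObject request. Instruments would normally be
// created once at startup; they are inlined here to keep the sketch short.
func recordGet(ctx context.Context, ttfb time.Duration, status int, bytesOut int64) {
	meter := otel.Meter("s3-facade")

	requests, _ := meter.Int64Counter("s3_facade_requests_total")
	ttfbHist, _ := meter.Float64Histogram("s3_facade_get_ttfb_seconds")
	egress, _ := meter.Int64Counter("s3_facade_egress_bytes_total")

	attrs := metric.WithAttributes(
		attribute.String("method", "GetObject"),
		attribute.Int("status", status),
	)
	requests.Add(ctx, 1, attrs)
	ttfbHist.Record(ctx, ttfb.Seconds(), attrs)
	egress.Add(ctx, bytesOut)
}
```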
One wrinkle worth naming: a collector accepting metrics from anyone claiming to be a provider is not the trust model we want to ship. A small plugin that verifies each batch is signed by the reporting provider's known key, and drops the rest, would close that gap — same cryptographic identity we already use for UCAN, so no new key infrastructure. More complexity than just deploying the collector, but probably the right complexity if Fil One ends up billing or alarming off these numbers.
The ceiling, either way: even with signing, we are still trusting the numbers. Signing proves a provider sent them, not that they are true. A provider that wants to under-report 5xx counts or shave a few milliseconds off TTFB can do so, sign it cleanly, and we will accept it. Catching that needs an independent source — synthetic probes, client-side telemetry, the Upload Service's own view cross-checked against the facade's. Out of scope for the basic setup, but worth flagging: the collector is a trust-the-reporter system either way.
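For the wrinkle above, the plugin's core check is small. A sketch, assuming providers sign each exported batch with the ed25519 key behind their DID; the lookup and payload framing here are hypothetical:

```go
package metricsauth

import "crypto/ed25519"

// providerKeys maps a provider's DID to the ed25519 public key Fil One
// already knows from its UCAN delegations. (Hypothetical lookup; the real
// collector plugin would resolve this from the delegator.)
var providerKeys = map[string]ed25519.PublicKey{}

// acceptBatch returns true only if the metrics payload was signed by the
// provider it claims to come from; anything else is dropped before it
// reaches the metrics backend.
func acceptBatch(providerDID string, payload, sig []byte) bool {
	pub, ok := providerKeys[providerDID]
	if !ok {
		return false
	}
	return ed25519.Verify(pub, payload, sig)
}
```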
alanshaw left a comment
Overall I think this is a good organization of the services and it allows for some interesting optimizations that mitigate a lot of the problems we've been wrestling with. In my head I have (up until now) pushed the issue of arrangement down the line because the protocol largely allows for any organization of the parts. It's great to have a strong and reasoned proposal on the table.
| has a home Provider, and `web3.storage/blob/allocate` for that Space must go | ||
| to the home Provider's Piri. No load-balancing across Piris within a region |
One complication is that the Piri node(s) will have to be accessible from outside of the organization's network in order to receive invocations from the upload service. This may be a big ask in some cases.
| **1. The Guppy↔Piri local handoff** is unspecified, and unspecified things | ||
| are usually harder than they look. Three plausible options: **(a) local HTTP | ||
| import endpoint on Piri** (simple, fits Piri's existing model); |
What is the difference here vs the URL you receive and HTTP PUT to?
| **1. The Guppy↔Piri local handoff** is unspecified, and unspecified things | ||
| are usually harder than they look. Three plausible options: **(a) local HTTP | ||
| import endpoint on Piri** (simple, fits Piri's existing model); | ||
| **(b) shared volume / object backend** (clean if Piri's storage backend |
Can you clarify why this is shared?
Is the idea this: Guppy writes PutObject shards to the shared volume directly, then blob/allocate is sent and doesn't return a URL because Piri sees that it already has the blob?
These all sound like optimizations. I would build it using the existing blob protocol first to reduce scope and get something working, and then do an iteration to make it fast.
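For what it's worth, the plain path is small. A sketch of that baseline flow, with stub helpers standing in for the real client calls (none of these names are the actual w3up/Piri API):

```go
package blobflow

import (
	"bytes"
	"context"
	"errors"
	"net/http"
)

// allocation is what a blob/allocate invocation would come back with.
// An empty URL means Piri already has the bytes and nothing needs uploading.
type allocation struct {
	URL string
}

// Stub helpers: in reality these are UCAN invocations against the upload
// service and a wait on the blob/accept receipt.
func allocateBlob(ctx context.Context, digest []byte, size int64) (allocation, error) {
	return allocation{}, errors.New("stub: blob/allocate invocation")
}

func awaitAccept(ctx context.Context, digest []byte) error {
	return errors.New("stub: wait for blob/accept receipt")
}

// storeShard is the baseline Guppy to Piri path: allocate, PUT, accept.
func storeShard(ctx context.Context, digest, shard []byte) error {
	alloc, err := allocateBlob(ctx, digest, int64(len(shard)))
	if err != nil {
		return err
	}
	if alloc.URL != "" {
		req, err := http.NewRequestWithContext(ctx, http.MethodPut, alloc.URL, bytes.NewReader(shard))
		if err != nil {
			return err
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return err
		}
		resp.Body.Close()
	}
	return awaitAccept(ctx, digest)
}
```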
| 2. Guppy resolves S3 key → root CID via the local MST. | ||
| 3. Guppy queries local Piri directly for the blob (and byte ranges if | ||
| known from the local index). | ||
| 4. Streams back to client. |
We'd still want Guppy to generate retrieval invocations, with receipts generated and reported to the etracker... right? Not so much for payment but for reporting.
| 1. **Piri storage backend abstraction**: does it support a "Guppy writes, | ||
| Piri verifies and accepts" pattern cleanly today? (For Forrest/Paul.) | ||
| 2. **Cross-region UCAN RTT** between provider Guppy and central sprue — |
Thought: we could run federated upload services in multiple regions. Routing information for nodes in the network could differ at each upload service, to prefer routing to nodes within the same region and replication to nodes outside the region.
| If the team agrees, the immediate next steps are: | ||
| 1. A focused design doc on the Guppy↔Piri local handoff. |
I think this is the only part I'm unsure about. These are certainly valid optimizations, but they couple the S3 facade to a single Piri instance. I appreciate that is perhaps "the point", but I can't escape the feeling that we should first get the facade working such that it can upload to / retrieve from any Piri, and then layer in the single-instance binding mode and all the optimizations that come with it.
I've really been noodling with the icky feeling that we still haven't got the plan for how to deploy Forge on FilOne right. The first plan was to run a whole Guppy + Upload Service + indexing service ourselves and just have the providers run Piri. But that woulda died on egress cost and perf. Then we decided to move everything to the provider... which is where we sit now, but man does it feel broken... all the cool Forge advantages have been taken away... leading to @Peeja's take that Forge almost feels unnecessary. (I agree)
Today while talking to @frrist I had an insight that feels like it threads the needle: run Guppy and Piri on the provider, but have us keep running the upload service. It sounds unintuitive at first, but when you play it out it seems to solve all the vexing problems at once. This proposal is my attempt to lay out what that looks like and why it works, I think, dramatically better. That said, I have no idea if this is an insight worthy of my job title or just Claude's sycophancy getting me high on my own supply. So please, tell me.
Also: one reason I like this is all the things it sets us up to do easily in the future: Guppy clients that run directly on computers or in client on-prem infra, cross-region replication, global retrieval -- including a CDN product maybe.