From 5a496d291484e41dcb2ddf9e54d2010c4c00ded3 Mon Sep 17 00:00:00 2001
From: Alan Shaw
Date: Wed, 29 Apr 2026 12:33:38 +0100
Subject: [PATCH 1/2] docs: wip sharding strategy

---
 ...01-forge-s3-flat-file-sharding-strategy.md | 37 +++++++++++++++++++
 1 file changed, 37 insertions(+)
 create mode 100644 rfcs/001-forge-s3-flat-file-sharding-strategy.md

diff --git a/rfcs/001-forge-s3-flat-file-sharding-strategy.md b/rfcs/001-forge-s3-flat-file-sharding-strategy.md
new file mode 100644
index 0000000..e5bfee4
--- /dev/null
+++ b/rfcs/001-forge-s3-flat-file-sharding-strategy.md
@@ -0,0 +1,37 @@
+# RFC: Forge S3 Facade sharding strategy
+
+Status: Experimental
+
+## Authors
+
+- [Alan Shaw](https://github.com/alanshaw)
+
+## Motivation
+
+The Forge S3 Facade must support Multipart uploads via `UploadPart` as well as single part uploads via `PutObject`.
+
+This is a proposal that will allow Forge to support this with minimal changes to the network, leaning on ideas previously proposed by the team:
+
+* https://github.com/storacha/RFC/pull/65
+* https://github.com/storacha/RFC/pull/66
+
+## Design
+
+The Sharded DAG Index will gain a `blocks` property - a list of blocks "inlined" in the index. See [#66](https://github.com/storacha/RFC/pull/66).
+
+For a given upload (a `PutObject` or `UploadPart` request) the S3 facade will shard data at the current threshold (256MB) and keep track of the CIDs of the shards that have been created as well as the order.
+
+The order is important for retrieval, since we need to serve data from the shards in the order it appears in the object that is being uploaded.
+
+When the `PutObject` request ends or the multipart `CompleteMultipartUpload` request is received, a UnixFS File root node is created that links to ALL shards in order.
+
+A Sharded DAG Index is created, that contains entries for each shard. Each shard entry has a single slice entry which is the slice of the entire shard from byte 0 to (up to) 256MB.
+
+The order in which the shards should be served is encoded in the root UnixFS node, which is added to the Sharded DAG Index `blocks` property.
+
+To enable retrieval Guppy will need to be updated to consider the `blocks` property, and load the blocks into it's own cache. Retrieval can then proceed as usual.
+
+## References
+
+* https://github.com/storacha/RFC/pull/65
+* https://github.com/storacha/RFC/pull/66
From cf23c23ba62cd01a6e71a157745f058152ce2dc3 Mon Sep 17 00:00:00 2001
From: Alan Shaw
Date: Tue, 5 May 2026 13:54:18 +0100
Subject: [PATCH 2/2] refactor: prefix and blocks name

---
 ...y.md => 2026-04-forge-s3-flat-file-sharding-strategy.md} | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
 rename rfcs/{001-forge-s3-flat-file-sharding-strategy.md => 2026-04-forge-s3-flat-file-sharding-strategy.md} (81%)

diff --git a/rfcs/001-forge-s3-flat-file-sharding-strategy.md b/rfcs/2026-04-forge-s3-flat-file-sharding-strategy.md
similarity index 81%
rename from rfcs/001-forge-s3-flat-file-sharding-strategy.md
rename to rfcs/2026-04-forge-s3-flat-file-sharding-strategy.md
index e5bfee4..635ccc4 100644
--- a/rfcs/001-forge-s3-flat-file-sharding-strategy.md
+++ b/rfcs/2026-04-forge-s3-flat-file-sharding-strategy.md
@@ -17,7 +17,7 @@ This is a proposal that will allow Forge to support this with minimal changes to
 
 ## Design
 
-The Sharded DAG Index will gain a `blocks` property - a list of blocks "inlined" in the index. See [#66](https://github.com/storacha/RFC/pull/66).
+The Sharded DAG Index will gain a `nodes` property - a list of UnixFS nodes "inlined" in the index (renamed from `blocks` as proposed in [#66](https://github.com/storacha/RFC/pull/66)).
 
 For a given upload (a `PutObject` or `UploadPart` request) the S3 facade will shard data at the current threshold (256MB) and keep track of the CIDs of the shards that have been created as well as the order.
 
@@ -27,9 +27,9 @@ When the `PutObject` request ends or the multipart `CompleteMultipartUpload` req
 
 A Sharded DAG Index is created, that contains entries for each shard. Each shard entry has a single slice entry which is the slice of the entire shard from byte 0 to (up to) 256MB.
 
-The order in which the shards should be served is encoded in the root UnixFS node, which is added to the Sharded DAG Index `blocks` property.
+The order in which the shards should be served is encoded in the root UnixFS node, which is added to the Sharded DAG Index `nodes` property.
 
-To enable retrieval Guppy will need to be updated to consider the `blocks` property, and load the blocks into it's own cache. Retrieval can then proceed as usual.
+To enable retrieval Guppy will need to be updated to consider the `nodes` property, and load the node(s) into its own cache. Retrieval can then proceed as usual.
 
 ## References
 
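
Reviewer note: the flow described in the Design section of the patch above can be sketched roughly as follows. This is a minimal illustration, not the Forge or Guppy implementation. Everything named here is an assumption made for the example: `fake_cid` (a plain SHA-256 hex digest standing in for a real CID), `shard_request`, `complete_upload`, the dictionary layout of the index and root node, and reading "256MB" as 256 MiB. Only the `nodes` property name, the one-slice-per-shard rule, and the ordering requirement come from the RFC text.

```python
import hashlib
import json

# Sharding threshold from the RFC; interpreting "256MB" as MiB is an assumption.
SHARD_SIZE = 256 * 1024 * 1024


def fake_cid(data: bytes) -> str:
    """Hypothetical stand-in for a CID: a hex SHA-256 digest, not a real multihash."""
    return hashlib.sha256(data).hexdigest()


def shard_request(chunks, shard_size=SHARD_SIZE):
    """Shard the body of one PutObject/UploadPart request.

    Returns a list of (cid, length) tuples in the order the bytes arrived.
    """
    shards = []
    buf = bytearray()
    for chunk in chunks:
        buf.extend(chunk)
        # Cut a shard every time the buffer reaches the threshold.
        while len(buf) >= shard_size:
            shards.append((fake_cid(bytes(buf[:shard_size])), shard_size))
            del buf[:shard_size]
    if buf:
        # The final shard of a request may be shorter than the threshold.
        shards.append((fake_cid(bytes(buf)), len(buf)))
    return shards


def complete_upload(parts):
    """Build the root node and index once the upload completes.

    `parts` maps part number -> shards from shard_request(); a PutObject
    is modelled as a single part.
    """
    # Object order is part-number order, then shard order within each part.
    ordered = [shard for number in sorted(parts) for shard in parts[number]]
    # Illustrative root "UnixFS file" node linking to ALL shards in order.
    root = {"type": "file", "links": [{"cid": c, "size": n} for c, n in ordered]}
    root_cid = fake_cid(json.dumps(root, sort_keys=True).encode())
    return {
        "content": root_cid,
        # One entry per shard, each with a single slice covering the whole shard.
        "shards": [
            {"shard": c, "slices": [{"digest": c, "offset": 0, "length": n}]}
            for c, n in ordered
        ],
        # The root node is inlined so a client can recover the shard order.
        "nodes": [root],
    }


# Tiny threshold so the example runs instantly; parts arrive out of order.
parts = {
    2: shard_request([b"ghij"], shard_size=4),
    1: shard_request([b"abc", b"def"], shard_size=4),
}
index = complete_upload(parts)
```

A retrieval client in this sketch would read `index["nodes"][0]["links"]` to learn the serving order, then fetch each shard as a single slice starting at offset 0.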