-
Notifications
You must be signed in to change notification settings - Fork 8
fix: car data is validated #249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: car data is validated #249
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR strengthens IPNI retrieval integrity checks by replacing a size-only CAR validation with a full content round-trip validation that rebuilds a CAR and compares root CIDs (addressing #144).
Changes:
- Added CAR unpacking + round-trip content validation utilities (
unpackCarToPath,validateCarContent) and result typing. - Updated IPNI retrieval validation to run a size pre-check, then perform full content validation when
rootCIDis available. - Added a new Vitest suite covering valid/invalid/corrupt/garbage CAR validation scenarios and introduced required direct dependencies.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| pnpm-lock.yaml | Locks new direct dependencies needed for UnixFS export + blockstore usage. |
| apps/backend/src/retrieval-addons/strategies/ipni.strategy.ts | Switches IPNI validation from size-only to size pre-check + content round-trip validation. |
| apps/backend/src/common/car-utils.ts | Implements CAR unpacking via UnixFS exporter and round-trip root CID validation. |
| apps/backend/src/common/car-utils.spec.ts | Adds tests for round-trip correctness and failure modes (corrupt/garbage/wrong CID). |
| apps/backend/package.json | Adds blockstore-core and ipfs-unixfs-exporter as direct deps. |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ownloaded-from-sp-or-ipfs-we-do-not-validate-the-actual-content
| class CarReaderBlockstore { | ||
| constructor(private readonly reader: CarReader) {} | ||
|
|
||
| async *get(cid: CID, _options?: unknown): AsyncGenerator<Uint8Array> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it an AsyncGenerator? I think you're just supposed to conform to get(cid): Promise<Uint8Array> to be a blockstore for exporter
I think this might be overengineering the solution here, CIDs should provide this free of charge. There's two ways to look at integrity of these deals:
You shouldn't need to reconstruct anything, CIDs do the work for you, just hash and be assured. Here's the thing about the SP modifying the CAR:
Perhaps you care about both things, but just caring about the latter is legitimate for many users. Either way though, verifying CIDs is enough to validate. From a user perspective: trust the root CID (IPFS root CID or Piece CID, same concept) and you can trust the data underneath it because you have the tools to extend that downward. |
|
Thanks for all the details @rvagg . I agree we should verify the CIDs match - that is good enough. Lets do |
Summary
Adds CAR content validation to IPNI retrieval, replacing the size-only check with a full round-trip integrity verification.
Problem
When retrieving data from storage providers via IPNI, we only validated that the downloaded CAR file's byte count matched the expected size. A corrupted or tampered CAR file would pass validation as long as its length was correct. (Fixes #144)
Solution
We validate retrieved CARs by round-tripping the content:
If the CIDs match, the entire chain is proven valid: CAR structure, UnixFS DAG integrity, block linkage, and content correctness. The existing size check is kept as a fast pre-check to fail immediately on obvious mismatches before doing the heavier round-trip.
Changes
car-utils.ts— AddedunpackCarToPath()andvalidateCarContent()functions, plusCarValidationResulttypecar-utils.spec.ts— New test file with 6 tests: round-trip CID matching, byte-level content preservation, valid/invalid/corrupt/garbage CAR validationipni.strategy.ts—validateDatanow runs content validation after size pre-check; method updated fromcar-size-checktocar-content-validation; removed TODO referencing bug: when we retrieve or verify content downloaded from SP or IPFS, we do not validate the actual content #144package.json— Addedblockstore-coreandipfs-unixfs-exporteras direct deps (already present as transitive deps offilecoin-pin)Notes for reviewers
validateCarContentwrites to temp dirs and cleans up in afinallyblock. The extracted content lives only for the duration of the validation call.