
feat(node): replication enforcement (Phase 2) for #18 #34

Merged
kevincodex1 merged 20 commits into Gitlawb:main from beardthelion:feat/phase2-replication-enforcement
Jun 18, 2026

Conversation

@beardthelion beardthelion commented Jun 8, 2026

Collaborator

Phase 2 of path-scoped visibility (#18): stop withheld content from leaving the origin node through replication, and stop fully-private repos from being announced to the network. Phase 1 (#25) gates the git read path and Phase 3 (#28) withholds blobs from served packs, but after a push three paths still copied objects off the node ignoring visibility: local IPFS pinning, Pinata pinning, and the gossip/peer-notify/Arweave announcements.

The whole thing reduces to one decision computed once per push in git_receive_pack: can an anonymous caller read the repo root, and which blob OIDs are denied to the public. A withheld: Option<HashSet<String>> drives both pin sites (None means the repo is private, so nothing replicates, not even commit and tree objects), and an announce bool gates the network-facing announcements.
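
A minimal sketch of that per-push decision, for illustration only. `ReplicationDecision` and `decide` are hypothetical names, errors are plain strings, and treating a failed OID computation as "no replication, no announcement" is an assumption of this sketch; the code in git_receive_pack may be shaped differently.

```rust
use std::collections::HashSet;

/// Hypothetical holder for the two values computed once per push.
#[derive(Debug, PartialEq)]
struct ReplicationDecision {
    /// Gates the gossip, peer-notify and Arweave announcements.
    announce: bool,
    /// `None`: replicate nothing. `Some(set)`: pin everything except `set`.
    withheld: Option<HashSet<String>>,
}

/// `root_readable`: can an anonymous caller read the repo root?
/// `denied`: blob OIDs denied to the public.
/// Either is `Err` when visibility could not be determined (sketch assumption).
fn decide(
    root_readable: Result<bool, String>,
    denied: Result<HashSet<String>, String>,
) -> ReplicationDecision {
    match (root_readable, denied) {
        // Publicly readable root: announce, and pin all but the denied blobs.
        (Ok(true), Ok(set)) => ReplicationDecision {
            announce: true,
            withheld: Some(set),
        },
        // Private repo or any error: fail closed.
        _ => ReplicationDecision {
            announce: false,
            withheld: None,
        },
    }
}
```

Both pin sites and the announcement gate would then read from this one value rather than re-deriving visibility on their own.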

What changes:

  • IPFS and Pinata pinning skip the withheld blob OIDs (via a small pure replicable_objects filter). For a private repo they pin nothing at all, so file names in tree objects and history in commit objects no longer reach public IPFS.
  • Gossip ref-update publish, the HTTP peer-notify fallback, and Arweave anchoring are suppressed for repos the public cannot read. Mode B repos (public with a private subtree) still announce, since their commit and tree SHAs are public.
  • Fail closed: if visibility can't be determined, the push replicates nothing.
  • The in-process GraphQL subscription broadcast and the local branch->CID write are left alone; they are owner-facing/local, not network leaks.
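
To illustrate the pinning bullets above, here is a rough sketch of an order-preserving filter plus a pin-site wrapper. The names `replicable_objects_sketch` and `objects_to_pin` and their signatures are assumptions for illustration, not the functions in this PR.

```rust
use std::collections::HashSet;

/// Sketch of a pure filter: keep the input order, drop withheld OIDs.
fn replicable_objects_sketch(objects: &[String], withheld: &HashSet<String>) -> Vec<String> {
    objects
        .iter()
        .filter(|oid| !withheld.contains(oid.as_str()))
        .cloned()
        .collect()
}

/// Hypothetical pin-site wrapper: `None` stands for a private repo,
/// so nothing is pinned, not even commit and tree objects.
fn objects_to_pin(objects: &[String], withheld: Option<&HashSet<String>>) -> Vec<String> {
    match withheld {
        Some(set) => replicable_objects_sketch(objects, set),
        None => Vec::new(),
    }
}
```

Keeping the filter pure (no I/O, no database access) is what makes it cheap to unit test independently of IPFS or Pinata.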

Deferred on purpose, each cheap to add later off the same seam: peer partial-mirrors (peers currently fail closed on repos with withheld content), UCAN-delegated reader sets, and encrypted-at-rest replication of private blobs.

Depends on #28: withheld_blob_oids lives on that branch. This PR is stacked on it, so until #28 merges the diff here will also show #28's commits. Rebase onto main once #28 lands.

Test plan

  • cargo test -p gitlawb-node (100 pass), cargo clippy --all-targets -D warnings clean, cargo fmt --check clean
  • Unit coverage: replicable_objects filter, anonymous-caller contract of withheld_blob_oids, and the announce gate across public / legacy-private / mode A / mode B
  • Manual: push to a node with a mode B /secret/** rule, confirm the secret blob is absent from IPFS/Pinata while public files and the commit/tree are present
  • Manual: push to a fully-private repo, confirm no objects pinned and no gossip/peer-notify/Arweave anchor

Summary by CodeRabbit

  • New Features

    • Repository visibility rules now control which objects are withheld from replication and peer synchronization.
    • IPFS and Pinata pinning operations now respect visibility rules and only pin allowed objects.
    • Replication decisions are enforced at the push level, with downstream dissemination and permanent anchoring gated by visibility.
  • Bug Fixes

    • Git protocol errors are now properly remapped to appropriate HTTP status codes.
  • Tests

    • Added test coverage for visibility rule enforcement during replication announcements.

…al clone

upload_pack_excluding emitted a v2 packfile section, but info_refs
advertises v0, so real clients negotiated v0 and rejected the response
with 'expected ACK/NAK, got packfile'. Frame the v0 stateless-rpc shape
instead (NAK, then the pack via side-band-64k when offered).

Add an end-to-end test that stands up info_refs + upload_pack_excluding
and runs a real git partial clone, asserting the withheld blob's bytes
never reach the client while its tree entry and SHA stay visible. A stock
full clone cannot consume the pack (it is not closed under reachability,
so fetch fails the connectivity check); a partial clone is required.
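
For reference, the v0 response shape described above can be sketched with plain pkt-line framing. This is an illustrative sketch, not the code in upload_pack_excluding; `pkt_line`, `frame_v0_response` and `MAX_BAND_DATA` are hypothetical names, and the chunk size is chosen conservatively.

```rust
/// Data bytes per side-band-64k packet in this sketch:
/// 65520 total, minus 4 length bytes, minus 1 band byte.
const MAX_BAND_DATA: usize = 65515;

/// Frame `payload` as one pkt-line: 4 hex digits giving the total
/// length (header included), followed by the payload.
fn pkt_line(payload: &[u8]) -> Vec<u8> {
    let mut out = format!("{:04x}", payload.len() + 4).into_bytes();
    out.extend_from_slice(payload);
    out
}

/// Hypothetical v0 stateless-rpc response body: NAK first, then the pack.
fn frame_v0_response(pack: &[u8], side_band_64k: bool) -> Vec<u8> {
    let mut out = pkt_line(b"NAK\n");
    if side_band_64k {
        for chunk in pack.chunks(MAX_BAND_DATA) {
            let mut data = Vec::with_capacity(chunk.len() + 1);
            data.push(1u8); // band 1 carries pack data
            data.extend_from_slice(chunk);
            out.extend_from_slice(&pkt_line(&data));
        }
        out.extend_from_slice(b"0000"); // flush-pkt ends the stream
    } else {
        out.extend_from_slice(pack); // no side-band: raw pack follows NAK
    }
    out
}
```

A client that negotiated v0 reads the NAK line before it will accept any pack data, which is why a response that opens with a v2 packfile section is rejected.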
…tion choice

Add a real-git test that partial-clones, pushes a new commit server-side,
then fetches: the new object arrives and the withheld blob stays absent.
This pins down that ignoring have/want negotiation (always sending a
self-contained pack of all refs minus withheld, with NAK) is correct for
both clone and fetch; the only cost is a fetch re-sends the full object
set. Refactor the real-git tests onto a shared server harness and document
the negotiation decision in code and in the plan's follow-ups.

Move the two blocking git shell-outs in the filtered upload-pack path off
the async worker thread, matching the tokio::process / spawn_blocking usage
already in this file: build_filtered_pack (rev-list + pack-objects) and
withheld_blob_oids (per-ref ls-tree) now run inside spawn_blocking so a large
repo cannot stall the tokio runtime. Behavior is unchanged.

Also fix the Task 0 findings block in the Phase 3 plan: it still recorded v2
packfile framing, which is the exact path that failed against a real client
and was corrected to v0. The block now documents the shipped v0 contract.
Drop a stray trailing code fence flagged by markdownlint (MD040).

The speculative ls-tree timeout and the public/no-rules fast-path from the
review are intentionally left out: the timeout guards against adversarial
repos we do not yet host, and the fast-path is a micro-optimization not worth
the extra branch right now.

kevincodex1 asked to keep the superpowers planning docs out of the repo. The
Phase 3 plan was scaffolding for this change, not something the project needs
to carry. Removing it leaves only the code and tests in the PR.

coderabbitai Bot commented Jun 8, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 81fffcb0-0340-441c-ac15-6ffe51fc861e

📥 Commits

Reviewing files that changed from the base of the PR and between 083293d and d44ad34.

📒 Files selected for processing (2)
  • crates/gitlawb-node/src/api/repos.rs
  • crates/gitlawb-node/src/visibility.rs
✅ Files skipped from review due to trivial changes (1)
  • crates/gitlawb-node/src/visibility.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/gitlawb-node/src/api/repos.rs

📝 Walkthrough


Adds a new visibility_pack module that computes withheld blob OIDs from path-scoped visibility rules, then wires this into git-upload-pack (filtered pack serving) and git-receive-pack (announce-gated replication). IPFS and Pinata pinning APIs gain a withheld parameter; P2P publishing, HTTP peer sync, and Arweave anchoring are all suppressed when announce is false.

Changes

Visibility-aware blob withholding for Git read and replication

| Layer | File(s) | Summary |
| --- | --- | --- |
| visibility_pack: withheld OID computation and replication filtering | crates/gitlawb-node/src/git/visibility_pack.rs, crates/gitlawb-node/src/visibility.rs | New module enumerates blob OIDs from all head/tag refs via git ls-tree -r, applies visibility_check per path, and returns the OIDs that are denied at every occurrence. replicable_objects filters ordered object lists. Tests cover anonymous, non-reader, owner, and reader scenarios. Adds an announce_gate_matches_public_readability test asserting allow/deny across rule modes. |
| Pinning APIs: withheld set parameter | crates/gitlawb-node/src/ipfs_pin.rs, crates/gitlawb-node/src/pinata.rs | Both pin_new_objects functions gain a withheld: &HashSet<String> parameter and call replicable_objects(object_list, withheld) before DB status checks and upload, so only objects that are not withheld get pinned. |
| Receive-pack Phase 2 replication gating | crates/gitlawb-node/src/api/repos.rs | A per-push announce flag is derived from visibility rules (fail-closed to false). When it is false, withheld is None and all downstream steps (IPFS/Pinata pinning, P2P ref-update publish, HTTP peer sync, Arweave anchoring) are skipped. When it is true, withheld OIDs are computed and passed to pinning; the dissemination and Arweave steps run inside if announce guards. |
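
The "denied at every occurrence" rule in the first row can be sketched as follows. This is an illustration under assumptions: `withheld_oids_sketch` is a hypothetical name, the `(path, oid)` pairs stand in for parsed git ls-tree -r output, and the `is_denied` closure stands in for visibility_check, whose real signature is not shown here.

```rust
use std::collections::{HashMap, HashSet};

/// Sketch: return the blob OIDs that are denied at every path where they occur.
/// A blob that is also reachable through an allowed path is not withheld.
fn withheld_oids_sketch<F>(entries: &[(String, String)], is_denied: F) -> HashSet<String>
where
    F: Fn(&str) -> bool,
{
    // oid -> "denied at every occurrence seen so far"
    let mut all_denied: HashMap<&str, bool> = HashMap::new();
    for (path, oid) in entries {
        let denied = is_denied(path.as_str());
        all_denied
            .entry(oid.as_str())
            .and_modify(|d| *d = *d && denied)
            .or_insert(denied);
    }
    all_denied
        .into_iter()
        .filter(|(_, denied)| *denied)
        .map(|(oid, _)| oid.to_string())
        .collect()
}
```

With a `secret/` prefix rule, a file that exists only under `secret/` is withheld, while a blob stored both at `secret/LICENSE` and at `LICENSE` is not, because the same bytes are already visible through the public path.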

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • Gitlawb/node#25: Phase 1 visibility enforcement in git-info-refs/git-upload-pack using visibility_check in the same api/repos.rs endpoints that this PR extends with Phase 2 blob withholding.
  • Gitlawb/node#28: Introduces visibility_pack::withheld_blob_oids and the upload_pack_excluding filtered-pack serving path that this PR integrates into the upload-pack handler.
  • Gitlawb/node#33: Shares visibility_check-driven denied-subtree logic and the announce unit test pattern that this PR extends with receive-pack gating.

Suggested reviewers

  • kevincodex1

🐇 A blob hides behind a rule,
No withheld byte shall slip through the pack,
announce says nay? We keep it cool —
No pin, no peer, no Arweave track.
The rabbit guards each secret file,
Hop hop, only allowed things compile! 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly references Phase 2 replication enforcement, which is the primary focus of the PR, and relates directly to the main objective of preventing withheld content from leaving the origin node. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/gitlawb-node/src/api/repos.rs`:
- Around line 629-644: The match arm currently calls
crate::git::visibility_pack::withheld_blob_oids(...) directly on the async
worker (using disk_path, rules, record.is_public, &record.owner_did), which must
be moved into a blocking task; replace the direct call with
tokio::task::spawn_blocking(||
crate::git::visibility_pack::withheld_blob_oids(...)).await handling
(propagate/map the Result->Option the same way and keep the tracing::warn! on
errors) so the git ls-tree subprocess runs off the async runtime thread.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: a00a4311-c564-4086-b45f-866546839dd1

📥 Commits

Reviewing files that changed from the base of the PR and between 6abaf1d and 949d131.

📒 Files selected for processing (8)
  • .gitignore
  • crates/gitlawb-node/src/api/repos.rs
  • crates/gitlawb-node/src/git/mod.rs
  • crates/gitlawb-node/src/git/smart_http.rs
  • crates/gitlawb-node/src/git/visibility_pack.rs
  • crates/gitlawb-node/src/ipfs_pin.rs
  • crates/gitlawb-node/src/pinata.rs
  • crates/gitlawb-node/src/visibility.rs

Comment thread crates/gitlawb-node/src/api/repos.rs
The receive-pack replication chokepoint called withheld_blob_oids
directly on the tokio worker, where its blocking git ls-tree walk can
stall the runtime for repos with many refs. Wrap it in spawn_blocking
to match the upload-pack serve path.
@kevincodex1

Contributor

bro @beardthelion please send me a DM on X.

@kevincodex1

Contributor

hello bro @beardthelion please rebase and fix conflicts.

# Conflicts:
#	crates/gitlawb-node/src/git/visibility_pack.rs
@kevincodex1 kevincodex1 merged commit 8680d0f into Gitlawb:main Jun 18, 2026
2 checks passed