Summary
Two issues in blob_paths (crates/gitlawb-node/src/git/visibility_pack.rs), which builds the withheld-blob set for Phase 3 subtree withholding, can leave that set incomplete. Downstream code treats any blob missing from the set as public, so this is a potential cleartext disclosure of withheld content.
1. Withholding only walks refs/heads/* and refs/tags/*
    if !refname.starts_with("refs/heads/") && !refname.starts_with("refs/tags/") {
        continue;
    }
blob_paths enumerates blob→path pairs from heads and tags only. But the consumers of the resulting withheld set operate over a broader object graph:
- crates/gitlawb-node/src/git/smart_http.rs uses rev-list --objects --all
- crates/gitlawb-node/src/ipfs_pin.rs filters all repo objects through the withheld set
A blob reachable only from another ref namespace (e.g. refs/notes/*, refs/merge-requests/*, hidden refs) never enters withheld, so it can be packed or pinned in cleartext even though its path is denied by visibility rules.
2. Ref walk failures fail open
    if !listing.status.success() {
        continue;
    }
A non-zero git ls-tree result is skipped, turning a traversal error into a partial withheld set. Callers then treat the missing blobs as public. This should fail closed (return an error so the fetch/pack/pin aborts) rather than silently leak.
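A minimal sketch of the fail-closed behaviour described above, not the crate's actual code. The names WalkError, listing_or_err and ls_tree are hypothetical; the real blob_paths signature and error type are not shown in this issue, so the shape below is an assumption.

```rust
use std::fmt;
use std::path::Path;
use std::process::Command;

/// Hypothetical error type for the ref walk; the real crate may already have one.
#[derive(Debug, PartialEq)]
pub enum WalkError {
    /// `git` could not be started at all.
    Spawn(String),
    /// `git ls-tree` ran but exited non-zero for this ref.
    LsTreeFailed { refname: String, stderr: String },
}

impl fmt::Display for WalkError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            WalkError::Spawn(e) => write!(f, "failed to spawn git: {e}"),
            WalkError::LsTreeFailed { refname, stderr } => {
                write!(f, "git ls-tree failed for {refname}: {stderr}")
            }
        }
    }
}

impl std::error::Error for WalkError {}

/// Fail closed: a non-zero exit becomes an error instead of a skipped ref.
pub fn listing_or_err(
    refname: &str,
    success: bool,
    stdout: Vec<u8>,
    stderr: &[u8],
) -> Result<Vec<u8>, WalkError> {
    if !success {
        return Err(WalkError::LsTreeFailed {
            refname: refname.to_owned(),
            stderr: String::from_utf8_lossy(stderr).trim().to_owned(),
        });
    }
    Ok(stdout)
}

/// Run `git ls-tree -r` for one ref and propagate any failure to the caller.
pub fn ls_tree(repo: &Path, refname: &str) -> Result<Vec<u8>, WalkError> {
    let out = Command::new("git")
        .arg("-C")
        .arg(repo)
        .args(["ls-tree", "-r", refname])
        .output()
        .map_err(|e| WalkError::Spawn(e.to_string()))?;
    listing_or_err(refname, out.status.success(), out.stdout, &out.stderr)
}
```

With this shape, the loop over refs would call ls_tree(repo, refname)? so that the first failure aborts the whole withheld computation, and the fetch/pack/pin caller sees an Err rather than a shorter set.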
Suggested direction
- Enumerate the object graph for withholding with the same ref/object scope the pack and pin paths use (align on one shared enumerator over --all / the full reachable set), so nothing the consumers see can escape the withheld computation.
- On any git ls-tree / ref-walk error, return Err so the caller aborts instead of producing a partial set.
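One possible shape for the shared enumerator, sketched under assumptions: it reuses the same git rev-list --objects --all scope that smart_http.rs is described as using, and fails closed on any error. ObjectEntry, parse_objects and reachable_objects are hypothetical names, and the String error type is a placeholder. Note that rev-list prints each object once, with the path it was first reached at, so a blob that appears at several paths would still need extra handling before this could replace blob_paths.

```rust
use std::path::Path;
use std::process::Command;

/// One object line from `git rev-list --objects`: the object id plus the path
/// it was first reached at (commits and root trees carry no path).
#[derive(Debug, PartialEq)]
pub struct ObjectEntry {
    pub oid: String,
    pub path: Option<String>,
}

/// Parse `rev-list --objects` output. Each line is "<oid>" or "<oid> <path>".
pub fn parse_objects(stdout: &str) -> Result<Vec<ObjectEntry>, String> {
    let mut entries = Vec::new();
    for line in stdout.lines() {
        if line.is_empty() {
            continue;
        }
        let (oid, path) = match line.split_once(' ') {
            Some((oid, path)) => (oid, Some(path)),
            None => (line, None),
        };
        // Reject anything that is not a hex object id rather than skipping it.
        if oid.is_empty() || !oid.bytes().all(|b| b.is_ascii_hexdigit()) {
            return Err(format!("unexpected rev-list line: {line:?}"));
        }
        entries.push(ObjectEntry {
            oid: oid.to_owned(),
            // Root trees are printed with an empty name; treat that as "no path".
            path: path.filter(|p| !p.is_empty()).map(str::to_owned),
        });
    }
    Ok(entries)
}

/// Enumerate every object reachable from any ref, aborting on any failure.
pub fn reachable_objects(repo: &Path) -> Result<Vec<ObjectEntry>, String> {
    let out = Command::new("git")
        .arg("-C")
        .arg(repo)
        .args(["rev-list", "--objects", "--all"])
        .output()
        .map_err(|e| format!("failed to spawn git: {e}"))?;
    if !out.status.success() {
        let stderr = String::from_utf8_lossy(&out.stderr);
        return Err(format!("git rev-list failed: {}", stderr.trim()));
    }
    // Assumption of this sketch: non-UTF-8 paths are refused outright; a real
    // implementation would more likely work on raw bytes.
    let text = String::from_utf8(out.stdout)
        .map_err(|e| format!("non-UTF-8 rev-list output: {e}"))?;
    parse_objects(&text)
}
```

If the withheld computation, the pack path and the pin path all consumed the same enumerator, an object outside refs/heads/* and refs/tags/* could not be visible to one of them and invisible to another.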
Notes
This is Phase 3 (#28, issue #18) code. Surfaced by CodeRabbit during review of #40 (it appears in that PR only because the cross-fork diff is cumulative over the unmerged stack). Filing here so it's tracked against the withholding work rather than the recipient-set-blinding PR.