Skip to content

Add fsck support#260

Draft
cgwalters wants to merge 17 commits intocomposefs:mainfrom
cgwalters:fsck
Draft

Add fsck support#260
cgwalters wants to merge 17 commits intocomposefs:mainfrom
cgwalters:fsck

Conversation

@cgwalters
Copy link
Collaborator

Depends: #256

Move Debug impls for format types and EROFS structures to the top
of the file (before ImageVisitor), extract hexdump helper, and
add missing_debug_implementations allows. Pure reorganization,
no functional changes.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Convert the assert_eq! in ImageVisitor::note() to return an error
instead of panicking when a corrupt image has the same offset visited
as two different segment types. Found by the debug_image fuzz target.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Fix arithmetic operations that could overflow, underflow, or cause
resource exhaustion when processing malformed EROFS images:

- Use checked_mul instead of unchecked << for block address
  calculations in debug.rs
- Use checked_add for block range end computation in reader.rs to
  prevent u64 overflow
- Use usize::BITS instead of hardcoded 64 for blkszbits validation
  (correct on 32-bit platforms)
- Use usize::try_from instead of 'as usize' casts for inode size,
  inode ID, and block ID to avoid silent truncation on 32-bit
- Cap Vec allocation against image length to prevent OOM from crafted
  size fields
- Use saturating_sub for debug display calculations

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Replace direct slice indexing with .get() where the bounds come from
image content: XAttr::suffix/value/padding, Inode::inline, and
debug_img's unassigned-region slicing. This prevents panics on
malformed images where field values are inconsistent with actual data
lengths.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
…pers

Change XAttr::suffix(), value(), and padding() to return
Result<&[u8], ErofsReaderError> instead of silently returning empty
slices on out-of-bounds access. This ensures corrupt xattr data
is properly reported rather than silently swallowed.

Also deduplicate is_whiteout() (moved to InodeHeader trait method)
and find_child_nid() (moved to Image method), and remove the
redundant entry_nid() test helper in favor of DirectoryEntry::nid().

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Add fuzz testing infrastructure under crates/composefs/fuzz/ with two
targets: read_image (exercises the full reader API surface including
inode traversal, xattr parsing, and object collection) and debug_image
(runs the debug_img dump on arbitrary input). Includes a seed corpus
generator that creates valid EROFS images exercising various code paths.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
…verflow

A crafted EROFS image with directory cycles can cause unbounded recursion
in populate_directory(), leading to a stack overflow. Add a depth parameter
and enforce a maximum of PATH_MAX / 2 (2048) levels, matching the
theoretical limit for valid filesystem paths.

Found by cargo-fuzz.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
The cargo-fuzz targets found multiple panics within seconds of fuzzing.
Convert all remaining .unwrap() calls and assert!() macros in non-test
reader code to return Result, and propagate errors at all call sites.

Key changes:
- data_layout() returns Result instead of unwrapping TryInto
- XAttr::from_prefix(), xattrs(), shared(), local() return Result
- DirectoryBlock::n_entries/entries/get_entry_header return Result
- DirectoryEntries iterator yields Result<DirectoryEntry>
- XAttrIter yields Result<&XAttr>
- All callers in reader.rs, debug.rs, and fuzz targets updated

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
This got introduced in a CI refactoring and wasn't
intentional. Our fuzzing had way too short of a timeout.
If CI job is actually stuck we'll figure that out when
it happens.

Signed-off-by: Colin Walters <walters@verbum.org>
…ks()

The fuzzer found a crafted EROFS image where an ExtendedInodeHeader has
an enormous size field (~63 petabytes), causing blocks() to return a
range of ~15.5 trillion block IDs. Iterating this range caused a timeout.

Change our flow so that we pass the image (including its size)
when iterating blocks, so we can validate those.

Also add a default 1 GiB maximum image size in Image::open(), since
composefs images are metadata-only and should never approach that.

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
Composefs images are metadata-only EROFS images with well-known
structural constraints. Add an opt-in restriction mode that enforces:

- blkszbits must be 12 (4096-byte blocks)
- For non-ChunkBased inodes (directories, inline files, symlinks,
  devices), size must not exceed the image size, since their data
  is stored within the image itself.  ChunkBased (external) files
  are exempt because their size reflects the real file on the
  underlying filesystem.

The high-level collect_objects() and erofs_to_filesystem() APIs now
enable this by default.  Lower-level callers using Image::open()
directly can opt in via .restrict_to_composefs().

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
…nd metacopy checks

Validate composefs header magic and EROFS format version, superblock
magic, enforce the INLINE_CONTENT_MAX (64 byte) limit on inline regular
files, and reject malformed trusted.overlay.metacopy xattrs instead of
silently ignoring them.

The composefs header version field is validated but composefs_version is
not, since the C mkcomposefs writes version 0 while the Rust writer
uses version 2.

Previously, a malformed metacopy xattr would be silently ignored,
causing the file to be treated as inline rather than external. In
composefs-restricted mode this is now an error with a detailed
diagnostic message.

Cap the proptest inline file data strategy at INLINE_CONTENT_MAX to
match the composefs invariant that files > 64 bytes are external.

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
The dumpfile parser accepted inline content up to 5000 bytes, which is
far beyond any reasonable composefs inline file size. Reduce to 512
bytes as a safety bound while still allowing room for future increases
to the inline-vs-external threshold (see composefs#107 for discussion of
adjusting INLINE_CONTENT_MAX per hash algorithm).

Update the special.dump test data to use 63/64/256-byte inline files
instead of the previous 4095/4096/4097-byte entries that exceeded the
new limit.

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
…NTENT

Rename the writer's inline threshold to INLINE_CONTENT_MAX_V0 to make
it clear that changing this value is effectively a format break: it
determines which files get fs-verity checksums vs. stored inline, so
images from different thresholds aren't interchangeable. A future
composefs format version will need to encode this in the header.

Add MAX_INLINE_CONTENT (512 bytes) in lib.rs as the shared parsing
safety bound for untrusted input. Both the dumpfile parser and the
EROFS reader (in composefs-restricted mode) use this limit. It is
intentionally higher than V0 to allow for future threshold increases
per issue composefs#107.

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
EROFS is a complex format supporting compression, metabox inodes,
and more. Whereas for composefs we only use it as a metadata
format, and we have a custom writer which is conservative in
what features it uses.

Add currently known EROFS feature_compat and feature_incompat flag
constants in format.rs. When we're in `restrict_to_composefs()` mode,
we filter these up front.

This should drastically cut down on the attack surface exposed
by malicious EROFS images when mounted directly by the Linux kernel.

Assisted-by: OpenCode (Claude Opus)
Signed-off-by: Colin Walters <walters@verbum.org>
Add cap-std and cap-tempfile as dev-dependencies to composefs and
composefs-oci for capability-scoped filesystem manipulation in tests.

Add TestRepo::path() for accessing the repository's filesystem path,
and TestRepo::dir() for getting a cap_std::fs::Dir handle scoped to
the repository root (preventing accidental path traversal in tests).

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Implement `cfsctl fsck` and `cfsctl oci fsck` commands for verifying
composefs repository integrity at multiple levels.

Repository-level fsck validates:
- Object fsverity digests match path-derived identifiers
- Stream and image symlinks resolve to existing objects
- Refs resolve through the full symlink chain
- Splitstream headers parse correctly and referenced objects exist
- EROFS images are structurally valid (composefs-restricted parsing)
  and all objects referenced via overlay.metacopy xattrs exist

OCI-level fsck additionally validates:
- Manifest and config content sha256 digests
- Layer diff_id references and stream entries
- Seal image existence for sealed images
- Artifact layer reference consistency

Both commands support `--json` for machine-readable output that always
exits 0 (only process-level failures cause non-zero exit). Without
`--json`, non-zero exit indicates corruption was found.

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant