cfsctl: Add cmd to get object ref for a file#236
Conversation
Also: this commit message only says "what" not why. The "why" could be very short, something like "I wanted this to access the underlying file for a kernel binary so I could..." |
1eaca87 to
588570f
Compare
My bad, should've added context. Added some in the new commit message |
588570f to
81861f2
Compare
|
Rebased and fixed conflicts |
81861f2 to
08caf94
Compare
crates/cfsctl/src/main.rs
Outdated
| #[derive(Debug, Serialize)] | ||
| pub struct ObjectRef { | ||
| /// Whether the file is stored inline in the EROFS image | ||
| pub stored_inline: bool, |
There was a problem hiding this comment.
What problem did you hit in making this an enum? serde supports that https://serde.rs/enum-representations.html
There was a problem hiding this comment.
I had it as
enum ObjectRef {
Internal,
External { digest: String, path: String }
}serde serialised Internal as just "Internal".
I didn't know about the untagged trait. Do you think an Enum would be better here?
3de3226 to
67d1f35
Compare
67d1f35 to
c619bbe
Compare
crates/composefs/src/generic_tree.rs
Outdated
| where | ||
| S: serde::Serializer, | ||
| { | ||
| s.serialize_str(v.to_string_lossy().as_ref()) |
There was a problem hiding this comment.
I'm not usually a fan of to_string_lossy except for human output. The problem is it just means the data isn't usable for machines in the general case.
crates/composefs/src/generic_tree.rs
Outdated
|
|
||
| /// Content types for leaf nodes (non-directory files). | ||
| #[derive(Debug)] | ||
| #[derive(Debug, Clone, Serialize)] |
There was a problem hiding this comment.
I know I asked for this but let me argue to undo it with a different suggestion.
crates/cfsctl/src/main.rs
Outdated
| }, | ||
| /// Given a file path, get the file corresponding to it in the object store | ||
| /// If the file if stored inline, returns an empty string | ||
| StatFile { |
There was a problem hiding this comment.
OK I thought about this more and I have a suggestion that solves the "non-UTF8 filename" problem and would align better with the "LayerInspect" verb: let's have this output a single line in composefs dump file format.
That includes all relevant metadata (backing file path, file metadata) in a "standard" format.
We could then call it dump-single-file perhaps? Or since we're here we could easily have it support multiple files (why not?) so how about
dump-files <image> [files]
And I'd also suggest we add a --with-parents so the output is a truly valid dump file that could be used to make new EROFS files in theory if desired.
And we can just drop the enum serialization stuff - we already have a textual serialization of a composefs filesystem tree.
There was a problem hiding this comment.
(maybe --with-parents is on by default...it's only a bit more verbose)
There was a problem hiding this comment.
we already have a textual serialization of a composefs filesystem tree
I was not aware of this? Where do we have it?
OK I thought about this more and I have a suggestion that solves the "non-UTF8 filename" problem and would align better with the "LayerInspect" verb: let's have this output a single line in composefs dump file format.
that would be great, but that's not quite the intent here. The intent is more so just getting the corresponding file in the objects directory. Maybe I'm misunderstanding this but your suggestion is to output something like the following?
/usr/share/apk/keys/alpine-devel@lists.alpinelinux.org-66ba20fe.rsa.pub 800 100644 1 0 0 0 1758215632.0 1b/60ad29665d7a6c58e5dfc74d8bbfab074543c866820e5d3e4368054b62a263aba2bc157624305f2c8402
c446a6921d9e84d3d0409fd9731a9503a9982e872c - 1b60ad29665d7a6c58e5dfc74d8bbfab074543c866820e5d3e4368054b62a263aba2bc157624305f2c8402c446a6921d9e84d3d0409fd9731a9503a9982e872c
There was a problem hiding this comment.
I was not aware of this? Where do we have it?
https://github.com/composefs/composefs-rs/blob/main/crates/composefs/src/dumpfile.rs
There was a problem hiding this comment.
How about dump-files --backing-path which outputs just the path(s) newline separated and quoted in composefs dumpfile format?
But the default would be full dumpfile metadata.
There was a problem hiding this comment.
Added a new command that operates on the EROFS and outputs in a dumpfile format
This is what the output looks like
$ ./cfsctl --repo /tmp/.tmpqzyqeM dump-files refs/my-image / --recursive
/ 0 40755 4 1000 1000 0 1773135856.0 - - -
/etc 0 40755 2 1000 1000 0 1773135856.0 - - -
/usr 0 40755 4 1000 1000 0 1773135856.0 - - -
/usr/bin 0 40755 2 1000 1000 0 1773135856.0 - - -
/usr/lib 0 40755 2 1000 1000 0 1773135856.0 - - -
/etc/hostname 17 100644 1 1000 1000 0 1773135856.0 - integration-test\x0a -
/usr/bin/hello 131072 100644 1 1000 1000 0 1773135856.0 18/c6776e8d37d2076bd66e7160f2c3cab90ce064dc8cdb1d10954c2c7007fd1e83963da6c78fcfb30783c1a37ff36d7255718324e64f414acfd5cefc1ade05a3 - 18c
6776e8d37d2076bd66e7160f2c3cab90ce064dc8cdb1d10954c2c7007fd1e83963da6c78fcfb30783c1a37ff36d7255718324e64f414acfd5cefc1ade05a3 trusted.overlay.metacopy=\x00D\x00\x02\x18\xc6wn\x8d7\xd2\x07k\xd
6nq`\xf2\xc3\xca\xb9\x0c\xe0d\xdc\x8c\xdb\x1d\x10\x95L,p\x07\xfd\x1e\x83\x96\x3d\xa6\xc7\x8f\xcf\xb3\x07\x83\xc1\xa3\x7f\xf3mrUq\x83$\xe6OAJ\xcf\xd5\xce\xfc\x1a\xde\x05\xa3 trusted.overlay.re
direct=/18/c6776e8d37d2076bd66e7160f2c3cab90ce064dc8cdb1d10954c2c7007fd1e83963da6c78fcfb30783c1a37ff36d7255718324e64f414acfd5cefc1ade05a3
/usr/lib/readme.txt 13 100644 1 1000 1000 0 1773135856.0 - test\x20fixture\x0a -
There was a problem hiding this comment.
With only backing path
$ ./cfsctl --repo /tmp/.tmpqzyqeM dump-files refs/my-image /usr/bin/hello --backing-path
/usr/bin/hello /18/c6776e8d37d2076bd66e7160f2c3cab90ce064dc8cdb1d10954c2c7007fd1e83963da6c78fcfb30783c1a37ff36d7255718324e64f414acfd5cefc1ade05a3c619bbe to
a75d67d
Compare
|
Also we should unify and fix https://github.com/composefs/composefs-rs/blob/main/crates/composefs-oci/src/oci_image.rs#L1015 to operate in terms of the erofs not the tar I think |
f2d26f1 to
5cd008c
Compare
66902dd to
94eeb96
Compare
cgwalters
left a comment
There was a problem hiding this comment.
There's a lot of code churn in the last commit that seems unrelated to "use sha512" - I think it'd make sense to squash.
| let data = extract_file_data(inode, img); | ||
| LeafContent::Regular(RegularFile::Inline(data.into())) | ||
| } | ||
| // TODO: Sha256HashValue not handled here because the trait FsVerityHashValue |
There was a problem hiding this comment.
This relates to #181 - instead of using generics after that we'll need the repo itself to handle this.
| } | ||
|
|
||
| /// Extract xattrs from an inode | ||
| fn extract_xattrs_from_inode( |
There was a problem hiding this comment.
Hmmm... do we not already have code that creates a Tree from erofs? I think it would make sense to do instead of this basically
There was a problem hiding this comment.
There's one that creates a Tree from a dumpfile, but I'm not sure if there's code that creates a Tree from an erofs image. We do have an erofs debugger, but that prints output in a different format
There was a problem hiding this comment.
Right...ok I can take a crack at that but let's get #247 in first?
Actually...OK I am sorry about this I really scope creeped your PR. I think the larger issues are valid, but maybe we should just get in the feature you originally requested and then tackle the larger erofs/dumpfile stuff after?
How about this: We making --backing-path a required option for now, and always output the backing paths in dumpfile format?
Actually I am going to take a crack at adding erofs -> Tree code
This command recevies an EROFS image and file paths, and produces a dump file like ouput. If `--backing-path-only` is provided, only the backing object digest is outputted in the format `<file name> <digest>` If a directory is passed as input instead of a file, it works like the `ls` command printing information for all files/directories inside Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>
Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com> refactor: Refactor duplicated code in cfsctl Move out duplicated filesystem creation code blocks to separate functions Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>
94eeb96 to
feb81b1
Compare
|
Rebased on top of #252. Makes the code much simpler |
|
This one also relates to #225 |
Add a command
GetFileObjectRefthat accepts an absolute file path and returns the ref in objects directory corresponding to that file