AgentData exists and can persist to filesystem#21
Conversation
bfce2af to
3f9b663
Compare
3f9b663 to
7215e2d
Compare
AgentData exists and can persist to filesystem
|
I'd be tempted to define An implementation would read and write to a PEM file per the RFC so that appropriate permissions could be set. It might also allow the private key to be read into memory temporarily when needed rather than kept around permanently. For delegations I'd be tempted to store each one as a CAR file in a directory. The iterator would allow you to not have to buffer all the delegations into memory at once, and having each delegation as a CAR would make it easier to inspect each one individually for debugging. It's also easier to add and remove delegations - you don't have to encode and decode everything every time you make a small change. Thinking about it, you kinda just want a blockstore and some sort of index allowing you to match on audience, ability and resource to return a delegation (and it's linked proofs if it has any). You could of course have an implementation that does what you're doing here (saving everything to a single JSON) but I'm not convinced you're gaining much from the format - everything is encoded as bytes so things are not really any more readable and I'm assuming interop with the JS implementation is not a requirement. |
volmedo
left a comment
There was a problem hiding this comment.
this is so cool Petra, welcome to Go land!
Most of the comments are on the instructive/educational side. I think code reviews are a great way of learning a new language, especially the subtleties and common pitfalls. Let me know if you find them useful and/or want to know more about any other aspect.
| Delegations [][]byte | ||
| } | ||
|
|
||
| func (ad AgentData) MarshalJSON() ([]byte, error) { |
There was a problem hiding this comment.
[go hint] I'd use a pointer receiver here, even though we are not mutating it within the method. The general recommendation is to use value receivers until there is a reason to use a pointer receiver. In this case, however, AgentData.Delegations could potentially contain several elements. Using a pointer receiver avoids copying the whole thing on each method call, which is what happens when using value receivers.
There was a problem hiding this comment.
Yeah, that would make sense, but unfortunately, it doesn't work. The MarshalJSON API is defined with a value receiver. Switching it to a pointer means it just stops getting marshaled. 😕
There was a problem hiding this comment.
this is a tricky aspect of Go. Methods that are defined on the value type can be invoked from both values and pointers. OTOH, methods defined on the pointer type can only be accessed from pointers.
This means that if you define MarshalJSON on the value type, both values and pointers implement the Marshaler interface. If it is defined on the pointer type, only pointers will implement the interface. Thus, it will work as expected if what you pass to json.Marshal is a pointer.
In any case, let's leave it as it is because that avoids the pitfall of having to remember that only pointers will get marshaled as expected. I think this is the main reason most implementations use a value receiver for MarshalJSON, and it's a solid one IMHO.
There was a problem hiding this comment.
all in all, I'd like to take my original comment back 😄. I was trying to explain the difference between value and pointer receivers but ended up delivering an example of premature optimization.
Besides, there was an error in my reasoning. The concern was that AgentData.Delegations might be big. However, it is a slice, which actually contains a pointer to an underlying array, so the actual memory is not copied even if the slice descriptor is (see https://go.dev/blog/slices-intro if you are interested).
| delegations := make([][]byte, len(ad.Delegations)) | ||
| for i, d := range ad.Delegations { | ||
| b, err := io.ReadAll(d.Archive()) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("reading delegation archive: %w", err) | ||
| } | ||
| delegations[i] = b | ||
| } |
There was a problem hiding this comment.
[go hint] This code is correct. However, a common and idiomatic alternative that allows forgetting about handling indexes explicitly would be:
| delegations := make([][]byte, len(ad.Delegations)) | |
| for i, d := range ad.Delegations { | |
| b, err := io.ReadAll(d.Archive()) | |
| if err != nil { | |
| return nil, fmt.Errorf("reading delegation archive: %w", err) | |
| } | |
| delegations[i] = b | |
| } | |
| delegations := make([][]byte, 0, len(ad.Delegations)) | |
| for _, d := range ad.Delegations { | |
| b, err := io.ReadAll(d.Archive()) | |
| if err != nil { | |
| return nil, fmt.Errorf("reading delegation archive: %w", err) | |
| } | |
| delegations = append(delegations, b) | |
| } |
Note the added parameter in the call to make. It indicates we want the slice to have the required capacity but 0 length. Without it we'd get a slice with len(ad.Delegations) empty byte slices and then we would start appending more slices after those.
| code, err := varint.ReadUvarint(bytes.NewReader(s.Principal)) | ||
| if err != nil { | ||
| return fmt.Errorf("reading private key codec: %s", err) | ||
| } | ||
|
|
||
| switch code { | ||
| case ed25519signer.Code: | ||
| ad.Principal, err = ed25519signer.Decode(s.Principal) | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| case rsasigner.Code: | ||
| ad.Principal, err = rsasigner.Decode(s.Principal) | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| default: | ||
| return fmt.Errorf("invalid private key codec: %d", code) | ||
| } |
There was a problem hiding this comment.
np: alternatively, we could move this logic to a convenience function in go-ucanto/principal
There was a problem hiding this comment.
Yeah, I think we should; I looked for it there when I wrote this. → storacha/go-ucanto#46
| audienceDid, err := did.Parse("did:mailto:example.com:alice") | ||
| require.NoError(t, err) | ||
|
|
||
| del, err := greet.Delegate( |
There was a problem hiding this comment.
given that we don't care about the type of delegation and we are not mocking any special behavior in the greet ability, why not just use any existing ability from go-libstoracha/capabilities and get rid of fixture_test.go?
There was a problem hiding this comment.
Yeaaaaaaah, to be honest, I thought it was going to be a lot less code to write one inline than it turned out to be. 😅 I agree, I'll do that.
| nil, | ||
| ) | ||
|
|
||
| var delegationsCids = func(d agentdata.AgentData) []ipld.Link { |
There was a problem hiding this comment.
why not just:
| var delegationsCids = func(d agentdata.AgentData) []ipld.Link { | |
| func delegationsCids(d agentdata.AgentData) []ipld.Link { |
😉
| type AgentDataStore interface { | ||
| Write(AgentData) error | ||
| Read() (AgentData, error) | ||
| } |
There was a problem hiding this comment.
[go hint] Given that the type will be prefixed by the package name when used outside this package, it's common practice to use the package name as part of the type name in a way. For instance, users of this type will declare variables of type agentdata.AgentDataStore. Thus, we could call this type just Store to remove some redundancy and make its usage a bit more ergonomic.
| type AgentDataStore interface { | ||
| Write(AgentData) error | ||
| Read() (AgentData, error) | ||
| } |
There was a problem hiding this comment.
[go hint] Also, I'd refrain from adding interfaces until there is actually a need for them. If there is a single implementation, the interface just adds a level of indirection. I think part of the beauty of Go as a language is that it is easy to read and follow the code flow, and interfaces put a tax on that.
It is usual that concrete types end up having a corresponding interface because just a simple mock for tests is a different implementation. But I will add the interface only when there is no alternative. IMHO "this interface will likely be required in the future" is not a strong enough argument 😄.
There was a problem hiding this comment.
Yeah, that makes sense. I was planning to have some polymorphic reference to the store somewhere, but the agent hasn't really developed enough to do that yet, so it's really just used by the tests for the one implementation so far. Pretty sure it'll come back, but I'm totally happy to drop it in this PR.
There was a problem hiding this comment.
awesome! Yes, adding all the boilerplate back when required will be easy to do and in the meantime we get much more readable code.
| type FSAgentDataStore struct { | ||
| path string | ||
| } |
There was a problem hiding this comment.
given the simplicity of this type and that it doesn't manage any state, we could also just offer a couple WriteToFile/ReadFromFile functions and simplify the API a bit.
There was a problem hiding this comment.
I'm trying to mimic the JS agent here by making the API potentially polymorphic and open to new stores. But given I can't really think of another use case right now (since this can't go in a browser or anything), maybe it's easy enough to regrow that flexibility later if we need it.
| "github.com/stretchr/testify/require" | ||
| ) | ||
|
|
||
| const dataFilePath = "testdata/agentdata.json" |
There was a problem hiding this comment.
instead of declaring the const here, you could use a cool feature of the testing.T object, which is the TempDir() method.
Within the test itself, you could do:
dataFilePath := path.Join(t.TempDir(), "agentdata.json")and you'll get your own temp dir to use for that test run. The cool thing is that the temp dir will be automatically removed when the test ends, so you won't need the os.RemoveAll at the beginning of the test.
Also, since you'll be using the OS' tmp directory, you can get rid of the testdata directory.
|
|
||
| del, err := greet.Delegate( | ||
| signer, | ||
| simplePrincipal{did: audienceDid}, |
There was a problem hiding this comment.
you don't need to create your own simplePrincipal type for this 🙂. Likely what you were looking for is "github.com/storacha/go-ucanto/did".Parse. It returns a did.DID, which implements the ucan.Principal interface you need here.
There was a problem hiding this comment.
Oh! I didn't realize did.DID implemented DID() to return itself! Then I've already got audienceDid. Cheers!
There was a problem hiding this comment.
oh, true! I missed that you already had a did.DID!
That's exactly what I want, and I was disappointed to find it didn't already exist in all the IPLD stuff. 😞 I want I totally agree with all your points here, but I'd rather defer it all to future work, unless you think the permissions on the key material is a blocker. But if I understand right, what this does is the equivalent of what the JS client does. And I think it's not too hard to evolve this thing as it gets used, rather than design it perfectly up front. As for JSON, literally the only reason to use it is that I spent the entire day trying not to and gave up. 😄 It turned out to be way too much work to do anything more IPLD-native just to have a persistence layer no one will need to interoperate with. |
Making a dummy capability just for the tests is actually more work than using a real one.
Just concretely read/write from files.
|
|
||
| var ad AgentData | ||
| json.Unmarshal(b, &ad) | ||
| return ad, nil |
There was a problem hiding this comment.
@volmedo Is returning the value here a problem? Am I making a whole extra copy for no reason? If I return a pointer, it seems like it makes the API weird.
There was a problem hiding this comment.
yeah, you are making a whole extra copy, but there is a good reason, and that is to not make the API weird 😄.
The flaw in the logic of my comment on MarshalJSON also applies here. Even it is a copy, it's just the pointer, and not the actual memory, what we are copying.
volmedo
left a comment
There was a problem hiding this comment.
this looks good to me, great job!
I agree that JSON is not the best format, but I also agree we can essentially use whatever as there are no compatibility, portability or performance requirements for this and we can evolve it in the future when those come.
As a final nitpick, given that we no longer have AgentDataStore or FSAgentDataStore, and the API was reduced to a couple of functions, I'd put them in agentdata.go, merge all test code in agentdata_test.go and get rid of the rest of the files.
Don't feel pushed to do it, I'm again just trying to be instructive and illustrate that putting all the methods and functions relevant to a type in a single file is a common way of structuring Go code (until the file becomes unwieldy, that is 😄).
Ah! Yeah, that makes a lot of sense. Done. |

This feels like an odd place to cut off the PR, but I don't think there's a better one. This essentially defines two things: An
AgentDatatype that can serialize to and from JSON, and a store implementation that serializesAgentDatato and from the disk. The pieces are tested, but not used yet. TheAgentdoesn't exist yet, but there's room for it to. The pieces will come together in #17, when we really exercise theAgentifiedClientfor the first time.Please review for idiom and convention as much as correctness—I may still be organizing my Go code weirdly. 😅
Closes #16
PR Dependency Tree
AgentDataexists and can persist to filesystem #21 👈This tree was auto-generated by Charcoal