Skip to content

AgentData exists and can persist to filesystem#21

Merged
Peeja merged 7 commits intomainfrom
feat/login/agent-data-store
Jun 6, 2025
Merged

AgentData exists and can persist to filesystem#21
Peeja merged 7 commits intomainfrom
feat/login/agent-data-store

Conversation

@Peeja
Copy link
Copy Markdown
Member

@Peeja Peeja commented Jun 4, 2025

This feels like an odd place to cut off the PR, but I don't think there's a better one. This essentially defines two things: An AgentData type that can serialize to and from JSON, and a store implementation that serializes AgentData to and from the disk. The pieces are tested, but not used yet. The Agent doesn't exist yet, but there's room for it to. The pieces will come together in #17, when we really exercise the Agentified Client for the first time.

Please review for idiom and convention as much as correctness—I may still be organizing my Go code weirdly. 😅

Closes #16

PR Dependency Tree

This tree was auto-generated by Charcoal

@Peeja Peeja force-pushed the feat/login/agent-data-store branch from bfce2af to 3f9b663 Compare June 4, 2025 18:34
@Peeja Peeja force-pushed the feat/login/agent-data-store branch from 3f9b663 to 7215e2d Compare June 4, 2025 20:17
@Peeja Peeja changed the title WIP - Trying to round-trip a data structure AgentData exists and can persist to filesystem Jun 4, 2025
@Peeja Peeja marked this pull request as ready for review June 4, 2025 20:23
@alanshaw
Copy link
Copy Markdown
Member

alanshaw commented Jun 4, 2025

I'd be tempted to define AgentData as an interface that exposes a Prinicpal() function and a Delegations() function that returns an iterator for delegations.

An implementation would read and write to a PEM file per the RFC so that appropriate permissions could be set. It might also allow the private key to be read into memory temporarily when needed rather than kept around permanently.

For delegations I'd be tempted to store each one as a CAR file in a directory. The iterator would allow you to not have to buffer all the delegations into memory at once, and having each delegation as a CAR would make it easier to inspect each one individually for debugging. It's also easier to add and remove delegations - you don't have to encode and decode everything every time you make a small change.

Thinking about it, you kinda just want a blockstore and some sort of index allowing you to match on audience, ability and resource to return a delegation (and it's linked proofs if it has any).

You could of course have an implementation that does what you're doing here (saving everything to a single JSON) but I'm not convinced you're gaining much from the format - everything is encoded as bytes so things are not really any more readable and I'm assuming interop with the JS implementation is not a requirement.

Copy link
Copy Markdown
Contributor

@volmedo volmedo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is so cool Petra, welcome to Go land!

Most of the comments are on the instructive/educational side. I think code reviews are a great way of learning a new language, especially the subtleties and common pitfalls. Let me know if you find them useful and/or want to know more about any other aspect.

Delegations [][]byte
}

func (ad AgentData) MarshalJSON() ([]byte, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[go hint] I'd use a pointer receiver here, even though we are not mutating it within the method. The general recommendation is to use value receivers until there is a reason to use a pointer receiver. In this case, however, AgentData.Delegations could potentially contain several elements. Using a pointer receiver avoids copying the whole thing on each method call, which is what happens when using value receivers.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that would make sense, but unfortunately, it doesn't work. The MarshalJSON API is defined with a value receiver. Switching it to a pointer means it just stops getting marshaled. 😕

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a tricky aspect of Go. Methods that are defined on the value type can be invoked from both values and pointers. OTOH, methods defined on the pointer type can only be accessed from pointers.

This means that if you define MarshalJSON on the value type, both values and pointers implement the Marshaler interface. If it is defined on the pointer type, only pointers will implement the interface. Thus, it will work as expected if what you pass to json.Marshal is a pointer.

In any case, let's leave it as it is because that avoids the pitfall of having to remember that only pointers will get marshaled as expected. I think this is the main reason most implementations use a value receiver for MarshalJSON, and it's a solid one IMHO.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all in all, I'd like to take my original comment back 😄. I was trying to explain the difference between value and pointer receivers but ended up delivering an example of premature optimization.

Besides, there was an error in my reasoning. The concern was that AgentData.Delegations might be big. However, it is a slice, which actually contains a pointer to an underlying array, so the actual memory is not copied even if the slice descriptor is (see https://go.dev/blog/slices-intro if you are interested).

Comment thread agent/agentdata/agentdata.go Outdated
Comment on lines +27 to +34
delegations := make([][]byte, len(ad.Delegations))
for i, d := range ad.Delegations {
b, err := io.ReadAll(d.Archive())
if err != nil {
return nil, fmt.Errorf("reading delegation archive: %w", err)
}
delegations[i] = b
}
Copy link
Copy Markdown
Contributor

@volmedo volmedo Jun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[go hint] This code is correct. However, a common and idiomatic alternative that allows forgetting about handling indexes explicitly would be:

Suggested change
delegations := make([][]byte, len(ad.Delegations))
for i, d := range ad.Delegations {
b, err := io.ReadAll(d.Archive())
if err != nil {
return nil, fmt.Errorf("reading delegation archive: %w", err)
}
delegations[i] = b
}
delegations := make([][]byte, 0, len(ad.Delegations))
for _, d := range ad.Delegations {
b, err := io.ReadAll(d.Archive())
if err != nil {
return nil, fmt.Errorf("reading delegation archive: %w", err)
}
delegations = append(delegations, b)
}

Note the added parameter in the call to make. It indicates we want the slice to have the required capacity but 0 length. Without it we'd get a slice with len(ad.Delegations) empty byte slices and then we would start appending more slices after those.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh, awesome, thanks!

Comment on lines +50 to +70
code, err := varint.ReadUvarint(bytes.NewReader(s.Principal))
if err != nil {
return fmt.Errorf("reading private key codec: %s", err)
}

switch code {
case ed25519signer.Code:
ad.Principal, err = ed25519signer.Decode(s.Principal)
if err != nil {
return err
}

case rsasigner.Code:
ad.Principal, err = rsasigner.Decode(s.Principal)
if err != nil {
return err
}

default:
return fmt.Errorf("invalid private key codec: %d", code)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np: alternatively, we could move this logic to a convenience function in go-ucanto/principal

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we should; I looked for it there when I wrote this. → storacha/go-ucanto#46

Comment thread agent/agentdata/agentdata_test.go Outdated
audienceDid, err := did.Parse("did:mailto:example.com:alice")
require.NoError(t, err)

del, err := greet.Delegate(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that we don't care about the type of delegation and we are not mocking any special behavior in the greet ability, why not just use any existing ability from go-libstoracha/capabilities and get rid of fixture_test.go?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeaaaaaaah, to be honest, I thought it was going to be a lot less code to write one inline than it turned out to be. 😅 I agree, I'll do that.

Comment thread agent/agentdata/fixture_test.go Outdated
nil,
)

var delegationsCids = func(d agentdata.AgentData) []ipld.Link {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just:

Suggested change
var delegationsCids = func(d agentdata.AgentData) []ipld.Link {
func delegationsCids(d agentdata.AgentData) []ipld.Link {

😉

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Comment thread agent/agentdata/store.go Outdated
Comment on lines +8 to +11
type AgentDataStore interface {
Write(AgentData) error
Read() (AgentData, error)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[go hint] Given that the type will be prefixed by the package name when used outside this package, it's common practice to use the package name as part of the type name in a way. For instance, users of this type will declare variables of type agentdata.AgentDataStore. Thus, we could call this type just Store to remove some redundancy and make its usage a bit more ergonomic.

Comment thread agent/agentdata/store.go Outdated
Comment on lines +8 to +11
type AgentDataStore interface {
Write(AgentData) error
Read() (AgentData, error)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[go hint] Also, I'd refrain from adding interfaces until there is actually a need for them. If there is a single implementation, the interface just adds a level of indirection. I think part of the beauty of Go as a language is that it is easy to read and follow the code flow, and interfaces put a tax on that.

It is usual that concrete types end up having a corresponding interface because just a simple mock for tests is a different implementation. But I will add the interface only when there is no alternative. IMHO "this interface will likely be required in the future" is not a strong enough argument 😄.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense. I was planning to have some polymorphic reference to the store somewhere, but the agent hasn't really developed enough to do that yet, so it's really just used by the tests for the one implementation so far. Pretty sure it'll come back, but I'm totally happy to drop it in this PR.

Copy link
Copy Markdown
Contributor

@volmedo volmedo Jun 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome! Yes, adding all the boilerplate back when required will be easy to do and in the meantime we get much more readable code.

Comment thread agent/agentdata/store.go Outdated
Comment on lines +13 to +15
type FSAgentDataStore struct {
path string
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given the simplicity of this type and that it doesn't manage any state, we could also just offer a couple WriteToFile/ReadFromFile functions and simplify the API a bit.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to mimic the JS agent here by making the API potentially polymorphic and open to new stores. But given I can't really think of another use case right now (since this can't go in a browser or anything), maybe it's easy enough to regrow that flexibility later if we need it.

Comment thread agent/agentdata/store_test.go Outdated
"github.com/stretchr/testify/require"
)

const dataFilePath = "testdata/agentdata.json"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of declaring the const here, you could use a cool feature of the testing.T object, which is the TempDir() method.

Within the test itself, you could do:

dataFilePath := path.Join(t.TempDir(), "agentdata.json")

and you'll get your own temp dir to use for that test run. The cool thing is that the temp dir will be automatically removed when the test ends, so you won't need the os.RemoveAll at the beginning of the test.

Also, since you'll be using the OS' tmp directory, you can get rid of the testdata directory.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!!! That's fantastic!

Comment thread agent/agentdata/store_test.go Outdated

del, err := greet.Delegate(
signer,
simplePrincipal{did: audienceDid},
Copy link
Copy Markdown
Contributor

@volmedo volmedo Jun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need to create your own simplePrincipal type for this 🙂. Likely what you were looking for is "github.com/storacha/go-ucanto/did".Parse. It returns a did.DID, which implements the ucan.Principal interface you need here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I didn't realize did.DID implemented DID() to return itself! Then I've already got audienceDid. Cheers!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, true! I missed that you already had a did.DID!

@Peeja
Copy link
Copy Markdown
Member Author

Peeja commented Jun 5, 2025

Thinking about it, you kinda just want a blockstore and some sort of index allowing you to match on audience, ability and resource to return a delegation (and it's linked proofs if it has any).

That's exactly what I want, and I was disappointed to find it didn't already exist in all the IPLD stuff. 😞 I want quadstore for blocks, or an easy way to query over CARs like querying over HDTs.

I totally agree with all your points here, but I'd rather defer it all to future work, unless you think the permissions on the key material is a blocker. But if I understand right, what this does is the equivalent of what the JS client does. And I think it's not too hard to evolve this thing as it gets used, rather than design it perfectly up front.

As for JSON, literally the only reason to use it is that I spent the entire day trying not to and gave up. 😄 It turned out to be way too much work to do anything more IPLD-native just to have a persistence layer no one will need to interoperate with.

Peeja added 4 commits June 5, 2025 11:56
Making a dummy capability just for the tests is actually more work than
using a real one.
Just concretely read/write from files.
@Peeja Peeja requested a review from volmedo June 5, 2025 16:37
Comment thread agent/agentdata/store.go Outdated

var ad AgentData
json.Unmarshal(b, &ad)
return ad, nil
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@volmedo Is returning the value here a problem? Am I making a whole extra copy for no reason? If I return a pointer, it seems like it makes the API weird.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, you are making a whole extra copy, but there is a good reason, and that is to not make the API weird 😄.

The flaw in the logic of my comment on MarshalJSON also applies here. Even it is a copy, it's just the pointer, and not the actual memory, what we are copying.

Copy link
Copy Markdown
Contributor

@volmedo volmedo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks good to me, great job!

I agree that JSON is not the best format, but I also agree we can essentially use whatever as there are no compatibility, portability or performance requirements for this and we can evolve it in the future when those come.

As a final nitpick, given that we no longer have AgentDataStore or FSAgentDataStore, and the API was reduced to a couple of functions, I'd put them in agentdata.go, merge all test code in agentdata_test.go and get rid of the rest of the files.

Don't feel pushed to do it, I'm again just trying to be instructive and illustrate that putting all the methods and functions relevant to a type in a single file is a common way of structuring Go code (until the file becomes unwieldy, that is 😄).

@Peeja
Copy link
Copy Markdown
Member Author

Peeja commented Jun 6, 2025

As a final nitpick, given that we no longer have AgentDataStore or FSAgentDataStore, and the API was reduced to a couple of functions, I'd put them in agentdata.go, merge all test code in agentdata_test.go and get rid of the rest of the files.

Ah! Yeah, that makes a lot of sense. Done.

@Peeja Peeja closed this Jun 6, 2025
@Peeja Peeja deleted the feat/login/agent-data-store branch June 6, 2025 13:48
@Peeja Peeja restored the feat/login/agent-data-store branch June 6, 2025 13:48
@Peeja Peeja reopened this Jun 6, 2025
@Peeja Peeja merged commit 98225ae into main Jun 6, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Go client should have a persistent agent with delegations and other data

3 participants