Skip to content

Make LogEntry::Metadata a generic associated type#210

Open
lukevalenta wants to merge 1 commit intomainfrom
lvalenta/metadata-associated-type-v2
Open

Make LogEntry::Metadata a generic associated type#210
lukevalenta wants to merge 1 commit intomainfrom
lvalenta/metadata-associated-type-v2

Conversation

@lukevalenta
Copy link
Copy Markdown
Contributor

@lukevalenta lukevalenta commented Apr 13, 2026

Adds a Metadata associated type to the LogEntry trait in tlog_tiles,
allowing each tlog application to define its own sequence metadata type
rather than sharing the fixed (LeafIndex, UnixTimestamp) tuple.

tlog_tiles:

  • Adds LogEntry::Metadata: Serialize + DeserializeOwned + Copy + Default
  • Adds LogEntry::make_metadata(leaf_index, timestamp, old_tree_size,
    new_tree_size) to construct the metadata from raw sequencer values
  • TlogTilesLogEntry::Metadata = SequenceMetadata (unchanged behavior)
  • SequenceMetadata type alias is preserved for backward compatibility

static_ct_api, bootstrap_mtc_api:

  • Metadata = SequenceMetadata; make_metadata ignores tree sizes

generic_log_worker:

  • All cache infrastructure (DedupCache, MemoryCache, CacheRead,
    CacheWrite) is now generic over M: the metadata type
  • serialize_entries/deserialize_entries switched from fixed 32-byte
    binary format to serde_json (the binary format only worked for the
    fixed-size (u64, u64) tuple)
  • PoolState<P, M>, AddLeafResult, add_leaf_to_pool, sequence_entries,
    GenericSequencer, GenericBatcher all generified over M
  • sequence_entries calls L::make_metadata(n, timestamp, old_size, new_size)
  • Assert n == new_size after the sequencing loop
  • CheckpointCallbacker gains old_tree_size and new_tree_size parameters
    so callbacks can identify which subtrees were covered by each batch

@lukevalenta
Copy link
Copy Markdown
Contributor Author

/bonk review

Copy link
Copy Markdown

@ask-bonk ask-bonk bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

This is a well-structured generalization that opens the door for future MTC metadata types without disrupting existing applications. The design is clean and the backward-compatibility story is solid.

What I like

Trait design — The LogEntry::Metadata associated type with make_metadata(leaf_index, timestamp, old_tree_size, new_tree_size) is a good API. Passing all four sequencer-computed values and letting each application pick what it needs is future-proof without being overly complex.

Binary format preservation — The CacheSerialize trait with a hand-rolled binary impl for SequenceMetadata preserves the existing 32-byte wire format in Durable Object storage. The test_sequence_metadata_cache_format_unchanged regression test is an excellent addition — this is exactly the kind of thing that catches subtle corruption bugs before they reach production.

impl_json_cache_serialize! macro — Good escape hatch for new metadata types that don't need a custom binary format. The JSON fallback is the right default for types that aren't performance-critical in the dedup cache path.

Invariant assertionassert_eq!(n, new_size, ...) at line 929 of log_ops.rs is a welcome safety net. Computing new_size upfront and asserting it matches the loop counter makes the sequencing logic easier to reason about.

Minor observations (non-blocking)

  1. CacheSerialize deserialization error type — The trait uses Result<..., String> as the error type. This works fine since callers convert via Into<worker::Error>, but a dedicated error type or Box<dyn Error> would be more idiomatic Rust. Not worth changing now, just noting for future reference.

  2. Bound repetition — The M: CacheSerialize + Serialize + DeserializeOwned + Clone + Copy + Default + 'static bound appears on several impl blocks (DedupCache<M>, CacheWrite for DedupCache<M>, CacheRead for DedupCache<M>, GenericBatcher<M>). A trait alias or a supertrait on CacheSerialize that bundles these constraints would reduce duplication, though this is purely cosmetic.

  3. dedup_cache_fifo_key extraction — The free function dedup_cache_fifo_key was extracted to support compute_cache_keys_to_load being called without a DedupCache instance (since DedupCache is now generic). This is a clean solution. The DedupCache::fifo_key method delegates to it, keeping the two in sync.

Tests

All existing tests pass. New tests cover the JSON serialization path (test_serialize_deserialize_json_roundtrip, test_deserialize_invalid_json) and the binary format regression (test_sequence_metadata_cache_format_unchanged). Clippy is clean with -D warnings.

LGTM.

@ask-bonk
Copy link
Copy Markdown

ask-bonk bot commented Apr 13, 2026

I've reviewed and approved PR #210. Here's a summary:

Verdict: Approved

The PR cleanly generalizes LogEntry with a Metadata associated type, allowing each tlog application to define its own sequence metadata rather than being locked into the fixed (LeafIndex, UnixTimestamp) tuple. Key strengths:

  • Good trait designmake_metadata(leaf_index, timestamp, old_tree_size, new_tree_size) passes all sequencer values, letting each application pick what it needs
  • Binary format preserved — The CacheSerialize trait with a hand-rolled impl for SequenceMetadata keeps the existing 32-byte wire format in DO storage, with a regression test to guard against future breakage
  • Clean generification — All cache infrastructure (DedupCache, MemoryCache, CacheRead, CacheWrite), pool state, and batcher/sequencer DOs are properly parameterized over M
  • impl_json_cache_serialize! macro — Good escape hatch for new metadata types that don't need custom binary serialization
  • Safety assertionassert_eq!(n, new_size) after the sequencing loop catches logic errors early

All tests pass and clippy is clean.

github run

@lukevalenta lukevalenta force-pushed the lvalenta/metadata-associated-type-v2 branch 2 times, most recently from d173bd4 to e6574ec Compare April 14, 2026 03:09
Copy link
Copy Markdown
Contributor

@lbaquerofierro lbaquerofierro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice change overall. The backward compatibility is really well done

@lukevalenta lukevalenta self-assigned this Apr 14, 2026
Adds a Metadata associated type to the LogEntry trait in tlog_tiles,
allowing each tlog application to define its own sequence metadata type
rather than sharing the fixed (LeafIndex, UnixTimestamp) tuple.

tlog_tiles:
- Adds LogEntry::Metadata: Serialize + DeserializeOwned + Copy + Default
- Adds LogEntry::make_metadata(leaf_index, timestamp, old_tree_size,
  new_tree_size) to construct the metadata from raw sequencer values
- TlogTilesLogEntry::Metadata = SequenceMetadata (unchanged behavior)
- SequenceMetadata type alias is preserved for backward compatibility
- Improve make_metadata doc comment: clarify that SequenceMetadata
  ignores tree sizes while other implementations may use all four args

static_ct_api, bootstrap_mtc_api:
- Metadata = SequenceMetadata; make_metadata ignores tree sizes

generic_log_worker:
- All cache infrastructure (DedupCache, MemoryCache, CacheRead,
  CacheWrite) is now generic over M: the metadata type
- serialize_entries/deserialize_entries switched from fixed 32-byte
  binary format to serde_json (the binary format only worked for the
  fixed-size (u64, u64) tuple)
- PoolState<P, M>, AddLeafResult<M>, add_leaf_to_pool, sequence_entries,
  GenericSequencer, GenericBatcher all generified over M
- sequence_entries calls L::make_metadata(n, timestamp, old_size, new_size)
- Assert n == new_size after the sequencing loop
- CheckpointCallbacker gains old_tree_size and new_tree_size parameters
  so callbacks can identify which subtrees were covered by each batch
@lukevalenta lukevalenta force-pushed the lvalenta/metadata-associated-type-v2 branch from e6574ec to 0777cec Compare April 14, 2026 17:25
use worker::*;

pub struct GenericBatcher {
pub struct GenericBatcher<M> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to comment what M is. Or even better the whole struct if you feel like it


// A batch of entries to be submitted to the Sequencer together.
struct Batch {
struct Batch<M> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto


// A fixed-size in-memory FIFO cache.
struct MemoryCache {
struct MemoryCache<M> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto


// Stage leftover partial data tile, if any.
if n != old_size && n % u64::from(TlogTile::FULL_WIDTH) != 0 {
if new_size != old_size && !new_size.is_multiple_of(u64::from(TlogTile::FULL_WIDTH)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants