Make LogEntry::Metadata a generic associated type#210
Make LogEntry::Metadata a generic associated type#210lukevalenta wants to merge 1 commit intomainfrom
Conversation
|
/bonk review |
There was a problem hiding this comment.
Review Summary
This is a well-structured generalization that opens the door for future MTC metadata types without disrupting existing applications. The design is clean and the backward-compatibility story is solid.
What I like
Trait design — The LogEntry::Metadata associated type with make_metadata(leaf_index, timestamp, old_tree_size, new_tree_size) is a good API. Passing all four sequencer-computed values and letting each application pick what it needs is future-proof without being overly complex.
Binary format preservation — The CacheSerialize trait with a hand-rolled binary impl for SequenceMetadata preserves the existing 32-byte wire format in Durable Object storage. The test_sequence_metadata_cache_format_unchanged regression test is an excellent addition — this is exactly the kind of thing that catches subtle corruption bugs before they reach production.
impl_json_cache_serialize! macro — Good escape hatch for new metadata types that don't need a custom binary format. The JSON fallback is the right default for types that aren't performance-critical in the dedup cache path.
Invariant assertion — assert_eq!(n, new_size, ...) at line 929 of log_ops.rs is a welcome safety net. Computing new_size upfront and asserting it matches the loop counter makes the sequencing logic easier to reason about.
Minor observations (non-blocking)
-
CacheSerializedeserialization error type — The trait usesResult<..., String>as the error type. This works fine since callers convert viaInto<worker::Error>, but a dedicated error type orBox<dyn Error>would be more idiomatic Rust. Not worth changing now, just noting for future reference. -
Bound repetition — The
M: CacheSerialize + Serialize + DeserializeOwned + Clone + Copy + Default + 'staticbound appears on several impl blocks (DedupCache<M>,CacheWrite for DedupCache<M>,CacheRead for DedupCache<M>,GenericBatcher<M>). A trait alias or a supertrait onCacheSerializethat bundles these constraints would reduce duplication, though this is purely cosmetic. -
dedup_cache_fifo_keyextraction — The free functiondedup_cache_fifo_keywas extracted to supportcompute_cache_keys_to_loadbeing called without aDedupCacheinstance (sinceDedupCacheis now generic). This is a clean solution. TheDedupCache::fifo_keymethod delegates to it, keeping the two in sync.
Tests
All existing tests pass. New tests cover the JSON serialization path (test_serialize_deserialize_json_roundtrip, test_deserialize_invalid_json) and the binary format regression (test_sequence_metadata_cache_format_unchanged). Clippy is clean with -D warnings.
LGTM.
|
I've reviewed and approved PR #210. Here's a summary: Verdict: Approved The PR cleanly generalizes
All tests pass and clippy is clean. |
d173bd4 to
e6574ec
Compare
Adds a Metadata associated type to the LogEntry trait in tlog_tiles, allowing each tlog application to define its own sequence metadata type rather than sharing the fixed (LeafIndex, UnixTimestamp) tuple. tlog_tiles: - Adds LogEntry::Metadata: Serialize + DeserializeOwned + Copy + Default - Adds LogEntry::make_metadata(leaf_index, timestamp, old_tree_size, new_tree_size) to construct the metadata from raw sequencer values - TlogTilesLogEntry::Metadata = SequenceMetadata (unchanged behavior) - SequenceMetadata type alias is preserved for backward compatibility - Improve make_metadata doc comment: clarify that SequenceMetadata ignores tree sizes while other implementations may use all four args static_ct_api, bootstrap_mtc_api: - Metadata = SequenceMetadata; make_metadata ignores tree sizes generic_log_worker: - All cache infrastructure (DedupCache, MemoryCache, CacheRead, CacheWrite) is now generic over M: the metadata type - serialize_entries/deserialize_entries switched from fixed 32-byte binary format to serde_json (the binary format only worked for the fixed-size (u64, u64) tuple) - PoolState<P, M>, AddLeafResult<M>, add_leaf_to_pool, sequence_entries, GenericSequencer, GenericBatcher all generified over M - sequence_entries calls L::make_metadata(n, timestamp, old_size, new_size) - Assert n == new_size after the sequencing loop - CheckpointCallbacker gains old_tree_size and new_tree_size parameters so callbacks can identify which subtrees were covered by each batch
e6574ec to
0777cec
Compare
| use worker::*; | ||
|
|
||
| pub struct GenericBatcher { | ||
| pub struct GenericBatcher<M> { |
There was a problem hiding this comment.
Would be good to comment what M is. Or even better the whole struct if you feel like it
|
|
||
| // A batch of entries to be submitted to the Sequencer together. | ||
| struct Batch { | ||
| struct Batch<M> { |
|
|
||
| // A fixed-size in-memory FIFO cache. | ||
| struct MemoryCache { | ||
| struct MemoryCache<M> { |
|
|
||
| // Stage leftover partial data tile, if any. | ||
| if n != old_size && n % u64::from(TlogTile::FULL_WIDTH) != 0 { | ||
| if new_size != old_size && !new_size.is_multiple_of(u64::from(TlogTile::FULL_WIDTH)) { |
Adds a Metadata associated type to the LogEntry trait in tlog_tiles,
allowing each tlog application to define its own sequence metadata type
rather than sharing the fixed (LeafIndex, UnixTimestamp) tuple.
tlog_tiles:
new_tree_size) to construct the metadata from raw sequencer values
static_ct_api, bootstrap_mtc_api:
generic_log_worker:
CacheWrite) is now generic over M: the metadata type
binary format to serde_json (the binary format only worked for the
fixed-size (u64, u64) tuple)
GenericSequencer, GenericBatcher all generified over M
so callbacks can identify which subtrees were covered by each batch