
test(replication): verify per-record expiresAt eviction across nodes #182

Open
kriszyp wants to merge 2 commits into main from kris/upbeat-northcutt-14cddc

@kriszyp
Member

@kriszyp kriszyp commented May 20, 2026

Summary

  • Adds v5→v5 multi-record batch TTL eviction test to replicationTopology suite
  • Adds v4→v5 cross-version TTL eviction test to the existing legacy replication test

Background

Field report: records arriving via cross-version replication from v4 peers don't evict on v5 receivers, even past their expiresAtTimestamp. Local writes on the same v5 node evict correctly.

Root causes found during investigation:

1. Harper core (v5) — batched transaction context bug (HarperFast/harper#639, #640)

In a multi-record replication batch, every event after the first is processed with the first event as context. The options object passed to _writeUpdate didn't include expiresAt, and _writeUpdate read expiresAt only from context, so subsequent records in a batch received no expiration. scheduleCleanup() was also armed only when context.expiresAt was truthy, so it was never armed for replicated writes.
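To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is a simplified model, not Harper's code: `writeUpdateContextOnly`, `writeUpdateWithOptions`, and `applyBatch` are hypothetical stand-ins, and the model assumes the shared batch context carries no expiresAt of its own.

```javascript
// Hypothetical, simplified model of the batched write path (not Harper's code).

// Pre-fix behavior as described: expiration is read from the context only,
// so a per-record value never reaches the stored record.
function writeUpdateContextOnly(store, record, options, context) {
  const expiresAt = context.expiresAt;
  store.set(record.id, { ...record, expiresAt });
  return Boolean(expiresAt); // whether cleanup would be armed
}

// Sketch of the fix: prefer the per-record value carried in options,
// and arm cleanup whenever any expiration was resolved.
function writeUpdateWithOptions(store, record, options, context) {
  const expiresAt = options.expiresAt ?? context.expiresAt;
  store.set(record.id, { ...record, expiresAt });
  return Boolean(expiresAt);
}

function applyBatch(events, writeUpdate) {
  const store = new Map();
  let cleanupArmed = false;
  // Assumption: one context object is shared by the whole batch and has no expiresAt.
  const context = { firstEventId: events.length ? events[0].id : undefined };
  for (const event of events) {
    const armed = writeUpdate(store, { id: event.id }, { expiresAt: event.expiresAt }, context);
    cleanupArmed = cleanupArmed || armed;
  }
  return { store, cleanupArmed };
}

// Four records in one batch, each with its own expiresAt.
const events = [1, 2, 3, 4].map((id) => ({ id, expiresAt: 1000 + id }));
const beforeFix = applyBatch(events, writeUpdateContextOnly);
const afterFix = applyBatch(events, writeUpdateWithOptions);
```

In this model the context-only variant stores every record without an expiration and never arms cleanup, while the options-aware variant keeps each record's own expiresAt.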

2. harperdb v4 — table copy encoding bug (harperdb/harperdb#3120)

When a new node joins a v4 cluster, the table copy path calls createAuditEntry with entry.metadataFlags & ~0xff as extendedType. This masks HAS_EXPIRATION (0x10, lower byte) without promoting it to HAS_EXPIRATION_EXTENDED_TYPE (0x1000), so the expiresAt float64 is never encoded into the binary payload. Receivers see auditRecord.expiresAt === undefined for every table-copied record.
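The bit arithmetic can be checked in isolation. The sketch below uses the two flag values named above. The function names and the `0x2000` upper-byte flag are illustrative assumptions, and the promotion shown is one plausible shape for a fix, not necessarily what harperdb/harperdb#3120 does.

```javascript
const HAS_EXPIRATION = 0x10; // lower-byte flag, from the description above
const HAS_EXPIRATION_EXTENDED_TYPE = 0x1000; // extended-type flag, from the description above

// Behavior described for the v4 table copy path: clear the lower byte and nothing else.
function extendedTypeMaskOnly(metadataFlags) {
  return metadataFlags & ~0xff;
}

// Hypothetical fix: clear the lower byte, but carry the expiration bit over first.
function extendedTypePromoting(metadataFlags) {
  let extendedType = metadataFlags & ~0xff;
  if (metadataFlags & HAS_EXPIRATION) {
    extendedType |= HAS_EXPIRATION_EXTENDED_TYPE;
  }
  return extendedType;
}

// 0x2000 is an arbitrary upper-byte flag used only to show that upper bits survive.
const flags = 0x2000 | HAS_EXPIRATION;
const masked = extendedTypeMaskOnly(flags); // 0x2000: the expiration bit is gone
const promoted = extendedTypePromoting(flags); // 0x3000: the expiration bit is carried over
```

With the mask alone, nothing in the resulting extendedType tells the encoder that an expiration is present, which matches the observed `auditRecord.expiresAt === undefined` on receivers.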

Test design

v5→v5 test: Creates a 1-second TTL table on node 0, upserts 4 records in one batch (exercising the batched-txn path), waits for replication confirmation, then asserts all 4 records are gone from every node after TTL+buffer.
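The shape of that assertion can be sketched against an in-memory stand-in. Everything here is hypothetical: `FakeNode` and `survivingRecords` are not part of the test suite, the 500 ms buffer is an assumed value, and time is passed in explicitly so the sketch stays deterministic. The real test talks to live nodes and waits on a wall clock.

```javascript
// In-memory stand-in for a cluster node (hypothetical; a real test queries each node).
class FakeNode {
  constructor() {
    this.records = new Map();
  }
  // Store a batch of records that all expire ttlMs after `now`.
  upsert(records, ttlMs, now) {
    for (const record of records) {
      this.records.set(record.id, { ...record, expiresAt: now + ttlMs });
    }
  }
  // Return the record, or undefined if it is missing or has expired by `now`.
  get(id, now) {
    const record = this.records.get(id);
    if (record === undefined) return undefined;
    if (record.expiresAt <= now) {
      this.records.delete(id);
      return undefined;
    }
    return record;
  }
}

// Collect every (node, id) pair that is still readable at time `now`.
function survivingRecords(nodes, ids, now) {
  const survivors = [];
  nodes.forEach((node, nodeIndex) => {
    for (const id of ids) {
      if (node.get(id, now) !== undefined) survivors.push({ nodeIndex, id });
    }
  });
  return survivors;
}

const TTL_MS = 1000; // 1-second TTL, as in the test
const BUFFER_MS = 500; // assumed slack for cleanup scheduling
const nodes = [new FakeNode(), new FakeNode(), new FakeNode()];
const batch = [1, 2, 3, 4].map((id) => ({ id }));
const ids = batch.map((record) => record.id);

// Stands in for "write on node 0, then wait for replication confirmation".
for (const node of nodes) node.upsert(batch, TTL_MS, 0);

const beforeTtl = survivingRecords(nodes, ids, 10);
const afterTtl = survivingRecords(nodes, ids, TTL_MS + BUFFER_MS);
```

Reporting the surviving (node, id) pairs rather than a bare boolean makes a failure show which node kept which record.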

v4→v5 test: Pre-creates the table on the v5 cluster with the same expiration value before connecting to the v4 node. This works around the v4 table-copy encoding bug: when an incoming record carries no expiresAt, the fallback expirationMs + Date.now() on the v5 receiver computes a fresh expiresAt at receipt time. The test then writes records on the v4 node, waits for replication, and asserts eviction.
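A minimal sketch of that receiver-side fallback follows, assuming it reduces to "use the replicated value if present, otherwise table expiration plus receipt time". The function name `resolveExpiresAt` and the `table.expirationMs` shape are hypothetical. Only the `expirationMs + Date.now()` expression comes from the description above.

```javascript
// Hypothetical helper: decide which expiresAt a received record should get.
// `now` is injectable so the behavior can be checked without a real clock.
function resolveExpiresAt(auditRecord, table, now = Date.now()) {
  // Prefer the value encoded by the sender when it survived replication.
  if (auditRecord.expiresAt !== undefined) return auditRecord.expiresAt;
  // Fallback: the local table's expiration, counted from receipt time.
  if (table.expirationMs > 0) return table.expirationMs + now;
  // No sender value and no table expiration: the record never expires.
  return undefined;
}

const fromFallback = resolveExpiresAt({}, { expirationMs: 1000 }, 5000);
const fromSender = resolveExpiresAt({ expiresAt: 4200 }, { expirationMs: 1000 }, 5000);
const noExpiration = resolveExpiresAt({}, { expirationMs: 0 }, 5000);
```

Because the fallback counts from receipt time, a record that takes this path can outlive its copy on the origin node by up to the replication delay. This is also why the table has to exist on the v5 side with a matching expiration before the v4 node connects.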

Dependencies

  • Requires HarperFast/harper#639 for the v5→v5 and v4→v5 tests to pass reliably (core submodule bump needed after merge)

Signed-off by Claude

Adds two test cases to the replicationTopology integration suite:

1. v5→v5 multi-record batch TTL: writes 4 records in a single upsert
   (which the replication layer batches as one transaction) to a table
   with a 1-second expiration, then asserts all nodes evict every record
   after the TTL window. This exercises the fix for the batched-txn
   expiresAt propagation bug where only the first record's context was
   used for all subsequent records.

2. v4→v5 cross-version TTL: pre-creates a table on the v5 cluster with
   the same expiration as the v4 table (workaround for v4's table-copy
   encoding bug), writes records on the v4 node, waits for replication,
   then asserts eviction occurs on all v5 receivers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kriszyp kriszyp requested a review from a team as a code owner May 20, 2026 16:04
@claude
Contributor

claude Bot commented May 20, 2026

Reviewed; no blockers found.

3 participants