Skip to content

Formal Verification

Anup Ghatage edited this page Feb 12, 2026 · 1 revision

Formal Verification

Zeppelin uses TLA+ to formally verify correctness of its concurrent algorithms. These specs have found real bugs and proven key invariants before deployment.

Why TLA+

Distributed systems with concurrent writes, compaction, and queries have subtle race conditions that are nearly impossible to find through testing alone. TLA+ exhaustively explores all possible interleavings to prove or disprove invariants.

Key results:

  • Found real bugs in compaction safety and query read consistency before they hit production
  • Proved correctness of the multi-writer lease protocol, bitmap filter algebra, and BM25 query consistency
  • 26,000+ states explored across 15 specifications

Specifications

Compaction & WAL Safety

Spec Invariants States What it verifies
CompactionSafety.tla 5 ~5,000 No data loss during WAL → segment compaction. Concurrent query sees consistent view. Deferred deletion is safe.
QueryReadConsistency.tla 4 ~3,000 Strong queries never miss committed writes. WAL overrides segment on conflict. Deleted vectors excluded.
MultiWriterLease.tla 4 24,965 Fencing tokens prevent zombie writes. Two-layer defense (fencing + CAS) is both necessary and sufficient.

BM25 Full-Text Search

Spec Invariants States What it verifies
BM25QueryConsistency.tla 5 272 Committed docs never missed. WAL overrides segment on merge. Deleted docs never returned. Concurrent write + compaction + query.
InvertedIndexBuildSafety.tla 4 139 Posting lists cover all docs. IDF stats consistent. Compaction preserves non-deleted data.
BM25ScoreCommutativity.tla 5 5 Sum commutativity, WAL vs segment scoring equivalence, Product associativity, TopK stability under tie-breaking.

Bitmap Pre-Filters

Spec Invariants States What it verifies
BitmapFilterAlgebra.tla 6 324 And/Or/Not/Eq/Range/In filter evaluation matches brute-force. Bitmap intersection = logical AND.
BitmapBuildCoverage.tla 4 65 Build process produces correct bitmaps from attribute data. Numeric keys sorted. High-cardinality excluded.
BitmapSearchEquivalence.tla 3 8 Bitmap-accelerated search returns same results as post-filter search.

Hierarchical ANN

Spec Invariants States What it verifies
HierarchicalVectorCoverage.tla 4 263 Partition correctness: every vector reachable, no duplicates across clusters.
BeamSearchTermination.tla 5 30 Beam search terminates on both balanced and unbalanced trees. Monotonic progress.
HierarchicalBuildAtomicity.tla 3 34 Concurrent build + query safety. Query sees consistent snapshot.

Core Infrastructure

Spec Invariants States What it verifies
WalFragmentIntegrity.tla 3 ~100 Fragment checksums detect corruption. Write-once semantics.
CASManifestSafety.tla 3 ~200 ETag-based CAS prevents concurrent manifest corruption.
CacheEvictionSafety.tla 3 ~150 LRU eviction never evicts pinned centroids. Cache coherence.

Key Findings

1. Two-layer defense is necessary

The MultiWriterLease.tla spec proved that neither fencing alone nor CAS alone prevents zombie writes:

  • Fencing without CAS: A stale writer can pass the fencing check, then a new writer acquires the lease and writes, then the stale writer's write arrives — TOCTOU race.
  • CAS without fencing: A stale writer can read the current manifest, compute new state, and win the CAS race because its ETag is still valid.
  • Both together: The fencing check catches most stale writers early, and CAS catches the remaining TOCTOU gap.

2. Deferred deletion prevents phantom reads

The CompactionSafety.tla spec showed that deleting old segments before committing the new manifest can cause phantom reads — a concurrent query reads the old manifest, then tries to fetch a segment that was already deleted.

Solution: Write new segment → CAS manifest → Delete old segment (deferred deletion).

3. Bitmap filter algebra is correct

The BitmapFilterAlgebra.tla spec exhaustively verified that bitmap evaluation (And = intersection, Or = union, Not = complement) produces identical results to brute-force filter evaluation across all 324 possible states with mixed attribute types.

4. BM25 scores are storage-independent

The BM25ScoreCommutativity.tla spec verified that a document's BM25 score is the same whether the document is stored in the WAL or in a segment. This is crucial for correct merge behavior during compaction.

Running TLC

Prerequisites:

  • Java 21+ (OpenJDK)
  • TLA+ tools (tla2tools.jar)
cd formal-verifications/tla

# Run a specific spec
/opt/homebrew/opt/openjdk@21/bin/java \
  -jar ../tla2tools.jar \
  -config MultiWriterLease.cfg \
  MultiWriterLease.tla

# All specs use CHECK_DEADLOCK FALSE (terminal states are expected)

Each .tla file has a corresponding .cfg file that defines:

  • Constants and their values
  • Invariants to check
  • CHECK_DEADLOCK FALSE (since terminal states are expected in these models)

Clone this wiki locally