feat: tiered data storage#7

Open
enigbe wants to merge 71 commits into main from 2025-10-tiered-data-storage


@enigbe enigbe commented Oct 21, 2025

What this PR does:

We introduce TierStore, a KVStore implementation that manages data across
three distinct storage layers.

The layers are:

  1. Primary: The main/remote data store.
  2. Ephemeral: A secondary store for non-critical, easily-rebuildable data
    (e.g., network graph). This tier aims to improve latency by leveraging a
    local KVStore designed for fast/local access.
  3. Backup: A tertiary store for disaster recovery. Backup operations are sent
    asynchronously/lazily to avoid blocking primary store operations.

We also allow Node to be configured with these stores: callers can set
exponential back-off parameters, set the backup and ephemeral stores, and
build the Node with TierStore's primary store. These configuration options
also extend to our foreign interface, allowing binding targets to build the
Node with their own ffi::KVStore implementations.

A sample Python implementation is added and tested.

Additionally, we add comprehensive testing for TierStore by introducing:

  1. Unit tests for TierStore core functionality.
  2. Integration tests for Node built with tiered storage.
  3. Python FFI tests for foreign ffi::KVStore implementations.

Concerns

It is worth considering how retry logic is handled, especially because of nested
retries. TierStore ships with basic retry logic by default, but some KVStore
implementations (e.g. VssStore) already have their own retries baked in, and thus
have no need for the wrapper store's logic.
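
To make the nested-retry concern concrete, here is a minimal sketch of an exponential back-off retry wrapper. `BackoffConfig`, `delay_for_retry`, and `retry_with_backoff` are hypothetical names chosen for illustration and are not taken from this PR; the sleep function is injected so the sketch can be exercised without waiting.

```rust
use std::time::Duration;

/// Hypothetical back-off parameters; the field names are illustrative only.
#[derive(Clone, Copy, Debug)]
pub struct BackoffConfig {
    pub initial_delay: Duration,
    pub max_delay: Duration,
    pub multiplier: u32,
    pub max_attempts: u32,
}

/// Delay to wait before retry number `retry` (0-based), capped at `max_delay`.
pub fn delay_for_retry(cfg: &BackoffConfig, retry: u32) -> Duration {
    let factor = cfg.multiplier.saturating_pow(retry);
    cfg.initial_delay.saturating_mul(factor).min(cfg.max_delay)
}

/// Calls `op` until it succeeds or `max_attempts` calls have failed.
/// `sleep` is a parameter so callers decide how (or whether) to wait.
pub fn retry_with_backoff<T, E>(
    cfg: &BackoffConfig,
    mut op: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut failed_attempts = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) => {
                failed_attempts += 1;
                if failed_attempts >= cfg.max_attempts {
                    return Err(err);
                }
                sleep(delay_for_retry(cfg, failed_attempts - 1));
            }
        }
    }
}
```

If a wrapper with `max_attempts = 3` wraps an inner store that also retries three times, a persistent failure results in nine calls to the underlying operation, which is why a store with built-in retries may want the wrapper's retries disabled.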

@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 9 times, most recently from 29f47f3 to 264aa7f Compare November 4, 2025 22:07
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 3 times, most recently from a30cbfb to 1e7bdbc Compare December 4, 2025 23:30
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 5 times, most recently from b5e980f to 67d47c2 Compare February 4, 2026 16:28
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 2 times, most recently from 95285b0 to 4b2d345 Compare February 18, 2026 11:23
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 2 times, most recently from cba29a3 to db1fe83 Compare February 24, 2026 23:03
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch from db1fe83 to 35f9ec2 Compare March 9, 2026 15:47
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 3 times, most recently from e89ada5 to 3abd0a5 Compare April 1, 2026 21:44
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch from 3abd0a5 to 4a75d67 Compare April 2, 2026 09:55
Resolve Node.js 20 deprecation warnings by updating all GitHub Actions
to their latest major versions supporting Node.js 24.

Co-Authored-By: HAL 9000
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch 3 times, most recently from 5647237 to 04855d7 Compare April 6, 2026 06:05
tnull and others added 18 commits April 28, 2026 13:18
The parallel `JoinSet`-based batching loop was duplicated across
`read_payments` and `read_pending_payments`. Extract it into a generic
`read_all_objects<T: Readable>` helper that callers invoke directly
with the relevant namespace constants. Per-type log messages are
preserved via `std::any::type_name::<T>()`.
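
As a rough illustration of the refactor described above, the sketch below shows a single generic helper replacing per-type loops while keeping a type-specific message through `std::any::type_name`. It is deliberately simplified: the `Decode` trait is a hypothetical stand-in for `Readable`, the entries are passed in directly rather than read from a store, and the parallel `JoinSet` batching is omitted.

```rust
use std::any::type_name;

/// Hypothetical stand-in for a deserialization trait such as `Readable`.
pub trait Decode: Sized {
    fn decode(bytes: &[u8]) -> Option<Self>;
}

#[derive(Debug, PartialEq)]
pub struct Payment(pub u64);

#[derive(Debug, PartialEq)]
pub struct PendingPayment(pub u64);

/// Decodes exactly eight big-endian bytes into a `u64`.
fn decode_u64(bytes: &[u8]) -> Option<u64> {
    if bytes.len() != 8 {
        return None;
    }
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    Some(u64::from_be_bytes(buf))
}

impl Decode for Payment {
    fn decode(bytes: &[u8]) -> Option<Self> {
        decode_u64(bytes).map(Payment)
    }
}

impl Decode for PendingPayment {
    fn decode(bytes: &[u8]) -> Option<Self> {
        decode_u64(bytes).map(PendingPayment)
    }
}

/// One generic helper for every object type. The error message names the
/// concrete type via `type_name`, so per-type diagnostics are preserved.
pub fn read_all_objects<T: Decode>(entries: &[(&str, Vec<u8>)]) -> Result<Vec<T>, String> {
    entries
        .iter()
        .map(|(key, bytes)| {
            T::decode(bytes).ok_or_else(|| {
                format!("Failed to decode {} under key {}", type_name::<T>(), key)
            })
        })
        .collect()
}
```

Callers pick the type at the call site, for example `read_all_objects::<Payment>(&entries)`, instead of maintaining one loop per type.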

Co-Authored-By: HAL 9000
Bump test dependencies, reorganize Docker files, and add interop
integration test infrastructure with shared scenario runner for CLN,
LND, and Eclair.
Update to use the new HeaderCache type instead of implementing the
Cache trait, pass BestBlock instead of BlockHash to
synchronize_listeners, and pass HeaderCache by value to SpvClient::new.

Also adapt to BestBlock gaining a previous_blocks field and
ChannelManager deserialization returning BestBlock instead of BlockHash.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
UniFFI cannot represent the fixed-size array that upstream's BestBlock
carries via `previous_blocks`, so NodeStatus.current_best_block was
unusable from Swift, Kotlin, and Python once upstream added that field.
Introduce a small ldk-node BestBlock with just hash and height — the
pieces bindings can handle — and expose it in place of the upstream
type on the public API.
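
The idea can be sketched as a small conversion that drops the field bindings cannot represent. `UpstreamBestBlock` below is a hypothetical stand-in whose layout (including the array length) is assumed purely for illustration; only the "hash and height" shape of the local type comes from the description above, and the field names are assumptions.

```rust
/// Hypothetical stand-in for the upstream type; the field layout is assumed.
pub struct UpstreamBestBlock {
    pub block_hash: [u8; 32],
    pub height: u32,
    /// A fixed-size array field, the kind of member bindings cannot represent.
    pub previous_blocks: [Option<[u8; 32]>; 4],
}

/// Binding-friendly type carrying only the hash and the height.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BestBlock {
    pub block_hash: [u8; 32],
    pub height: u32,
}

impl From<UpstreamBestBlock> for BestBlock {
    fn from(upstream: UpstreamBestBlock) -> Self {
        // `previous_blocks` is intentionally dropped here.
        BestBlock { block_hash: upstream.block_hash, height: upstream.height }
    }
}
```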

Generated with assistance from Claude Code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DRY up batched `KVStore` reads utility methods
Add interop integration test harness for LND, CLN, and Eclair
…-api

Adapt to `lightning-block-sync` API changes
…o-node-status

Add a network field to NodeStatus
The max_inbound_htlc_value_in_flight_percent_of_channel config setting
was used when acting as an LSPS2 service in order to forward the initial
payment. However, upstream divided the config setting into two for
announced and unannounced channels, the latter defaulting to 100%.
…-flight

Drop `max_inbound_htlc_value_in_flight_percent_of_channel`
Note that we still don't expect to receive multiple outgoing HTLCs
because trampoline has not yet been enabled, but we lay the groundwork
here.
Update rust-lightning to use default_value_vec
This matches the logging done when setting up a VSS store.
In the mutual close case, before moving to check the expected balances
of each node, we now assert that the mutual close transaction actually
made it into a block. If the mutual close transaction got rejected for
some reason, we now stop the test right there and fail instead of
continuing onto balance checks.
…ll-request/patch

Automated nightly rustfmt (2026-05-03)
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch from dec4aa7 to 421f0cb Compare May 4, 2026 09:09
tnull and others added 5 commits May 4, 2026 13:22
…le-mutual-close-coverage

Make sure the mutual close gets confirmed in `do_channel_full_cycle`
.. we were erroneously logging the `NODE_METRICS` namespaces.
This internal method allows us to avoid instantiating the logger twice
during the creation of a `Node`.
…persistence-namespaces

Fix logged namespaces in case of persistence failure
…testore-setup-err

Log the error returned from `SqliteStore::new` and `fs::create_dir_all`
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch from 421f0cb to c1563e3 Compare May 6, 2026 07:21
enigbe added 3 commits May 6, 2026 08:25
This commit:

Adds `TierStore`, a tiered `KVStore`/`KVStoreSync` implementation that
routes node persistence across three storage roles:

- a primary store for durable, authoritative data
- an optional backup store for a second durable copy of primary-backed data
- an optional ephemeral store for rebuildable cached data such as the
  network graph and scorer

TierStore routes ephemeral cache data to the ephemeral store when
configured, while durable data remains primary+backup. Reads and lists
do not consult the backup store during normal operation.
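
The routing rule above can be sketched as a pure function. This is an assumption-laden illustration, not the PR's code: `Route`, `is_cache_data`, and `route` are hypothetical names, and the key strings used to recognize the network graph and scorer are placeholders rather than confirmed constants.

```rust
/// Where an operation is sent in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    /// The ephemeral store only.
    Ephemeral,
    /// The primary store; writes and removals are also sent to the backup
    /// store when one is configured, while reads and lists skip the backup.
    Durable,
}

/// Placeholder keys for rebuildable cache data (network graph and scorer).
const EPHEMERAL_KEYS: [&str; 2] = ["network_graph", "scorer"];

/// Returns true for data this sketch treats as rebuildable cache data.
pub fn is_cache_data(primary_ns: &str, secondary_ns: &str, key: &str) -> bool {
    primary_ns.is_empty() && secondary_ns.is_empty() && EPHEMERAL_KEYS.contains(&key)
}

/// Cache data goes to the ephemeral store only when one is configured;
/// everything else stays on the durable path.
pub fn route(has_ephemeral_store: bool, primary_ns: &str, secondary_ns: &str, key: &str) -> Route {
    if has_ephemeral_store && is_cache_data(primary_ns, secondary_ns, key) {
        Route::Ephemeral
    } else {
        Route::Durable
    }
}
```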

For primary+backup writes and removals, this implementation treats the
backup store as part of the persistence success path rather than as a
best-effort background mirror. Earlier designs used asynchronous backup
queueing to avoid blocking the primary path, but that weakens the
durability contract by allowing primary success to be reported before
backup persistence has completed. TierStore now issues primary and backup
operations together and only returns success once both complete.

This gives callers a clearer persistence guarantee when a backup store is
configured: acknowledged primary+backup mutations have been attempted
against both durable stores. The tradeoff is that dual-store operations
are not atomic across stores, so an error may still be returned after one
store has already been updated.
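
A minimal sketch of this contract, under stated assumptions: `MiniStore`, `MemStore`, and `write_both` are hypothetical names, the stores are in-memory, and the two writes are issued sequentially here for simplicity rather than together as the commit describes. It shows that success requires both stores to succeed, and that a failure can still leave one store updated.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily reduced store interface for illustration.
pub trait MiniStore {
    fn write(&mut self, key: &str, value: &[u8]) -> Result<(), String>;
    fn read(&self, key: &str) -> Option<Vec<u8>>;
}

/// In-memory store that can be told to fail every write.
pub struct MemStore {
    data: HashMap<String, Vec<u8>>,
    fail_writes: bool,
}

impl MemStore {
    pub fn new(fail_writes: bool) -> Self {
        MemStore { data: HashMap::new(), fail_writes }
    }
}

impl MiniStore for MemStore {
    fn write(&mut self, key: &str, value: &[u8]) -> Result<(), String> {
        if self.fail_writes {
            return Err(format!("write failed for key {}", key));
        }
        self.data.insert(key.to_string(), value.to_vec());
        Ok(())
    }

    fn read(&self, key: &str) -> Option<Vec<u8>> {
        self.data.get(key).cloned()
    }
}

/// Attempts the write against both stores and reports success only if both
/// succeed. There is no rollback, so the operation is not atomic across stores.
pub fn write_both(
    primary: &mut impl MiniStore,
    backup: &mut impl MiniStore,
    key: &str,
    value: &[u8],
) -> Result<(), String> {
    let primary_res = primary.write(key, value);
    let backup_res = backup.write(key, value);
    primary_res.and(backup_res)
}
```

With a failing backup, `write_both` returns an error even though the primary already holds the new value, which is the non-atomic case called out above.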

TierStore also implements `KVStoreSync` in terms of dedicated synchronous
helpers that call the wrapped stores' sync interfaces directly. This
preserves the inner stores' synchronous semantics instead of routing sync
operations through a previously held async runtime.

Additionally, adds unit coverage for the current contract, including:
- basic read/write/remove/list persistence
- routing of ephemeral data away from the primary store
- backup participation in the foreground success path for writes and removals

Add native builder support for tiered storage by introducing
`TierStoreConfig` and builder methods for configuring backup and
ephemeral stores.

During node construction, wrap the configured primary store in
`TierStore` and attach any configured secondary tiers: ephemeral storage
for cache-like data and backup storage for mirrored durable writes.

Refactor backup storage to local SQLite

Replaces the builder's BYO backup-store configuration with a
path-based local SQLite backup mirror. The builder now constructs the
backup store internally using a dedicated backup database file name and
rejects configurations where the backup path conflicts with the primary
storage path.
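
The same-path rejection could look roughly like the sketch below. `resolve_backup_db_path` and the `BACKUP_DB_FILE_NAME` value are hypothetical and not the names used in this PR. The comparison is purely lexical on path components, so it does not resolve symlinks or relative paths; a real check may need to do more.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical file name for the backup database.
pub const BACKUP_DB_FILE_NAME: &str = "backup_store.sqlite";

/// Returns the backup database path, or an error if the backup directory is
/// the same as the primary storage directory.
pub fn resolve_backup_db_path(primary_dir: &Path, backup_dir: &Path) -> Result<PathBuf, String> {
    // `Path` equality compares components, so a trailing slash is ignored.
    if primary_dir == backup_dir {
        return Err(format!(
            "backup path {} conflicts with primary storage path",
            backup_dir.display()
        ));
    }
    Ok(backup_dir.join(BACKUP_DB_FILE_NAME))
}
```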

Also adds test coverage for full-cycle backup mirroring and same-path
rejection, as well as a `setup_node_with_builder` test helper to allow
builder customization in integration tests.
@enigbe enigbe force-pushed the 2025-10-tiered-data-storage branch from c1563e3 to a2458e4 Compare May 6, 2026 07:37
- Make setup_builder! use a mutable binding for Builder under uniffi to
  preserve test helper compatibility for the FFI-backed builder
- Add ArcedNodeBuilder forwarding methods set_backup_storage_dir_path
  and set_ephemeral_store

Co-authored-by: Copilot <copilot@github.com>