Problem
Sync failures are currently invisible until a user notices missing data: no metrics for sync lag, catch-up rounds, per-peer ack-age, relay bytes, or divergence repairs. Operators and the consuming app cannot tell whether sync is healthy or quietly broken.
Proposed approach (Phase 3; pairs with the relay-cost telemetry of #TELEMETRY)
Add counters/gauges and expose them via NetworkEvent / diagnostics:
- replication/sync lag and time-since-last-converged (per peer)
- catch-up + RBSR round counts, divergence-repair count
- per-peer last-acked cursor age
- relay-byte trend (shared with the relay-cost telemetry issue)
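A minimal sketch of how the counters and per-peer gauges listed above could be held in one place. Every name here (`SyncMetrics`, `SyncCounters`, `PeerSnapshot`, the `record_*` methods, and the `String` peer id) is hypothetical and not taken from the wavesyncdb codebase; the engine's real peer type and the way values reach `NetworkEvent` are assumptions left open.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical peer identifier; the real engine presumably has its own type.
type PeerId = String;

/// Monotonic counters (names are illustrative, not from the codebase).
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct SyncCounters {
    pub catch_up_rounds: u64,
    pub rbsr_rounds: u64,
    pub divergence_repairs: u64,
    pub relay_bytes: u64,
}

/// Raw timestamps per peer; ages are derived when a snapshot is taken.
#[derive(Debug, Default, Clone, Copy)]
struct PeerTimes {
    last_converged: Option<Instant>,
    last_ack: Option<Instant>,
}

/// Point-in-time view of one peer, suitable for a diagnostics dump.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PeerSnapshot {
    pub since_converged: Option<Duration>,
    pub ack_age: Option<Duration>,
}

#[derive(Debug, Default)]
pub struct SyncMetrics {
    counters: SyncCounters,
    peers: HashMap<PeerId, PeerTimes>,
}

impl SyncMetrics {
    pub fn record_catch_up_round(&mut self) {
        self.counters.catch_up_rounds += 1;
    }

    pub fn record_rbsr_round(&mut self) {
        self.counters.rbsr_rounds += 1;
    }

    pub fn record_divergence_repair(&mut self) {
        self.counters.divergence_repairs += 1;
    }

    pub fn record_relay_bytes(&mut self, bytes: u64) {
        // Saturate rather than wrap so a long-running node never reports a drop.
        self.counters.relay_bytes = self.counters.relay_bytes.saturating_add(bytes);
    }

    /// Call when a peer acknowledges a cursor.
    pub fn record_ack(&mut self, peer: &str, now: Instant) {
        self.peers.entry(peer.to_string()).or_default().last_ack = Some(now);
    }

    /// Call when local and remote state are verified equal.
    pub fn record_converged(&mut self, peer: &str, now: Instant) {
        self.peers.entry(peer.to_string()).or_default().last_converged = Some(now);
    }

    pub fn counters(&self) -> &SyncCounters {
        &self.counters
    }

    /// Returns `None` for a peer that has never been seen.
    pub fn peer_snapshot(&self, peer: &str, now: Instant) -> Option<PeerSnapshot> {
        self.peers.get(peer).map(|t| PeerSnapshot {
            since_converged: t.last_converged.map(|at| now.saturating_duration_since(at)),
            ack_age: t.last_ack.map(|at| now.saturating_duration_since(at)),
        })
    }
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside, keeps the age calculations deterministic in tests. Storing timestamps and deriving ages at snapshot time also means a stalled peer's ack age keeps growing without any further events, which is the case this issue wants to surface.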
Files
wavesyncdb/src/engine/mod.rs, wavesyncdb/src/network_status.rs (NetworkEvent), wavesyncdb/src/diagnostics.rs (or a new metrics.rs).
Ref: docs/research/sync-reliability.md §6 P2 / §5 (convergence verification & observability).