backend(eventIndexer): no alerting or circuit breaker when indexer falls more than N ledgers behind chain tip #665

@ogazboiz

Join our community: https://t.me/+DOylgFv1jyJlNzM0

Description

The event indexer in backend/src/services/eventIndexer.ts logs its progress but there is no mechanism to alert operators when it falls significantly behind the chain tip. If the indexer stalls (RPC timeout, DB pressure, process restart), it could be hundreds or thousands of ledgers behind with no visible warning.

During the gap period, all loan-related UI data is stale: repayments are not reflected, defaults are not triggered, and score updates are not applied.

Expected Behavior

Add a lag threshold check to the indexer health endpoint or expose a metric:

  • Calculate lagLedgers = chainTipLedger - lastIndexedLedger
  • If lagLedgers > ALERT_THRESHOLD (e.g., 500 ledgers), log a warning and return a degraded status from the health check endpoint
  • Expose this metric via the /health or /api/indexer/status endpoint so it can be monitored

Suggested Fix

Add to the existing indexer status endpoint:

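const ALERT_THRESHOLD = 500; // ledgers; example value, should be configurable
const lag = currentChainLedger - lastIndexedLedger;
const degraded = lag > ALERT_THRESHOLD;
if (degraded) logger.warn(`Indexer is ${lag} ledgers behind chain tip`);
return { lastIndexedLedger, currentChainLedger, lag, status: degraded ? 'degraded' : 'healthy' };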
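For anyone picking this up, the check could be pulled into a small pure function so it can be unit tested without the RPC client or the database. This is only a sketch: evaluateIndexerLag, IndexerLagStatus, and the default threshold are hypothetical names and values, not taken from eventIndexer.ts, and clamping negative lag to zero is an assumption about how a stale chain-tip reading should be treated.

```typescript
// Hypothetical helper; names and defaults are illustrative only.
type IndexerHealth = 'healthy' | 'degraded';

interface IndexerLagStatus {
  lastIndexedLedger: number;
  currentChainLedger: number;
  lag: number;
  status: IndexerHealth;
}

function evaluateIndexerLag(
  lastIndexedLedger: number,
  currentChainLedger: number,
  alertThreshold: number = 500,
): IndexerLagStatus {
  // Assumption: clamp at zero, since a stale RPC response could report
  // a chain tip that is behind the last indexed ledger.
  const lag = Math.max(0, currentChainLedger - lastIndexedLedger);
  return {
    lastIndexedLedger,
    currentChainLedger,
    lag,
    // Strictly greater than the threshold counts as degraded,
    // matching the `lag > 500` comparison in the suggested fix.
    status: lag > alertThreshold ? 'degraded' : 'healthy',
  };
}
```

The status endpoint would then call this with the two ledger numbers it already has and log a warning when status is 'degraded'.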

Impact

Medium. Without lag monitoring, indexer issues can go unnoticed for hours. A 1-hour gap at 5-second ledger time = 720 missed ledgers = potentially dozens of missed events.
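The arithmetic above can be written down as a quick sanity check when picking a threshold. A minimal sketch, assuming the 5-second ledger time quoted above; the function names are hypothetical.

```typescript
// Assumption: roughly 5 seconds per ledger, as quoted in this issue.
const LEDGER_CLOSE_SECONDS = 5;

// Number of ledgers the chain closes during a stall of the given length.
function ledgersInWindow(seconds: number, closeSeconds: number = LEDGER_CLOSE_SECONDS): number {
  return Math.floor(seconds / closeSeconds);
}

// Seconds a fully stalled indexer needs to accumulate the given lag.
function secondsToReachLag(lagLedgers: number, closeSeconds: number = LEDGER_CLOSE_SECONDS): number {
  return lagLedgers * closeSeconds;
}

const missedInOneHour = ledgersInWindow(3600);        // 720 ledgers
const secondsUntilThreshold = secondsToReachLag(500); // 2500 seconds
```

With the example threshold of 500 ledgers, a fully stalled indexer would only be reported as degraded after about 2,500 seconds (roughly 42 minutes), which may be worth considering when choosing the value.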
