Testnet: Relay Load & Resilience Testing #120

@0xdevcollins

Summary

Stress-test the relay service under concurrent payments and simulate failure scenarios to ensure it recovers correctly.

Why This Matters

The relay is the most critical piece of infrastructure: it watches chain events and completes atomic swaps. If it crashes, misses events, or fails to reconnect, payments get stuck. This must be validated before any real money flows.

Test Scenarios

1. Concurrent payments (load)

  • Submit 10 Stellar-native payments simultaneously
  • Verify all 10 complete without race conditions on secret revelation
  • Check for duplicate withdraw() calls (should be idempotent)
  • Verify BullMQ job deduplication works correctly
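
A minimal sketch of the deduplication check, assuming the relay derives a deterministic job ID per payment. BullMQ ignores a job added with an ID that already exists in the queue, so the in-memory `DedupQueue` below is only a stand-in for that behavior. The names `DedupQueue`, `withdrawJobId`, and `simulateDuplicateEvents` are hypothetical and not taken from the relay codebase.

```typescript
// Hypothetical in-memory stand-in for a queue that drops jobs whose ID already exists.
type WithdrawJob = { paymentId: string };

class DedupQueue {
  private readonly jobs = new Map<string, WithdrawJob>();

  // Returns true when the job is enqueued, false when the ID is already present.
  add(jobId: string, data: WithdrawJob): boolean {
    if (this.jobs.has(jobId)) return false;
    this.jobs.set(jobId, data);
    return true;
  }

  get size(): number {
    return this.jobs.size;
  }
}

// Deterministic ID (assumed scheme): the same payment always maps to the same withdraw job.
function withdrawJobId(paymentId: string): string {
  return `withdraw-${paymentId}`;
}

// Deliver each payment's event several times, as a reconnecting watcher might.
function simulateDuplicateEvents(paymentIds: string[], deliveriesPerPayment: number) {
  const queue = new DedupQueue();
  let accepted = 0;
  let ignored = 0;
  for (const paymentId of paymentIds) {
    for (let i = 0; i < deliveriesPerPayment; i++) {
      if (queue.add(withdrawJobId(paymentId), { paymentId })) accepted++;
      else ignored++;
    }
  }
  return { accepted, ignored, queued: queue.size };
}

const loadTestPaymentIds = Array.from({ length: 10 }, (_, i) => `pay-${i + 1}`);
const dedupResult = simulateDuplicateEvents(loadTestPaymentIds, 3);
console.log(dedupResult);
```

In the real load test the same assertion would apply to the queue and the contract: 10 payments should produce exactly 10 withdraw jobs, no matter how many times each Locked event is observed.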

2. Relay crash mid-payment

  • Start a payment, then kill the relay process after the Locked event is detected
  • Restart relay
  • Verify relay picks up in-progress payments from DB and completes them
  • Verify no double-spend or double-refund occurs
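
The recovery step can be sketched as a startup query that selects only unfinished payments. This assumes a status column on each payment row. Apart from REFUNDED, the status names are hypothetical placeholders, as are `PaymentRecord` and `paymentsToResume`.

```typescript
// Status names other than REFUNDED are hypothetical placeholders.
type PaymentStatus = "PENDING" | "LOCKED" | "WITHDRAWING" | "COMPLETED" | "REFUNDED";

interface PaymentRecord {
  id: string;
  status: PaymentStatus;
}

// Statuses that mean the swap was started but not finished (assumed set).
const IN_PROGRESS_STATUSES: ReadonlySet<PaymentStatus> = new Set<PaymentStatus>([
  "LOCKED",
  "WITHDRAWING",
]);

// On startup, pick the payments the relay must resume; terminal ones are left alone.
function paymentsToResume(records: PaymentRecord[]): PaymentRecord[] {
  return records.filter((r) => IN_PROGRESS_STATUSES.has(r.status));
}

const dbSnapshot: PaymentRecord[] = [
  { id: "p1", status: "COMPLETED" },
  { id: "p2", status: "LOCKED" },
  { id: "p3", status: "WITHDRAWING" },
  { id: "p4", status: "REFUNDED" },
];
const resumedIds = paymentsToResume(dbSnapshot).map((r) => r.id);
console.log(resumedIds);
```

Excluding terminal statuses from the resume set is what keeps a restart from triggering a second withdraw or refund on an already settled payment.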

3. Stellar node disconnect

  • Simulate Stellar RPC going offline mid-payment
  • Verify relay uses exponential backoff reconnect
  • Verify no events missed after reconnect (replay from last known ledger)
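
A rough sketch of the two behaviors under test here: a capped exponential reconnect delay, and replay of every ledger after the last one processed. The base delay, cap, and function names are assumptions for illustration; the relay's actual values are not stated in this issue.

```typescript
// Delay before reconnect attempt `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
// The defaults (1s base, 60s cap) are assumed values.
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Ledgers to replay after reconnecting: everything after the last processed
// ledger, up to and including the latest one reported by the node.
function ledgersToReplay(lastProcessed: number, latest: number): number[] {
  const sequences: number[] = [];
  for (let seq = lastProcessed + 1; seq <= latest; seq++) {
    sequences.push(seq);
  }
  return sequences;
}

// Delays for the first five attempts with the assumed defaults.
const reconnectDelays = [0, 1, 2, 3, 4].map((attempt) => backoffDelayMs(attempt));
console.log(reconnectDelays);
console.log(ledgersToReplay(100, 103));
```

A test for this scenario can record the timestamps of reconnect attempts and compare the gaps against the expected sequence, then check that the set of processed ledgers has no holes after the node comes back.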

4. Quote expiry race

  • Let a quote expire (30s) while payer is mid-checkout
  • Verify expired quote shows clear "Quote expired, get new rate" prompt
  • Verify no HTLC lock created for expired quote
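
One way to express the guard is a check that runs immediately before lock creation. This is a sketch that assumes the quote stores its creation time and counts as expired at exactly 30s. `RateQuote`, `isQuoteExpired`, and `checkBeforeLock` are hypothetical names.

```typescript
// 30s quote lifetime, taken from the scenario above.
const QUOTE_TTL_MS = 30000;

interface RateQuote {
  id: string;
  createdAtMs: number;
}

// Assumption: a quote counts as expired once its age reaches the TTL exactly.
function isQuoteExpired(quote: RateQuote, nowMs: number): boolean {
  return nowMs - quote.createdAtMs >= QUOTE_TTL_MS;
}

type LockDecision = { ok: true } | { ok: false; message: string };

// Guard to run right before creating the HTLC lock.
function checkBeforeLock(quote: RateQuote, nowMs: number): LockDecision {
  if (isQuoteExpired(quote, nowMs)) {
    return { ok: false, message: "Quote expired, get new rate" };
  }
  return { ok: true };
}

const sampleQuote: RateQuote = { id: "q1", createdAtMs: 0 };
console.log(checkBeforeLock(sampleQuote, 29999));
console.log(checkBeforeLock(sampleQuote, 30000));
```

The race in this scenario is between the checkout UI and the lock request, so the check is worth enforcing on the server side as well as in the client prompt.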

5. Watchdog: expired lock refund

  • Create a Stellar HTLC lock manually
  • Wait for 60s watchdog cycle
  • After timelock expires, verify watchdog calls refund() and payment → REFUNDED
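
The watchdog pass can be sketched as a filter over open locks, plus a small calculation of when the first tick after expiry fires. This assumes refund becomes allowed once the current time reaches the timelock, which depends on the contract. `HtlcLock`, `locksToRefund`, and `firstTickAtOrAfterExpiry` are hypothetical names.

```typescript
interface HtlcLock {
  id: string;
  timelockSec: number; // Unix time from which refund is allowed (assumed semantics)
  settled: boolean; // true once the lock has been withdrawn or refunded
}

// Locks the watchdog should refund on this tick: expired and not yet settled.
function locksToRefund(locks: HtlcLock[], nowSec: number): HtlcLock[] {
  return locks.filter((lock) => !lock.settled && nowSec >= lock.timelockSec);
}

// Time of the first watchdog tick at or after the lock's expiry.
function firstTickAtOrAfterExpiry(expirySec: number, firstTickSec: number, intervalSec: number): number {
  if (expirySec <= firstTickSec) return firstTickSec;
  const ticksNeeded = Math.ceil((expirySec - firstTickSec) / intervalSec);
  return firstTickSec + ticksNeeded * intervalSec;
}

const openLocks: HtlcLock[] = [
  { id: "a", timelockSec: 100, settled: false },
  { id: "b", timelockSec: 100, settled: true },
  { id: "c", timelockSec: 500, settled: false },
];
const refundIds = locksToRefund(openLocks, 120).map((lock) => lock.id);
console.log(refundIds);

// A lock expiring 1s after a tick waits 59s for the next one on a 60s cycle.
const detectionDelaySec = firstTickAtOrAfterExpiry(61, 0, 60) - 61;
console.log(detectionDelaySec);
```

With a 60s cycle the detection delay stays below one interval, which leaves the remainder of the 2-minute pass criterion for the refund transaction itself.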

6. Webhook retry on failure

  • Configure webhook URL to return 500
  • Verify exponential backoff retry (attempt 1 immediately, 2 after 5min, 3 after 30min)
  • After 3 failures, status → EXHAUSTED
  • Manual retry via POST /v1/webhooks/logs/:id/retry succeeds
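
The retry schedule above can be modeled as a fixed delay table. This sketch assumes each delay is measured from the previous attempt, which the issue does not spell out. The `RETRY` state and the name `nextWebhookStep` are hypothetical; `EXHAUSTED` comes from the scenario.

```typescript
// Delay before attempts 1, 2, and 3 (immediately, 5 min, 30 min).
// Assumption: each delay is measured from the previous attempt.
const RETRY_DELAYS_MS: readonly number[] = [0, 5 * 60 * 1000, 30 * 60 * 1000];

type WebhookNext =
  | { state: "RETRY"; attempt: number; delayMs: number }
  | { state: "EXHAUSTED" };

// Given how many attempts have already failed, decide what happens next.
function nextWebhookStep(failedAttempts: number): WebhookNext {
  const delayMs = RETRY_DELAYS_MS[failedAttempts];
  if (delayMs === undefined) {
    return { state: "EXHAUSTED" };
  }
  return { state: "RETRY", attempt: failedAttempts + 1, delayMs };
}

// Walk the schedule from zero failures up to exhaustion.
const webhookSchedule = [0, 1, 2, 3].map((failed) => nextWebhookStep(failed));
console.log(webhookSchedule);
```

A test can point the webhook at an endpoint that always returns 500, then compare the recorded attempt times and the final status against this table.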

Pass Criteria

  • All 10 concurrent payments complete correctly
  • Relay survives crash + restart with no payment loss
  • Exponential backoff reconnect works on node disconnect
  • Expired quotes never result in a stuck HTLC lock
  • Watchdog refunds all expired locks within 2 minutes
  • Webhook retry exhaustion works as documented

Labels

  • backend (Backend API work)
  • testing (Tests and validation)
