Principles from "Designing Data-Intensive Applications" and the System Design Primer, applied to daily engineering decisions — especially for frontend engineers who want to think beyond the UI layer.
- Identify the source of truth for every piece of data. If you can't point to it, the system has a design flaw.
- Optimistic updates are eventual consistency on the client — the UI shows state the server hasn't confirmed yet, so always implement rollback for mutations that can fail.
- Prefer idempotent operations. Trade submissions, payment requests, and state mutations should be safe to retry.
- When the frontend makes 3+ API calls to render a single view, advocate for a BFF (Backend for Frontend) endpoint that aggregates the data server-side.
- WebSocket feeds are change data capture (CDC) — the backend streams state deltas. Understand whether you're receiving snapshots or incremental updates.
- Back pressure matters: if the server pushes events faster than the UI can render, throttle/debounce updates to the next animation frame.
- Stale data is a feature, not a bug, in eventually consistent systems. Know the staleness budget for each data type (order book: milliseconds; account balance: seconds; historical trades: minutes).
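An optimistic update with rollback can be sketched without any framework. Here `api.cancelOrder` and the in-memory `Map` store are illustrative stand-ins; the pattern is the point: snapshot the previous state, apply the optimistic state, and restore the snapshot if the mutation fails.

```typescript
type Order = { id: string; status: "open" | "cancelled" };

async function cancelOrderOptimistically(
  store: Map<string, Order>,
  api: { cancelOrder: (id: string) => Promise<void> },
  id: string,
): Promise<boolean> {
  const previous = store.get(id);
  if (!previous) return false;

  // Apply the optimistic state immediately so the UI updates.
  store.set(id, { ...previous, status: "cancelled" });
  try {
    await api.cancelOrder(id);
    return true;
  } catch {
    // The mutation failed: roll back to the snapshot we captured.
    store.set(id, previous);
    return false;
  }
}
```

Libraries like React Query offer the same snapshot/rollback hooks (`onMutate`/`onError`), but the underlying mechanics are exactly this.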
- Every cache entry needs an explicit invalidation strategy. "It'll just refetch" is not a strategy.
- Stale-while-revalidate: serve stale data immediately, fetch fresh in background — React Query's default. Good for read-heavy, staleness-tolerant data.
- Cache-aside (lazy loading): check cache → miss → fetch from source → populate cache. Most common pattern.
- Write-through: update cache AND source simultaneously. Use when consistency matters more than write latency.
- When debugging stale data, trace through ALL cache layers: browser, CDN, API gateway, application cache, database query cache.
- REST for CRUD resources, RPC-style for complex actions (executeTrade, settleMarket), WebSocket for real-time streams.
- Batch endpoints eliminate N+1 request patterns — fetch related resources in one call instead of N sequential calls.
- Pagination is not optional for list endpoints. Cursor-based > offset-based for large, changing datasets.
- API versioning strategy should be decided upfront, not bolted on.
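A sketch of the client side of cursor-based pagination. The `nextCursor` field is an assumed API shape; the point is that an opaque cursor, not a numeric offset, drives the next request, so rows inserted mid-pagination don't shift the window.

```typescript
type Page<T> = { items: T[]; nextCursor: string | null };

async function fetchAll<T>(
  fetchPage: (cursor: string | null) => Promise<Page<T>>,
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  do {
    // Each page returns the cursor for the next one; null means done.
    const page = await fetchPage(cursor);
    all.push(...page.items);
    cursor = page.nextCursor;
  } while (cursor !== null);
  return all;
}
```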
- CAP theorem: in a network partition, you choose consistency (reject requests) or availability (serve potentially stale data). Know which your system chooses.
- Read-after-write consistency: after a user submits a trade, they should immediately see it in their order history. If the API is eventually consistent, the frontend must fake this with local state.
- Monotonic reads: a user should never see data go "backwards." If they saw their balance as $100, the next read shouldn't show $95 from a stale replica.
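One way the frontend can fake read-after-write consistency over an eventually consistent list endpoint: keep locally submitted rows in a pending set and merge them into each server response until the server echoes them back. The names here are illustrative.

```typescript
type OrderRow = { id: string };

function mergePending(
  serverRows: OrderRow[],
  pending: Map<string, OrderRow>,
): OrderRow[] {
  // Drop pending entries the server now knows about...
  for (const row of serverRows) pending.delete(row.id);
  // ...and prepend the ones it hasn't caught up to yet.
  return [...pending.values(), ...serverRows];
}
```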
- Ask: "What happens when this service goes down?" (failure modes)
- Ask: "What's the consistency model of this endpoint?" (strong vs eventual)
- Ask: "Where is this data partitioned and how?" (scaling strategy)
- Ask: "What's the write path vs the read path?" (CQRS, read replicas)
- Ask: "Is this operation idempotent?" (retry safety)
- Trace the full request lifecycle: frontend → load balancer → API gateway → service → database → response. Know every hop.
- Normalization reduces redundancy but requires joins (slower reads). Denormalization duplicates data but enables fast reads. Trading systems often denormalize for read performance.
- Indexes make reads fast but slow down writes. If a read-heavy endpoint is slow, suggest adding an index.
- Transactions ensure atomicity — multiple related writes either all succeed or all fail. When your frontend needs multiple related API calls, ask if there's a single transactional endpoint.
- Vertical scaling (bigger machine) has a ceiling. Horizontal scaling (more machines) requires stateless services.
- If your API behaves inconsistently between requests, check if a load balancer is routing to different servers with different state.
- Read-heavy systems scale with read replicas. Write-heavy systems scale with partitioning/sharding. Know which your system is.
Infrastructure code operates below the feature layer — every request, every user flows through it. Think adversarially:
- What if this runs twice? Interceptors, retries, event handlers can fire multiple times. A retry that replays a trade submission creates a duplicate order.
- What if two of these race? Two refresh calls, two reconnects, two state resets. Use deduplication (singleton promises, coordinators, cooldowns).
- What if the first call succeeded but the response was lost? A 401 after a successful mutation means the mutation happened — retrying creates duplicates. Only auto-retry idempotent operations (GET).
- What HTTP methods flow through this? An interceptor that treats GETs and POSTs the same is a bug waiting to happen.
- One coordinator per operation — if multiple triggers (timer, visibility, error) all do the same thing, route through a single module with shared deduplication. Never have two independent code paths (React Query mutation + raw fetch) hitting the same endpoint.
- Simple scheduling over clever scheduling — a fixed interval beats computed-expiry-with-timeout-fallback chains. Fewer moving parts = fewer edge cases.
- Everything fails. Network requests, servers, databases, third-party APIs. Design flows that degrade gracefully.
- Circuit breaker pattern: after N failures, stop trying for a cool-down period. Don't hammer a failing service.
- Retry with exponential backoff + jitter. Never retry immediately, never retry forever.
- Timeouts are mandatory for every external call. No timeout = potential infinite hang.
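The retry, jitter, and timeout rules combine into one helper. This is a sketch with illustrative constants (cap, base delay, attempt count); the full-jitter strategy draws a random delay from `[0, min(cap, base * 2^attempt)]`, and the per-attempt `AbortSignal` enforces the mandatory timeout. Only wrap idempotent operations in this.

```typescript
async function retryWithBackoff<T>(
  op: (signal: AbortSignal) => Promise<T>,
  { attempts = 4, baseMs = 200, capMs = 5_000, timeoutMs = 10_000 } = {},
): Promise<T> {
  for (let i = 0; i < attempts; i++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs); // mandatory timeout
    try {
      return await op(controller.signal);
    } catch (err) {
      if (i === attempts - 1) throw err; // never retry forever
      // Full jitter: random delay in [0, min(cap, base * 2^i)].
      const delay = Math.random() * Math.min(capMs, baseMs * 2 ** i);
      await new Promise((resolve) => setTimeout(resolve, delay));
    } finally {
      clearTimeout(timer);
    }
  }
  throw new Error("unreachable");
}
```

A circuit breaker would sit one layer above this: count consecutive `retryWithBackoff` failures and short-circuit callers during the cool-down instead of attempting at all.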
- L1 cache reference: 0.5 ns
- Main memory reference: 100 ns
- SSD random read: 150 µs
- Round trip within same datacenter: 500 µs
- Disk seek: 10 ms
- Read 1 MB sequentially from network: 10 ms
- Round trip CA to Netherlands: 150 ms
- A single server can handle ~10k-100k concurrent WebSocket connections