feat: config validation, enum CHECK constraints, API rate limiting (#163)
user1303836 merged 2 commits into `main`
Conversation
Add config validation with ConfigError type and validate() method checking database_url, port, and pool_size constraints (#152). Add migration with CHECK constraints on enum-like TEXT columns: ingestion_runs.status/mode, dataset_versions.status, wallet_ledger.entry_type, networks.finality_model, and hl_fills/hl_fill_records.side (#146). Add in-memory per-key token bucket rate limiter to API endpoints (60 req capacity, 10 tokens/sec refill) with rate_limit_middleware applied after auth on all /v1/* routes (#148).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: aff67e64a446
user1303836
left a comment
Review: Config validation, enum CHECK constraints, API rate limiting
I reviewed the full diff and cross-referenced each change against the existing codebase. There are several issues that should be addressed before merging.
1. Config Validation (#152) — validate() is never called
Severity: High — the fix is incomplete.
AppConfig::validate() is defined and well-tested, but it is never called in any production code path. Neither api/src/main.rs nor cli/src/main.rs invokes config.validate() after AppConfig::load(). The validation is dead code.
In api/src/main.rs line 294:

```rust
let config = AppConfig::load()?;
// validate() is never called here — invalid configs proceed to connect to the DB
```

The fix should call `config.validate()?` immediately after `AppConfig::load()` in both the API `main()` and the CLI `main()`. Since `AppConfig::load()` returns `Result<Self, Box<figment::Error>>` and `validate()` returns `Result<(), ConfigError>`, you will also need to either:
- Change `load()` to call `validate()` internally and return `ConfigError`, or
- Call `validate()` at each call site with an appropriate error conversion.
The ConfigError type and From<Box<figment::Error>> conversion are already prepared for this, suggesting the intent was to integrate them — but the integration was not completed.
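To make the intended call-site integration concrete, here is a minimal, self-contained sketch. The struct fields, error variants, and bounds are simplified stand-ins for the real `AppConfig`/`ConfigError` (the figment-backed `load()` is omitted), so treat the names as illustrative rather than the crate's actual definitions:

```rust
use std::fmt;

// Simplified stand-in for the real ConfigError (variants are illustrative).
#[derive(Debug, PartialEq)]
enum ConfigError {
    EmptyDatabaseUrl,
    ZeroPort,
    PoolSizeOutOfRange(u32),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::EmptyDatabaseUrl => write!(f, "database_url must not be empty"),
            ConfigError::ZeroPort => write!(f, "port must be non-zero"),
            ConfigError::PoolSizeOutOfRange(n) => write!(f, "pool_size {n} outside 1-1000"),
        }
    }
}

// Simplified stand-in for the real AppConfig.
struct AppConfig {
    database_url: String,
    port: u16,
    pool_size: u32,
}

impl AppConfig {
    fn validate(&self) -> Result<(), ConfigError> {
        if self.database_url.is_empty() {
            return Err(ConfigError::EmptyDatabaseUrl);
        }
        if self.port == 0 {
            return Err(ConfigError::ZeroPort);
        }
        if !(1..=1000).contains(&self.pool_size) {
            return Err(ConfigError::PoolSizeOutOfRange(self.pool_size));
        }
        Ok(())
    }
}

fn main() {
    // Startup path: validate immediately after load so a bad config fails
    // fast, before any attempt to connect to the database.
    let config = AppConfig {
        database_url: "postgres://localhost/db".into(),
        port: 8080,
        pool_size: 10,
    };
    config.validate().expect("invalid configuration");

    // An invalid config is rejected instead of proceeding.
    let bad = AppConfig { database_url: String::new(), port: 8080, pool_size: 10 };
    assert_eq!(bad.validate(), Err(ConfigError::EmptyDatabaseUrl));
    println!("config validated at startup");
}
```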
2. Enum CHECK Constraints (#146) — Missing coverage and not idempotent
2a. wallet_ledger.entry_type includes 'funding', which is not in EntryType enum (Minor, acceptable)
The EntryType enum in core/src/models.rs has 5 variants: Trade, Fee, Transfer, Staking, Income. The CHECK constraint includes a 6th value 'funding'. I verified that "funding" is used in adapters/src/ledger_derivation.rs (line 722) for Hyperliquid funding payments written to wallet_ledger. This is correct — the V1 EntryType model is out of date relative to the Gold-tier wallet_ledger table, and the CHECK constraint correctly covers the actual domain. No issue here.
2b. Missing CHECK constraint on hl_position_changes.side (Medium)
The migration adds CHECK constraints on hl_fills.side and hl_fill_records.side but omits hl_position_changes.side, which is also a TEXT NOT NULL column with the same "B"/"A" domain (see migrations/20260310000000_add_hl_silver_datasets.sql line 94 and the dedup index on line 116). This column should have the same constraint.
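Assuming the same `'B'`/`'A'` domain as the other `side` columns, the missing constraint would look roughly like this (the constraint name is illustrative):

```sql
ALTER TABLE hl_position_changes
    ADD CONSTRAINT hl_position_changes_side_check
    CHECK (side IN ('B', 'A'));
```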
2c. The migration is not idempotent (Medium)
The ALTER TABLE ... ADD CONSTRAINT statements will fail if the migration is run against a database where the constraints already exist. PostgreSQL does not support IF NOT EXISTS for ADD CONSTRAINT natively, but the standard pattern for defensive migrations is to wrap each statement in a DO $$ ... $$ block that checks pg_constraint first, or to use DROP CONSTRAINT IF EXISTS before ADD CONSTRAINT. This matters for deployments where migrations might be partially applied or replayed.
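One way to make reruns safe, assuming PostgreSQL, is the `DO`/`EXCEPTION` pattern sketched below; the table and constraint names are illustrative. PostgreSQL raises `duplicate_object` (SQLSTATE 42710) when a constraint with the same name already exists, so catching it turns a replay into a no-op:

```sql
DO $$
BEGIN
    ALTER TABLE hl_fills
        ADD CONSTRAINT hl_fills_side_check CHECK (side IN ('B', 'A'));
EXCEPTION
    WHEN duplicate_object THEN
        NULL; -- constraint already exists; rerun is a no-op
END $$;
```

The alternative `DROP CONSTRAINT IF EXISTS` followed by `ADD CONSTRAINT` also works, but briefly leaves the table unconstrained between the two statements unless wrapped in a transaction.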
2d. Existing data risk not addressed (Minor)
The migration has no comment or safeguard acknowledging that existing data could violate these new CHECK constraints. If a database has rows with values outside the allowed set (e.g., a misspelled status or unexpected entry_type), the migration will fail. A DO $$ block that first scans for violations and logs them, or at minimum a comment noting the assumption, would be prudent.
3. Rate Limiting (#148) — Memory leak risk and no dedicated tests
3a. Unbounded HashMap growth (High — potential memory leak)
The RateLimiter.buckets HashMap grows without bound. Every unique Bearer token that hits the API creates a permanent entry. In production, this could leak memory over time if tokens rotate, or an attacker could exhaust memory by sending requests with many distinct Authorization headers.
The fix should include a periodic cleanup of stale entries (e.g., a background task that prunes buckets that have been idle beyond a TTL), or at minimum a size cap on the HashMap. A common pattern is a background tokio::spawn that runs every N minutes and removes buckets where Instant::now() - last_refill > TTL.
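A minimal, std-only sketch of that shape, assuming the token-bucket design described above (the real `RateLimiter` sits behind the Axum middleware and its field names may differ): each bucket carries a `last_used` stamp, and a periodic task calls `evict_idle` to prune stale entries.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// One bucket per key; last_used enables TTL-based eviction.
struct Bucket {
    tokens: f64,
    last_refill: Instant,
    last_used: Instant,
}

struct RateLimiter {
    buckets: HashMap<String, Bucket>,
    capacity: f64,
    refill_rate: f64, // tokens per second
}

impl RateLimiter {
    fn new(capacity: f64, refill_rate: f64) -> Self {
        Self { buckets: HashMap::new(), capacity, refill_rate }
    }

    fn try_acquire(&mut self, key: &str) -> bool {
        let now = Instant::now();
        let (capacity, refill_rate) = (self.capacity, self.refill_rate);
        let bucket = self.buckets.entry(key.to_string()).or_insert(Bucket {
            tokens: capacity,
            last_refill: now,
            last_used: now,
        });
        // Lazy refill based on elapsed time, capped at capacity.
        let elapsed = now.duration_since(bucket.last_refill).as_secs_f64();
        bucket.tokens = (bucket.tokens + elapsed * refill_rate).min(capacity);
        bucket.last_refill = now;
        bucket.last_used = now;
        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }

    // Prune buckets idle beyond `ttl`; call from a periodic background task.
    fn evict_idle(&mut self, ttl: Duration) {
        let now = Instant::now();
        self.buckets.retain(|_, b| now.duration_since(b.last_used) <= ttl);
    }
}

fn main() {
    // Tiny capacity/refill so the behavior is observable without waiting.
    let mut rl = RateLimiter::new(2.0, 0.5);
    assert!(rl.try_acquire("token-a"));
    assert!(rl.try_acquire("token-a"));
    assert!(!rl.try_acquire("token-a")); // burst of 2 exhausted
    assert!(rl.try_acquire("token-b"));  // independent key, own bucket
    assert_eq!(rl.buckets.len(), 2);

    std::thread::sleep(Duration::from_millis(10));
    rl.evict_idle(Duration::from_millis(1)); // idle longer than TTL: pruned
    assert!(rl.buckets.is_empty());
    println!("eviction ok");
}
```

In production the sweep would run from a `tokio::spawn`ed loop holding a clone of the shared limiter handle, with a TTL well above the largest expected request gap for a live token.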
3b. Middleware ordering is correct (Confirmed)
In Axum, `.layer()` calls wrap in reverse order: the last `.layer()` added runs first on the inbound path. The rate limiter is added before `require_auth`:

```rust
.layer(rate_limit_middleware) // added first; runs second (after auth)
.layer(require_auth)          // added last; runs first
```

This means auth runs first, then rate limiting. The docstring on `rate_limit_middleware` correctly states "Runs after auth so the key is already validated." The ordering is correct.
3c. "anonymous" fallback key is technically unreachable but still a code smell (Low)
Because require_auth runs first and rejects requests without a valid Bearer token, the unwrap_or("anonymous") fallback in rate_limit_middleware should never be reached in practice. However, having an "anonymous" bucket key that multiple unauthenticated requests could share (if auth were ever skipped for certain routes) is a latent risk. Consider returning an error or using a unique identifier instead of a shared fallback key.
3d. No dedicated rate limiting tests (Medium)
There are no tests that verify rate limiting behavior — no test exhausts the bucket and checks for a 429 response, no test verifies that tokens refill over time, and no test confirms that different keys have independent buckets. The RateLimiter is included in test setup (via AppState), but its behavior is never exercised.
At minimum, add:
- A unit test for `RateLimiter::try_acquire` that exhausts the bucket and verifies rejection
- A unit test that verifies refill over time (using `tokio::time::pause`/`advance`)
- An integration test that sends 61+ requests and expects a 429
3e. Defaults are reasonable (Confirmed)
60 requests capacity with 10 tokens/sec refill means a burst of 60 requests is allowed, and sustained throughput is 10 req/s. This is a reasonable default for an API that runs ingestion and normalization jobs.
3f. Missing Retry-After header (Low)
HTTP 429 responses conventionally include a Retry-After header. While not strictly required, it would improve client experience. The token bucket could compute the expected wait time as (1.0 - bucket.tokens) / refill_rate seconds.
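That computation is small enough to sketch directly. The `tokens` and `refill_rate` parameters below mirror the bucket fields described above, but the function name and rounding choice (ceiling, so clients never retry early) are illustrative:

```rust
// Derive a Retry-After value (whole seconds) from a token-bucket deficit.
// `tokens` is the current bucket level; `refill_rate` is tokens per second.
fn retry_after_secs(tokens: f64, refill_rate: f64) -> u64 {
    // One token is needed to pass; wait time is the deficit over the refill rate.
    let deficit = (1.0 - tokens).max(0.0);
    (deficit / refill_rate).ceil() as u64
}

fn main() {
    // Empty bucket at 10 tokens/sec: deficit 1.0 / 10.0 = 0.1s, rounded up to 1s.
    assert_eq!(retry_after_secs(0.0, 10.0), 1);
    // A full token available: no wait.
    assert_eq!(retry_after_secs(1.0, 10.0), 0);
    println!("Retry-After: {}", retry_after_secs(0.0, 10.0));
}
```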
Summary
| Issue | Severity | Action Required |
|---|---|---|
| `validate()` never called | High | Must call at startup |
| Unbounded HashMap in RateLimiter | High | Add cleanup/eviction |
| Missing `hl_position_changes.side` CHECK | Medium | Add constraint |
| Migration not idempotent | Medium | Add guards |
| No rate limit tests | Medium | Add tests |
| `"anonymous"` fallback key | Low | Consider removing |
| No `Retry-After` header | Low | Nice to have |
| Existing data risk not documented | Low | Add comment |
Verdict: Request changes for the two High-severity items and the missing migration coverage.
- Call `config.validate()` after `AppConfig::load()` in API server startup so invalid config fails fast instead of leaving `validate()` as dead code
- Add `last_used` timestamp and eviction logic to `RateLimiter` to prevent unbounded HashMap growth from unique Bearer tokens
- Add missing CHECK constraint on `hl_position_changes.side`
- Wrap all ADD CONSTRAINT statements in DO/EXCEPTION blocks for idempotent migration reruns
- Add 6 unit tests for `RateLimiter`: capacity, rejection, refill, per-key isolation, eviction tracking, and `last_used` updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: cccef61a79a6
Summary
- Add `ConfigError` enum and `validate()` method to `AppConfig` checking `database_url` (non-empty), `port` (non-zero), and `pool_size` (1-1000). Includes 8 unit tests covering boundary and error cases.
- Add `CHECK` constraints on TEXT columns storing enum-like values: `ingestion_runs.status`/`mode`, `dataset_versions.status`, `wallet_ledger.entry_type`, `networks.finality_model`, `hl_fills.side`, and `hl_fill_records.side`.
- Add in-memory per-key token bucket rate limiting to all `/v1/*` endpoints via `rate_limit_middleware`.

Closes #152, closes #146, closes #148.
Test plan
- `cargo fmt --all --check` passes
- `cargo clippy --workspace --all-targets -- -D warnings` passes
- `cargo test --workspace` passes (939 tests, 0 failures)

🤖 Generated with Claude Code