Skip to content

feat: config validation, enum CHECK constraints, API rate limiting#163

Merged
user1303836 merged 2 commits intomainfrom
fix/medium-issues-152-146-148
Mar 13, 2026
Merged

feat: config validation, enum CHECK constraints, API rate limiting#163
user1303836 merged 2 commits intomainfrom
fix/medium-issues-152-146-148

Conversation

@user1303836
Copy link
Owner

Summary

Closes #152, closes #146, closes #148.

Test plan

  • cargo fmt --all --check passes
  • cargo clippy --workspace --all-targets -- -D warnings passes
  • cargo test --workspace passes (939 tests, 0 failures)
  • Verify CHECK constraints apply correctly against a running database
  • Verify rate limiting returns 429 when bucket is exhausted

🤖 Generated with Claude Code

Add config validation with ConfigError type and validate() method
checking database_url, port, and pool_size constraints (#152).

Add migration with CHECK constraints on enum-like TEXT columns:
ingestion_runs.status/mode, dataset_versions.status,
wallet_ledger.entry_type, networks.finality_model, and
hl_fills/hl_fill_records.side (#146).

Add in-memory per-key token bucket rate limiter to API endpoints
(60 req capacity, 10 tokens/sec refill) with rate_limit_middleware
applied after auth on all /v1/* routes (#148).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: aff67e64a446
Copy link
Owner Author

@user1303836 user1303836 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Config validation, enum CHECK constraints, API rate limiting

I reviewed the full diff and cross-referenced each change against the existing codebase. There are several issues that should be addressed before merging.


1. Config Validation (#152) — validate() is never called

Severity: High — the fix is incomplete.

AppConfig::validate() is defined and well-tested, but it is never called in any production code path. Neither api/src/main.rs nor cli/src/main.rs invokes config.validate() after AppConfig::load(). The validation is dead code.

In api/src/main.rs line 294:

let config = AppConfig::load()?;
// validate() is never called here — invalid configs proceed to connect to the DB

The fix should call config.validate()? immediately after AppConfig::load() in both the API main() and in the CLI main(). Since AppConfig::load() returns Result<Self, Box<figment::Error>> and validate() returns Result<(), ConfigError>, you will also need to either:

  • Change load() to call validate() internally and return ConfigError, or
  • Call validate() at each call site with an appropriate error conversion.

The ConfigError type and From<Box<figment::Error>> conversion are already prepared for this, suggesting the intent was to integrate them — but the integration was not completed.


2. Enum CHECK Constraints (#146) — Missing coverage and not idempotent

2a. wallet_ledger.entry_type includes 'funding', which is not in EntryType enum (Minor, acceptable)

The EntryType enum in core/src/models.rs has 5 variants: Trade, Fee, Transfer, Staking, Income. The CHECK constraint includes a 6th value 'funding'. I verified that "funding" is used in adapters/src/ledger_derivation.rs (line 722) for Hyperliquid funding payments written to wallet_ledger. This is correct — the V1 EntryType model is out of date relative to the Gold-tier wallet_ledger table, and the CHECK constraint correctly covers the actual domain. No issue here.

2b. Missing CHECK constraint on hl_position_changes.side (Medium)

The migration adds CHECK constraints on hl_fills.side and hl_fill_records.side but omits hl_position_changes.side, which is also a TEXT NOT NULL column with the same "B"/"A" domain (see migrations/20260310000000_add_hl_silver_datasets.sql line 94 and the dedup index on line 116). This column should have the same constraint.

2c. The migration is not idempotent (Medium)

The ALTER TABLE ... ADD CONSTRAINT statements will fail if the migration is run against a database where the constraints already exist. PostgreSQL does not support IF NOT EXISTS for ADD CONSTRAINT natively, but the standard pattern for defensive migrations is to wrap each statement in a DO $$ ... $$ block that checks pg_constraint first, or to use DROP CONSTRAINT IF EXISTS before ADD CONSTRAINT. This matters for deployments where migrations might be partially applied or replayed.

2d. Existing data risk not addressed (Minor)

The migration has no comment or safeguard acknowledging that existing data could violate these new CHECK constraints. If a database has rows with values outside the allowed set (e.g., a misspelled status or unexpected entry_type), the migration will fail. A DO $$ block that first scans for violations and logs them, or at minimum a comment noting the assumption, would be prudent.


3. Rate Limiting (#148) — Memory leak risk and no dedicated tests

3a. Unbounded HashMap growth (High — potential memory leak)

The RateLimiter.buckets HashMap grows without bound. Every unique Bearer token that hits the API creates a permanent entry. In production, this could leak memory over time if tokens rotate, or an attacker could exhaust memory by sending requests with many distinct Authorization headers.

The fix should include a periodic cleanup of stale entries (e.g., a background task that prunes buckets that have been idle beyond a TTL), or at minimum a size cap on the HashMap. A common pattern is a background tokio::spawn that runs every N minutes and removes buckets where Instant::now() - last_refill > TTL.

3b. Middleware ordering is correct (Confirmed)

In Axum, .layer() calls wrap in reverse order — the last .layer() added runs first on the inbound path. The rate limiter is added before require_auth:

.layer(rate_limit_middleware)  // added second — runs second (after auth)
.layer(require_auth)           // added last — runs first

This means auth runs first, then rate limiting. The docstring on rate_limit_middleware correctly states "Runs after auth so the key is already validated." The ordering is correct.

3c. "anonymous" fallback key is technically unreachable but still a code smell (Low)

Because require_auth runs first and rejects requests without a valid Bearer token, the unwrap_or("anonymous") fallback in rate_limit_middleware should never be reached in practice. However, having an "anonymous" bucket key that multiple unauthenticated requests could share (if auth were ever skipped for certain routes) is a latent risk. Consider returning an error or using a unique identifier instead of a shared fallback key.

3d. No dedicated rate limiting tests (Medium)

There are no tests that verify rate limiting behavior — no test exhausts the bucket and checks for a 429 response, no test verifies that tokens refill over time, and no test confirms that different keys have independent buckets. The RateLimiter is included in test setup (via AppState), but its behavior is never exercised.

At minimum, add:

  • A unit test for RateLimiter::try_acquire that exhausts the bucket and verifies rejection
  • A unit test that verifies refill over time (using tokio::time::pause/advance)
  • An integration test that sends 61+ requests and expects a 429

3e. Defaults are reasonable (Confirmed)

60 requests capacity with 10 tokens/sec refill means a burst of 60 requests is allowed, and sustained throughput is 10 req/s. This is a reasonable default for an API that runs ingestion and normalization jobs.

3f. Missing Retry-After header (Low)

HTTP 429 responses conventionally include a Retry-After header. While not strictly required, it would improve client experience. The token bucket could compute the expected wait time as (1.0 - bucket.tokens) / refill_rate seconds.


Summary

Issue Severity Action Required
validate() never called High Must call at startup
Unbounded HashMap in RateLimiter High Add cleanup/eviction
Missing hl_position_changes.side CHECK Medium Add constraint
Migration not idempotent Medium Add guards
No rate limit tests Medium Add tests
"anonymous" fallback key Low Consider removing
No Retry-After header Low Nice to have
Existing data risk not documented Low Add comment

Verdict: Request changes for the two High-severity items and the missing migration coverage.

- Call config.validate() after AppConfig::load() in API server startup
  so invalid config fails fast instead of leaving validate() as dead code
- Add last_used timestamp and eviction logic to RateLimiter to prevent
  unbounded HashMap growth from unique Bearer tokens
- Add missing CHECK constraint on hl_position_changes.side
- Wrap all ADD CONSTRAINT statements in DO/EXCEPTION blocks for
  idempotent migration reruns
- Add 6 unit tests for RateLimiter: capacity, rejection, refill,
  per-key isolation, eviction tracking, and last_used updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: cccef61a79a6
@user1303836 user1303836 merged commit 1203f92 into main Mar 13, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant