AIProxy - Security Proxy for AI Agents

Overview

HTTP/HTTPS proxy with access control and WebUI, designed for small containerized installations to serve as a security gateway for AI agents. Single-machine deployment with simple admin-only WebUI.

Core Purpose

  • Terminate all HTTP/HTTPS connections
  • Allow access to known [method, URL] combinations via whitelist
  • Block requests based on blacklist
  • Global and per-rule rate control
  • Collect and persist statistics across restarts
  • Interactive access control with live rule addition and pending request handling

Technology Stack

  • Language: Go (idiomatic code, hundreds of concurrent connections)
  • Proxy Engine: goproxy
  • Configuration: go-flags (github.com/jessevdk/go-flags) - struct-based CLI flags + env vars
  • Logging: slog
  • WebUI: templ + HTML + htmx (real-time updates)
  • Container: Podman (no daemon mode required)
  • TLS: Self-signed certificate generation + TLS bumping
    • Algorithm: ECDSA P-256 (elliptic curve cryptography)
    • Validity: 10 years (balance between operational simplicity and security)
    • Key Size: 256-bit (equivalent to RSA 3072, secure beyond 2030)
  • Storage: JSON files (simple, no external dependencies)

Core Features

1. Proxy Functionality

  • HTTP/HTTPS proxy server (goproxy-based)
  • TLS interception (HTTPS bumping)
  • Certificate management:
    • Auto-generation of CA cert on first run if missing (files, not directory)
    • Certificates stored at the configured --tls-cert and --tls-key paths
    • WebUI endpoint to download CA certificate
    • Loading Modes:
      • Separate files (default): Certificate and key in different files
      • Combined file: Both certificate and key in single PEM file
      • Detection: If --tls-cert and --tls-key paths are identical, load as combined file
      • Combined file format: Certificate block first, then private key block
      • Auto-generation: Always creates separate files (combined mode only for loading)
    • Algorithm: ECDSA P-256 (faster than RSA, smaller keys, modern)
    • Validity Period: 10 years from generation
    • X.509 Extensions:
      • BasicConstraints: CA:TRUE, pathlen:0 (CRITICAL) - can sign certs but not sub-CAs
      • KeyUsage: Certificate Sign, CRL Sign (CRITICAL) - required for CA
      • SubjectKeyIdentifier: Auto-generated by Go from public key hash
    • Subject Naming: Organization="AIProxy CA", CommonName="AIProxy Self-Signed CA"
    • File Permissions: Private key (ca-key.pem): 0600, Certificate (ca-cert.pem): 0644
    • Validation: Strict by default (expiration, CA constraints, key usage, key pair matching, file permissions)
    • Security: Optional --insecure-certs flag allows validation errors to become warnings
  • Request/response inspection
  • Connection termination and forwarding
  • No proxy authentication (network-isolated deployment)
  • CONNECT method blocking (anti-tunneling):
    • Blocks CONNECT to non-443 ports (prevents arbitrary TCP tunnels)
    • Allows CONNECT to port 443 for HTTPS/TLS bumping
    • 1-second delay before returning error (rate-limits scanner behavior)
    • Returns HTTP 403 Forbidden with JSON error
    • Configurable delay via connectBlockDelay constant
    • No CLI flag to disable (security by design, test-only override)
    • Works in combination with TLS bumping (not mutually exclusive)
  • Localhost IP protection (SSRF prevention):
    • Blocks requests targeting 127.0.0.0/8 or ::1
    • 1-second delay before returning error (rate-limits scanner behavior)
    • Returns HTTP 403 Forbidden with JSON error
    • Configurable delay via localhostBlockDelay constant

2. Access Control

  • Request Flow:
    1. Check blacklist → Reject immediately (HTTP 403)
    2. Check whitelist → Allow
    3. Unknown request → Hold as pending (deduplicated)
    4. Pending timeout: 120 seconds (configurable)
    5. Expired pending → Reject as blacklisted
  • Rule Matching: Rich JSON rule objects matched against method, scheme, host, path, and port
    • Example: {"id": "allow-openai-get", "method": "GET", "scheme": "https", "host": "api.openai.com", "path": "/**"}
    • Example: {"id": "allow-openai-post", "method": "POST", "scheme": "https", "host": "api.openai.com"}
  • Blacklist Behavior:
    • Rules loaded from --blacklist-rules file at startup
    • Checked AFTER localhost blocking, BEFORE global rate limiting (Request Flow step 6)
    • First matching rule blocks the request immediately (HTTP 403)
    • No delay before rejection (unlike SSRF/CONNECT blockers which rate-limit scanners)
    • Logged at WARN level with structured attributes (request_id, method, url, matched_rule, remote_addr)
    • Missing file is not an error: the proxy starts with an empty blacklist, so no request is blocked by blacklist rules
    • Static rules only in v1; runtime rules via WebUI added in Phase 6 (--rt-blacklist-rules)
  • Pending Request Management:
    • Unlimited queue size (until OOM)
    • Deduplication of identical pending requests
    • Persistence across restarts (pending.json)
    • Keep expired/timed-out requests for later admin review
    • Admin can create whitelist/blacklist rules from pending requests
  • Rule Storage (see Storage section)
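
The blacklist, whitelist, pending order of the request flow above can be sketched as a small decision function. The Decision type, the matchFunc signature, and the closures in main are hypothetical stand-ins for the real rule engine.

```go
package main

import "fmt"

// Decision is the outcome of the access-control check (hypothetical type).
type Decision int

const (
	Reject Decision = iota // blacklisted: HTTP 403 immediately
	Allow                  // whitelisted: forward the request
	Hold                   // unknown: keep as pending until admin action or timeout
)

// matchFunc stands in for a lookup against one rule set.
type matchFunc func(method, url string) bool

// decide applies the documented order: blacklist first, then whitelist,
// and everything else becomes a pending request.
func decide(method, url string, blacklisted, whitelisted matchFunc) Decision {
	switch {
	case blacklisted(method, url):
		return Reject
	case whitelisted(method, url):
		return Allow
	default:
		return Hold
	}
}

func main() {
	black := func(_, url string) bool { return url == "https://evil.example.com/" }
	white := func(method, url string) bool {
		return method == "GET" && url == "https://api.openai.com/v1/models"
	}
	fmt.Println(decide("GET", "https://evil.example.com/", black, white) == Reject)
	fmt.Println(decide("GET", "https://api.openai.com/v1/models", black, white) == Allow)
	fmt.Println(decide("POST", "https://unknown.example.org/", black, white) == Hold)
}
```

Because the blacklist is consulted first, a request matching both rule sets is rejected.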

3. Rate Control (Interval-Based)

  • Simple Interval-Based Rate Limiting:
    • If limit is 10 req/min: interval = 60s / 10 = 6 seconds between requests
    • First request → Process immediately
    • Subsequent request → Hold for remaining interval time (if needed)
    • Example: 10 rpm = minimum 6 seconds between requests to same endpoint
  • Global Rate Limit: Set in req/min via --global-rate-limit (default: 0 = unlimited)
  • Per-Rule Rate Limit: Optional override in whitelist rules
  • Rate Limiting Scope:
    • Track per-rule (matched whitelist rule determines rate limit)
    • Stats granularity follows rule granularity
  • Rate Limit State:
    • In-memory tracking (reset on restart acceptable)
    • Track: last request timestamp per rule
    • Calculate delay: max(0, interval - time_since_last_request)
  • WebUI Rate Control Viewer:
    • Separate htmx page showing requests currently held by rate control
    • Display: request details, matched rule, hold time remaining
    • Real-time updates via SSE

4. Statistics & Monitoring

  • Per-Rule Stats (persisted to stats.json):
    • Stats granularity matches rule definitions
    • Track stats per matched rule (not per unique URL)
    • Total request count per rule
    • First request timestamp
    • Last request timestamp
    • Example: Rule GET https://api.openai.com/** tracks all matching requests as one stat
  • Application Log (slog, text format):
    • Default: stdout
    • Optional file output with rotation via lumberjack: --log-file
    • Rotation: size-based with configurable max size, max age, max backups
    • Configurable: --log-max-size (default 10 MB), --log-max-backups (default 3), --log-max-age (default 0 = no limit)
  • Access Log (Apache/Nginx-style, TODO):
    • Fixed hardcoded format (no customization needed)
    • Log rotation: size-based, shares lumberjack rotation with application log (max size, max age, max backups)
    • Log format: {timestamp} {client_ip} {method} {url} {status} {response_time_ms} {action} {matched_rule}
    • Action values: allowed, blocked_blacklist, blocked_timeout, rate_limited
    • Text-based format (not JSON) for easy grepping
    • Example line: 2026-03-27T10:15:30Z 10.0.0.5 GET https://api.openai.com/v1/chat 200 450ms allowed rule:allow-openai-get
  • Real-Time Stats for WebUI:
    • Total requests (allowed, blocked, rate-limited, pending)
    • Requests by status code
    • Current pending count
    • Current rate-limited count

5. WebUI Features

  • Authentication:
    • Admin-only access (no RBAC)
    • Secret-based login (secret provided as command-line argument)
    • Simple session cookie after successful login
  • Dashboard:
    • Key metrics (total requests, blocked, pending count, rate-limited count)
    • Statistics visualization (per-rule stats from stats.json)
  • Pending Requests Management:
    • Real-time table showing pending requests (SSE updates)
    • Deduplication: Single entry for identical requests, show "waiters count"
    • Countdown timer for each pending request
    • Actions: Add to whitelist, Add to blacklist, Ignore
    • Review timed-out/expired pending requests (historical view)
  • Rate-Limited Requests Viewer:
    • Separate htmx page with SSE updates
    • Show requests currently being held by rate control
    • Display: method, URL, matched rule, delay remaining, client IP
    • Auto-remove from view when request completes
  • Rule Management:
    • View whitelist.json / blacklist.json (read-only, user-managed)
    • View/edit whitelist2.json / blacklist2.json (runtime rules)
    • Add/edit/delete rules in whitelist2/blacklist2
    • Rule format: same fields as the static rule files (id, method, scheme, host, path, port options, optional rpm; see File Schemas)
    • Rules take effect immediately (no restart)
  • Access Log Viewer:
    • Paginated text viewer (all rotated log files)
    • Simple search/filter (grep-like)
    • Tail mode (show last N lines, auto-refresh)
  • Certificate Download:
    • Public endpoint /download-cert (no authentication required)
    • Returns CA certificate PEM file

6. Configuration

  • CLI Flags + Environment Variables (no config file):
    • Implementation: go-flags library provides struct-based configuration
    • Single struct definition with tags for both flags and env vars
    • --listen / AIPROXY_LISTEN - Proxy listen address (default: localhost:0)
    • --webui-listen / AIPROXY_WEBUI_LISTEN - WebUI listen address (default: "" = disabled; specify an address to enable WebUI in both daemon and wrapper modes)
    • --tls-cert / AIPROXY_TLS_CERT - TLS certificate path (optional, auto-generate if missing); relative paths are resolved to absolute at startup
    • --tls-key / AIPROXY_TLS_KEY - TLS key path (optional, auto-generate if missing); relative paths are resolved to absolute at startup
    • Combined file usage: Set both --tls-cert and --tls-key to the same path (e.g., --tls-cert ./certs/combined.pem --tls-key ./certs/combined.pem)
    • --admin-secret / AIPROXY_ADMIN_SECRET - Admin authentication secret (optional; WebUI login disabled if empty)
    • --blacklist-rules / AIPROXY_BLACKLIST_RULES - Blacklist rules file (default: rules/blacklist.json)
    • --whitelist-rules / AIPROXY_WHITELIST_RULES - Whitelist rules file (default: rules/whitelist.json)
    • --rt-blacklist-rules / AIPROXY_RT_BLACKLIST_RULES - Runtime blacklist rules file, WebUI-managed (default: data/blacklist2.json) (Phase 6 feature)
    • --rt-whitelist-rules / AIPROXY_RT_WHITELIST_RULES - Runtime whitelist rules file, WebUI-managed (default: data/whitelist2.json) (Phase 6 feature)
    • --pending-timeout / AIPROXY_PENDING_TIMEOUT - Pending request timeout (default: 120s)
    • --global-rate-limit / AIPROXY_GLOBAL_RATE_LIMIT - Global rate limit in req/min (default: 0 = unlimited)
    • --log-level / AIPROXY_LOG_LEVEL - Log level: debug, info, warn, error (default: info)
    • --log-file / AIPROXY_LOG_FILE - Log file path (empty = stdout) (default: "")
    • --log-max-size / AIPROXY_LOG_MAX_SIZE - Max log file size in MB before rotation (default: 10)
    • --log-max-age / AIPROXY_LOG_MAX_AGE - Max days to retain old log files, 0 = no limit (default: 0)
    • --log-max-backups / AIPROXY_LOG_MAX_BACKUPS - Max number of old log files to retain (default: 3)
    • --connection-timeout / AIPROXY_CONNECTION_TIMEOUT - Connection timeout (default: 30s)
    • --request-timeout / AIPROXY_REQUEST_TIMEOUT - Request timeout (default: 300s)
    • --insecure-certs / AIPROXY_INSECURE_CERTS - Allow insecure certificates (validation errors become warnings) (default: false)
  • Configuration Priority: Flags override environment variables
  • Validation: Required fields enforced, log-level choices validated by go-flags
  • Auto-generated help: --help flag automatically displays all options
  • No hot-reload: Configuration changes require restart
  • Streamable Request/Response Processing:
    • Use io.Copy and streaming for large payloads
    • Minimal memory buffering (no full request/response in memory)
    • No artificial size limits (rely on timeouts for protection)

7. Command Execution Wrapper

Purpose: Simplify testing and one-off proxy usage by wrapping commands with automatic proxy setup.

Syntax:

./aiproxy [proxy-flags] -- <command> [command-args]

Behavior:

  1. Parse everything after -- delimiter as command to execute
  2. Start proxy server in background and wait until ready (listener bound)
  3. Set environment variables for command (proxy URLs, CA cert paths)
  4. Execute command with proxy environment
  5. Forward stdin/stdout/stderr directly to/from command
  6. Log command start and completion with exit code
  7. When command exits, shut down proxy gracefully
  8. Exit with command's exit code

Environment Variables Set:

  • HTTP_PROXY, HTTPS_PROXY, http_proxy, https_proxy - Proxy address (e.g., http://localhost:12345)
  • SSL_CERT_FILE - CA certificate path (OpenSSL, curl, Ruby); always an absolute path
  • CURL_CA_BUNDLE - CA certificate path (curl); always an absolute path
  • REQUESTS_CA_BUNDLE - CA certificate path (Python requests library); always an absolute path
  • NODE_EXTRA_CA_CERTS - CA certificate path (Node.js); always an absolute path
  • All parent process environment variables are inherited

Examples:

# Wrap curl - simplest usage
./aiproxy -- curl https://api.github.com

# Wrap with proxy flags
./aiproxy --global-rate-limit 10 -- curl https://api.openai.com/v1/models

# Wrap git clone
./aiproxy -- git clone https://github.com/golang/go /tmp/test

# Wrap Python script
./aiproxy -- python3 my_script.py

# Wrap with complex command arguments
./aiproxy -- git log --since yesterday -- file.txt

# Without wrapper (daemon mode - current behavior)
./aiproxy  # No "--" delimiter

Exit Codes:

  • 0 - Command exited successfully (wrapper mode) OR daemon clean exit
  • N - Command's exit code (wrapper mode, where N is command's actual exit code)
  • 1 - Proxy runtime error OR command execution failure (command not found)
  • 2 - Configuration error OR empty command after -- delimiter

Mode Detection:

  • No -- delimiter in arguments: Daemon mode (foreground proxy server, existing behavior)
  • With -- delimiter: Wrapper mode (execute command, then exit)

Edge Cases:

  • Empty command after --: Configuration error, exit code 2
  • Command not found: Execution error, exit code 1
  • Proxy fails to start before command runs: Runtime error, exit code 1
  • Command with -- in its arguments: Works correctly (only first -- is delimiter)
  • IPv6 proxy address: Works correctly (brackets preserved in URL)
  • Long-running command: Proxy runs until command exits (by design)
  • Ctrl+C (SIGINT): Context cancellation propagates to both proxy and command
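
Mode detection and the "only the first -- is the delimiter" rule can be illustrated with a small helper that would run on os.Args[1:] before go-flags sees the arguments. The name splitArgs and its return values are hypothetical.

```go
package main

import "fmt"

// splitArgs splits arguments at the first "--".
// wrapper is false when no delimiter is present (daemon mode).
func splitArgs(args []string) (proxyArgs, command []string, wrapper bool) {
	for i, a := range args {
		if a == "--" {
			// Later "--" tokens belong to the wrapped command.
			return args[:i], args[i+1:], true
		}
	}
	return args, nil, false
}

func main() {
	proxyArgs, command, wrapper := splitArgs([]string{
		"--global-rate-limit", "10", "--", "git", "log", "--since", "yesterday", "--", "file.txt",
	})
	fmt.Println(wrapper, proxyArgs, command)

	// An empty command after "--" is a configuration error (exit code 2).
	_, command, wrapper = splitArgs([]string{"--listen", "localhost:0", "--"})
	fmt.Println(wrapper, len(command) == 0)
}
```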

Implementation:

  • CLI parsing: Manual os.Args parsing before go-flags (detect -- delimiter)
  • Command execution: internal/runner package with Run() function
  • Main orchestration: Simple piping code in cmd/aiproxy/main.go
  • Context cancellation: Graceful proxy shutdown when command exits
  • Logging: Structured logging with slog for command lifecycle

Architecture Components

Request Flow (Detailed)

1. Client request arrives at proxy
2. TLS bumping (if HTTPS CONNECT to :443)
3. Extract: method, URL, headers, client IP
4. Check CONNECT METHOD to non-443 ports (anti-tunneling protection)
   └─> CONNECT to non-443 port? → Sleep 1s → Reject with HTTP 403 + JSON error + Log at WARN
   └─> CONNECT to :443? → TLS bump (MITM) → Continue to step 5
5. Check LOCALHOST IPs (DNS resolution)
   └─> Resolves to 127.0.0.0/8 or ::1? → Sleep 1s → Reject with HTTP 403 + JSON error + Log at ERROR
6. Check BLACKLIST (glob match)
   └─> Match? → Reject with HTTP 403 + JSON {"error": "forbidden", "reason": "blacklisted", "request_id": "req_N"} + Log at WARN
7. Check WHITELIST (glob match)
   └─> Match? → Check rate limit (interval-based) → Forward (streaming) → Log stats
8. Unknown request (not in whitelist/blacklist):
   └─> Add to pending queue (deduplicated by method+URL)
   └─> Hold connection silently for timeout (default 120s)
   └─> Notify WebUI via SSE (new pending request)
   └─> If approved by admin → Add to whitelist2 → Process request
   └─> If denied by admin → Add to blacklist2 → Reject request
   └─> If timeout expires → Reject with HTTP 403 + Keep in pending.json for review
9. Rate limiting (if rule has rpm):
   └─> Calculate interval: 60 / rpm seconds
   └─> Check time since last request to this rule
   └─> If interval not elapsed → Sleep for remaining time
   └─> Update last request timestamp
10. Forward allowed requests to upstream (streaming, no buffering)
11. Stream response back to client
12. Update stats.json, append to access.log

Storage Architecture

Rule File Structure

rules/                          # User-managed rule files (read-only by proxy)
  ├── whitelist.json      # User-defined whitelist rules (--whitelist-rules)
  └── blacklist.json      # User-defined blacklist rules (--blacklist-rules)

data/                          # Runtime state files (created by proxy)
  ├── whitelist2.json     # Runtime whitelist rules (--rt-whitelist-rules, via WebUI)
  ├── blacklist2.json     # Runtime blacklist rules (--rt-blacklist-rules, via WebUI)
  ├── pending.json        # Pending requests queue (created by proxy)
  └── stats.json          # Per-rule statistics (created by proxy)

TLS certificate files are stored at the configured `--tls-cert` and `--tls-key` paths.

File Schemas

whitelist.json / blacklist.json (--whitelist-rules / --blacklist-rules files):

  • Array of rule objects with the following fields:
    • id (string, required): Unique identifier within the file. Rules are sorted and matched by ID (lexicographic order).
    • comment (string, optional): Human-readable description, ignored during matching.
    • method (string, optional): HTTP method to match (e.g., "GET", "POST"). Omit to match any method.
    • scheme (string, optional): URL scheme to match (e.g., "http", "https"). Omit to match any scheme.
    • host (string, optional): Hostname glob pattern (e.g., "api.openai.com", "*.example.com"). Matches URL.Hostname() (no port). Omit to match any host.
    • path (string, optional): URL path glob pattern (e.g., "/v1/*", "/**"). Omit to match any path.
    • port (int, optional): Exact port to match. Mutually exclusive with port_range and port_ranges.
    • port_range ([2]int, optional): Port range [low, high] (inclusive). Both values must be > 0 and low ≤ high. Mutually exclusive with port and port_ranges.
    • port_ranges ([][2]int, optional): Array of port ranges. Each element follows same rules as port_range. Omit (not []) to match any port. Mutually exclusive with port and port_range.
    • rpm (int, optional): Per-rule rate limit in requests per minute. Field stored but enforcement is a future phase (v1: field accepted, not enforced).
  • All present (non-empty/non-zero) fields must match (AND logic); absent fields match anything.
  • Rules are sorted lexicographically by id before matching; file order does not affect behavior.
  • Missing file is not an error (proxy starts with empty rule set).
  • Example:
    [
      {"id": "allow-openai-chat", "comment": "Allow ChatGPT API", "method": "POST", "scheme": "https", "host": "api.openai.com", "path": "/v1/chat/**"},
      {"id": "allow-openai-get",  "method": "GET",  "scheme": "https", "host": "api.openai.com"},
      {"id": "block-admin",       "scheme": "https", "host": "*.example.com", "path": "/admin/**"}
    ]
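
Loading such a file might look like the sketch below, which decodes the array, enforces the id and port constraints described above, and sorts the rules by id. The Rule struct mirrors the documented JSON fields, but the Go names, the validate method, and loadRules are assumptions; how an explicit empty port_ranges array should be treated is left open here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Rule mirrors the documented JSON fields (Go names are hypothetical).
type Rule struct {
	ID         string   `json:"id"`
	Comment    string   `json:"comment,omitempty"`
	Method     string   `json:"method,omitempty"`
	Scheme     string   `json:"scheme,omitempty"`
	Host       string   `json:"host,omitempty"`
	Path       string   `json:"path,omitempty"`
	Port       int      `json:"port,omitempty"`
	PortRange  *[2]int  `json:"port_range,omitempty"`
	PortRanges [][2]int `json:"port_ranges,omitempty"`
	RPM        int      `json:"rpm,omitempty"`
}

// validate checks the id and the mutually exclusive port fields.
func (r Rule) validate() error {
	if r.ID == "" {
		return fmt.Errorf("rule without id")
	}
	set := 0
	if r.Port != 0 {
		set++
	}
	if r.PortRange != nil {
		set++
	}
	if r.PortRanges != nil {
		set++
	}
	if set > 1 {
		return fmt.Errorf("rule %q: port, port_range and port_ranges are mutually exclusive", r.ID)
	}
	ranges := r.PortRanges
	if r.PortRange != nil {
		ranges = append(ranges, *r.PortRange)
	}
	for _, pr := range ranges {
		if pr[0] <= 0 || pr[1] <= 0 || pr[0] > pr[1] {
			return fmt.Errorf("rule %q: invalid port range %v", r.ID, pr)
		}
	}
	return nil
}

// loadRules decodes, validates, and sorts rules lexicographically by id.
func loadRules(data []byte) ([]Rule, error) {
	var rules []Rule
	if err := json.Unmarshal(data, &rules); err != nil {
		return nil, err
	}
	seen := make(map[string]bool, len(rules))
	for _, r := range rules {
		if err := r.validate(); err != nil {
			return nil, err
		}
		if seen[r.ID] {
			return nil, fmt.Errorf("duplicate rule id %q", r.ID)
		}
		seen[r.ID] = true
	}
	sort.Slice(rules, func(i, j int) bool { return rules[i].ID < rules[j].ID })
	return rules, nil
}

func main() {
	rules, err := loadRules([]byte(`[
	  {"id": "block-admin", "scheme": "https", "host": "*.example.com", "path": "/admin/**"},
	  {"id": "allow-openai-get", "method": "GET", "scheme": "https", "host": "api.openai.com"}
	]`))
	if err != nil {
		panic(err)
	}
	for _, r := range rules {
		fmt.Println(r.ID)
	}
}
```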

blacklist.json and whitelist.json share the object-array format described above.

whitelist2.json / blacklist2.json (--rt-whitelist-rules / --rt-blacklist-rules files):

  • Same schema as whitelist/blacklist
  • The user can manually merge these into the static rule files while the proxy is stopped

pending.json (data/ directory):

  • Schema: {id, method, url, headers_sample, client_ip, timestamp, status, waiters_count}
  • Status: pending, expired, approved, denied
  • Deduplication: Single entry per unique method+URL, multiple waiters counted

stats.json (data/ directory):

  • Schema: {rule_id: {rule_pattern, count, first_seen, last_seen}}
  • Stats match rule granularity (not per unique URL)

access.log (TODO - access log feature not yet implemented):

  • Text format (Apache/Nginx-style), fixed hardcoded format
  • Format: {timestamp} {client_ip} {method} {url} {status} {response_time_ms} {action} {matched_rule_id}
  • Rotation: size-based, configurable via --log-max-size, --log-max-age, --log-max-backups

Rule Matching Engine

  • Uses internal/reqrules package for whitelist/blacklist rule storage and matching
  • Glob pattern matching via bmatcuk/doublestar (for host and path fields)
  • Whitelist and blacklist loaded into memory at startup
  • Runtime rules (whitelist2/blacklist2) merged with user rules
  • Rule reload: WebUI changes trigger in-memory rule update (no file re-read needed)
  • Rules stored and matched in lexicographic order by id
  • Match logic: all present (non-empty/non-zero) fields must match (AND); absent = wildcard
  • host field matches URL.Hostname() (no port); path matches URL.Path
  • Port matching: port (exact), port_range ([low,high] inclusive), port_ranges (list of ranges); at most one may be set

Rate Limiting Engine (Interval-Based)

  • Simple interval calculation per rule
  • Algorithm:
    interval_seconds = 60.0 / rpm
    time_since_last = now() - last_request_time[rule_id]
    if time_since_last < interval_seconds:
        sleep(interval_seconds - time_since_last)
    last_request_time[rule_id] = now()
    forward_request()
    
  • In-memory state: map[rule_id]last_request_timestamp
  • No token bucket, no sliding window - just simple intervals
  • State resets on proxy restart (acceptable)
  • Global rate limit applied if no per-rule limit specified
  • Track currently rate-limited requests for WebUI display

Pending Request Queue

  • In-memory queue + persistence to pending.json
  • Deduplication key: method + url (exact match)
  • Single entry for identical requests, track waiters count
  • Each unique pending request spawns one goroutine
  • Goroutine holds client connection, waits for:
    • Admin approval → Proceed with request
    • Admin denial → Return HTTP 403
    • Timeout expiration → Return HTTP 403, mark as expired in pending.json
  • SSE channel for WebUI notifications (new pending, approved, denied, expired)
  • Admin actions from WebUI:
    • Approve → Add rule to whitelist2.json → Process all waiting requests
    • Deny → Add rule to blacklist2.json → Reject all waiting requests
    • Ignore → Do nothing, let timeout handle it
  • Persist state changes immediately to pending.json

Security Considerations

  • WebUI authentication via admin secret (command-line arg)
  • Simple session cookie after login (httpOnly, secure)
  • Input validation for all WebUI inputs:
    • URL patterns (prevent glob DoS like **/**/**/**)
    • Rate limits (positive integers only)
    • Method patterns (limited charset)
  • CONNECT method protection (anti-tunneling):
    • CONNECT to non-443 ports blocked (prevents arbitrary TCP tunnel establishment)
    • CONNECT to port 443 allowed for HTTPS/TLS bumping (MITM inspection)
    • Logged at WARN level (security-relevant but less critical than SSRF)
    • 1-second delay before rejection (rate-limits scanning behavior)
    • Test-only disable flag (DisableConnectBlocking in proxy.Config)
    • No production bypass mechanism (secure by default, no CLI flag)
    • Implemented via conditional goproxy handlers (Not(ReqHostMatches(":443$")))
  • Localhost IP protection (SSRF prevention):
    • Blocks requests targeting 127.0.0.0/8 or ::1
    • Logged at ERROR level (potential SSRF attempt)
    • 1-second delay before rejection (rate-limits scanning behavior)
    • Test-only disable flag (DisableLocalhostBlocking in proxy.Config)
    • No production bypass mechanism (secure by default, no CLI flag)
  • Certificate private key permissions (0600)
  • Certificate validation (strict by default):
    • Expiration check (error if expired or not yet valid, warn if expiring within 30 days)
    • CA constraints validation (BasicConstraintsValid && IsCA must be true)
    • KeyUsage validation (must include KeyUsageCertSign)
    • Public/private key pair matching (verify keys correspond)
    • File permission checks (warn if private key is not 0600)
    • Optional --insecure-certs flag downgrades validation errors to warnings
  • Upstream HTTPS certificate validation:
    • Proxy validates upstream server certificates using system CA trust store
    • Invalid certificate response: HTTP 502 Bad Gateway with generic JSON error
    • Security constraint: NO certificate details exposed to client (prevents information disclosure to untrusted AI agents)
    • Certificate error details logged at ERROR level for operator debugging only
    • Validation failures include: expired certificates, invalid CA signature, hostname mismatch, revoked certificates (if CRL/OCSP available)
    • Client authentication: Not supported in v1 (no mTLS requirement for clients connecting to proxy)
  • No sensitive data in access logs (do not log full headers)
  • Error responses include minimal information:
    {
      "error": "forbidden",
      "reason": "not in whitelist",
      "request_id": "req_123"
    }
  • Secure defaults:
    • Default deny for unknown requests (pending → timeout → reject)
    • Reasonable timeouts (connection: 30s, request: 300s)
    • Streaming request/response (no memory exhaustion from large payloads)
    • Certificate download endpoint public (CA cert alone is not sensitive)

Container Deployment

  • Containerfile (Podman/Docker compatible)
  • Multi-stage build (build + runtime)
  • Volume mounts:
    • /rules - User rule files (whitelist.json, blacklist.json)
    • /data - Runtime state (whitelist2, blacklist2, pending, stats)
  • TLS certificate storage directory (path determined by --tls-cert/--tls-key; can be empty, certs auto-generated if missing)
  • Port exposure:
    • Proxy port (as set by --listen, e.g. 8080)
    • WebUI port (as set by --webui-listen, e.g. 8081; the WebUI is disabled when unset)
  • Command-line args passed to container:
    • --admin-secret (needed for WebUI login; can be set via the AIPROXY_ADMIN_SECRET env var)
  • No graceful shutdown requirement (just must not crash)
  • Run as non-root user inside container

Project Structure

aiproxy/
├── cmd/
│   └── aiproxy/
│       └── main.go              # Entry point
├── internal/
│   ├── proxy/
│   │   ├── proxy.go             # goproxy setup, TLS bumping
│   │   └── handler.go           # Request handling logic, streaming
│   ├── rules/
│   │   ├── matcher.go           # Glob matching engine
│   │   ├── whitelist.go         # Whitelist management
│   │   ├── blacklist.go         # Blacklist management
│   │   └── loader.go            # Load rules from JSON (whitelist + whitelist2, etc.)
│   ├── pending/
│   │   ├── queue.go             # Pending request queue, deduplication
│   │   └── persistence.go       # pending.json I/O
│   ├── ratelimit/
│   │   └── limiter.go           # Interval-based rate limiting
│   ├── stats/
│   │   ├── collector.go         # Statistics collection (per-rule)
│   │   ├── persistence.go       # stats.json I/O
│   │   └── accesslog.go         # Access log rotation (text format)
│   ├── certs/
│   │   └── manager.go           # Certificate generation/loading
│   ├── config/
│   │   └── config.go            # Configuration loading (CLI flags + env vars, `--` parsing)
│   ├── runner/
│   │   └── runner.go            # Command execution wrapper (Run, buildEnvironment)
│   └── webui/
│       ├── server.go            # HTTP server, SSE endpoints
│       ├── auth.go              # Authentication middleware (session cookies)
│       ├── handlers/
│       │   ├── dashboard.go     # Dashboard page
│       │   ├── pending.go       # Pending requests API + SSE
│       │   ├── ratelimit.go     # Rate-limited requests viewer + SSE
│       │   ├── rules.go         # Rule management API
│       │   ├── logs.go          # Access log viewer
│       │   └── certs.go         # Certificate download (public)
│       ├── static/              # Embedded static assets (embed.FS)
│       │   ├── htmx.min.js      # htmx v4.0.0-beta1 (vendored)
│       │   ├── hx-sse.min.js    # htmx v4 SSE extension (vendored)
│       │   └── pico.min.css     # Pico CSS (vendored)
│       └── templates/           # templ files
│           ├── layout.templ
│           ├── dashboard.templ
│           ├── pending.templ
│           ├── ratelimit.templ
│           └── rules.templ
├── rules/                       # Volume mount (not in git)
│   ├── whitelist.json           # User-managed
│   └── blacklist.json           # User-managed
├── data/                        # Volume mount (not in git)
│   ├── whitelist2.json          # Proxy-managed (runtime rules)
│   ├── blacklist2.json          # Proxy-managed (runtime rules)
│   ├── pending.json             # Proxy-managed
│   └── stats.json               # Proxy-managed
├── certs/                       # Example certificate directory (not in git)
│   ├── ca-cert.pem
│   └── ca-key.pem
├── scripts/                     # Manual testing scripts
│   └── manual_cert_tests.sh
├── Containerfile
├── go.mod
├── go.sum
├── IDEA.md
├── TODO.md
└── README.md

Implementation Plan (Suggested Order)

Phase 1: Core Proxy

  1. Project setup (go.mod, directory structure)
  2. Configuration loading (CLI flags + environment variables)
  3. Certificate management (generation + loading)
  4. Basic goproxy setup with TLS bumping
  5. Request logging (slog)

Phase 2: Rule Engine

  1. Glob pattern matching
  2. Whitelist/blacklist loading (JSON)
  3. Request filtering (blacklist → whitelist → unknown)
  4. HTTP 403 responses for blocked requests

Phase 3: Pending Queue

  1. Pending request queue (in-memory)
  2. Request holding mechanism (goroutines + timeout)
  3. Deduplication logic
  4. Persistence (pending.json)

Phase 4: Rate Limiting

  1. Interval-based rate limiter (simple sleep/hold implementation)
  2. Global rate limit
  3. Per-rule rate limit
  4. Request delay logic

Phase 5: Statistics & Logging

  1. Stats collector (per-rule, not per-URL)
  2. Stats persistence (stats.json)
  3. Access log writer (Apache/Nginx-style text format)
  4. Access log rotation (size-based; configurable max size, max age, and backup count)

Phase 6: WebUI

  1. Basic HTTP server + authentication (session cookies)
  2. SSE infrastructure for real-time updates
  3. templ setup + htmx integration
  4. Dashboard (read-only stats display)
  5. Pending requests viewer (real-time via SSE, deduplication display)
  6. Rate-limited requests viewer (separate page, real-time via SSE)
  7. Rule management UI (add/edit/delete whitelist2/blacklist2)
  8. Access log viewer (tail mode, search/filter)
  9. Certificate download endpoint (public, no auth)

Phase 7: Container & Documentation

  1. Containerfile (multi-stage build)
  2. Example whitelist.json and blacklist.json
  3. Integration testing (manual testing workflow)
  4. Documentation (README.md with usage examples)

Dependencies (Estimated)

  • github.com/elazarl/goproxy - HTTP/HTTPS proxy with MITM support
  • github.com/bmatcuk/doublestar/v4 - Glob pattern matching (supports **)
  • github.com/a-h/templ - Type-safe HTML templates
  • github.com/jessevdk/go-flags - Struct-based CLI flags + env vars
  • gopkg.in/natefinch/lumberjack.v2 - Size-based log file rotation
  • Standard library: net/http, log/slog, encoding/json, crypto/tls, crypto/x509, io
  • No external rate limiting library (simple interval-based implementation)
  • No external database (JSON files only)

Design Decisions (Finalized)

Authentication & Security

  1. Admin Secret: Provided via --admin-secret command-line argument (optional)
    • If empty/not provided: WebUI login disabled (authentication always fails)
    • Warning logged at startup when admin secret is empty
    • Certificate download endpoint remains public (no auth required)
    • Secure by default: no secret = no admin access
  2. WebUI Auth: Simple session cookie after login, no complex RBAC
  3. Certificate Download: Public endpoint (no auth), CA cert alone is not sensitive
  4. Error Responses: Minimal JSON format to simplify testing and debugging
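A minimal sketch of the error body, assuming the three fields shown for the pending-timeout response in this document (error, reason, request_id). The errorBody type and encodeError helper are hypothetical names used only for illustration; the project's own errorResponse helper may be structured differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// errorBody mirrors the minimal JSON error shape described in this document.
// The type name is hypothetical; the JSON keys come from the documented body.
type errorBody struct {
	Error     string `json:"error"`
	Reason    string `json:"reason"`
	RequestID string `json:"request_id"`
}

// encodeError renders the body as compact JSON.
func encodeError(errCode, reason, requestID string) (string, error) {
	b, err := json.Marshal(errorBody{Error: errCode, Reason: reason, RequestID: requestID})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := encodeError("forbidden", "blacklisted", "req_42")
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

Keeping the body to a fixed, flat set of string fields makes it easy to assert on in tests with a plain string or struct comparison.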

Access Control

  1. Pending Deduplication: Single entry for identical requests, multiple waiters counted
  2. Hold Behavior: Hold connection silently (no feedback to client until timeout/approval)
  3. Default Policy: Deny unknown requests after timeout (secure by default)
  4. Rule Matching: Glob patterns for host and path fields; exact match for method, scheme, port; all present fields ANDed
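The matching and classification rules above can be sketched as follows. This is an illustration under stated assumptions, not the project's reqrules package: the Rule type here is a simplified hypothetical stand-in, the port field is omitted, and the standard library's path.Match is swapped in for doublestar, so `**` is not supported in this sketch.

```go
package main

import (
	"fmt"
	"path"
)

// Rule is a simplified, hypothetical stand-in for the document's rule type.
// An empty field is treated as "not present" and matches anything.
type Rule struct {
	Method string // exact match
	Scheme string // exact match
	Host   string // glob
	Path   string // glob
}

// globMatch uses path.Match as a stand-in for doublestar.
// Unlike doublestar it has no "**" support, and "*" does not cross "/".
func globMatch(pattern, s string) bool {
	ok, err := path.Match(pattern, s)
	return err == nil && ok
}

// Matches ANDs every field that is present on the rule.
func (r Rule) Matches(method, scheme, host, p string) bool {
	if r.Method != "" && r.Method != method {
		return false
	}
	if r.Scheme != "" && r.Scheme != scheme {
		return false
	}
	if r.Host != "" && !globMatch(r.Host, host) {
		return false
	}
	if r.Path != "" && !globMatch(r.Path, p) {
		return false
	}
	return true
}

// classify applies the documented order: blacklist, then whitelist, else pending.
func classify(black, white []Rule, method, scheme, host, p string) string {
	for _, r := range black {
		if r.Matches(method, scheme, host, p) {
			return "block"
		}
	}
	for _, r := range white {
		if r.Matches(method, scheme, host, p) {
			return "allow"
		}
	}
	return "pending"
}

func main() {
	black := []Rule{{Method: "DELETE"}}
	white := []Rule{{Scheme: "https", Host: "*.example.com", Path: "/v1/*"}}
	fmt.Println(classify(black, white, "GET", "https", "api.example.com", "/v1/models"))
	fmt.Println(classify(black, white, "DELETE", "https", "api.example.com", "/v1/models"))
	fmt.Println(classify(black, white, "GET", "http", "api.example.com", "/v1/models"))
}
```

Because the blacklist is consulted first, a request that matches both lists is blocked, which keeps the deny side authoritative.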

Rate Limiting

  1. Algorithm: Simple interval-based (60 / rpm = seconds between requests)
  2. Behavior: Hold/sleep request for remaining interval time (no rejection)
  3. Scope: Per-rule (not per-client or per-URL)
  4. Middleware Architecture: Native goproxy chaining via OnRequest().DoFunc() — each rate limiter is a separate middleware registered in order. If global rate limit is 0, the middleware is not registered at all.
  5. Delayed Request Entity: DelayedRequest struct holds *http.Request directly (no field duplication), RequestID (dedicated uint64 type with req_N string format), Delay (hold duration), and Status (dedicated enum type with String()). Shared DelayedRequestStore is used by both global and per-rule rate limiters.
  6. Delayed Request Logging: When a delayed request is sent after the sleep, a separate log entry is generated with message "Delayed request sent" containing the same fields as the initial request log plus an additional delay parameter. The initial request log is unchanged.
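A minimal sketch of the interval arithmetic, assuming each held request reserves the next free slot so that concurrent requests queue up one interval apart. IntervalLimiter, Reserve, and demoDelays are hypothetical names, and passing the clock in as an argument is an illustration choice that keeps the example deterministic; the real middleware would sleep for the returned delay.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// IntervalLimiter spaces requests at least one interval apart (hypothetical type).
type IntervalLimiter struct {
	mu       sync.Mutex
	interval time.Duration
	next     time.Time // earliest time the next request may be sent
}

// NewIntervalLimiter converts requests per minute into an interval (60 / rpm seconds).
// rpm must be positive; per the document, a limit of 0 means no limiter is registered.
func NewIntervalLimiter(rpm int) *IntervalLimiter {
	return &IntervalLimiter{interval: time.Minute / time.Duration(rpm)}
}

// Reserve returns how long a request arriving at now must be held.
func (l *IntervalLimiter) Reserve(now time.Time) time.Duration {
	l.mu.Lock()
	defer l.mu.Unlock()
	if now.Before(l.next) {
		delay := l.next.Sub(now)
		l.next = l.next.Add(l.interval) // the slot after this one
		return delay
	}
	l.next = now.Add(l.interval)
	return 0
}

// demoDelays feeds four arrivals into a 60 rpm limiter and collects the delays.
func demoDelays() []time.Duration {
	l := NewIntervalLimiter(60)
	t0 := time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)
	var out []time.Duration
	out = append(out, l.Reserve(t0))
	out = append(out, l.Reserve(t0))
	out = append(out, l.Reserve(t0.Add(500*time.Millisecond)))
	out = append(out, l.Reserve(t0.Add(10*time.Second)))
	return out
}

func main() {
	fmt.Println(demoDelays()) // [0s 1s 1.5s 0s]
}
```

Requests are never rejected in this model; they are only delayed, which matches the hold/sleep behavior described above.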

Statistics & Logging

  1. Stats Granularity: Per-rule (matches rule definitions, not unique URLs)
  2. Access Log Format: Fixed Apache/Nginx-style text format (not JSON)
  3. Log Rotation: Size-based via lumberjack (gopkg.in/natefinch/lumberjack.v2), configurable max size (MB), max age (days), max backups

WebUI Real-Time Updates

  1. Technology: Server-Sent Events (SSE) for pending/rate-limited request updates
  2. Efficiency: SSE works well with htmx, provides instant updates

Phase 6 – Dashboard (Step 1)

  1. Public Dashboard: /, /api/dashboard/stream, and /download-cert are public endpoints (no authentication required). They expose only safe operational status — no access-control configuration, no whitelist/blacklist rules, no rule counts.
  2. Dashboard Content: Uptime, CA cert subject + expiry + download link, live counters (total processed requests, pending requests, rate-limited requests). Global rate-limit setting shown as static info. Nothing that reveals the proxy's access-control configuration.
  3. WebUI Stack: templ (type-safe HTML templates) + htmx v4.0.0-beta1 + Pico CSS. All static assets vendored under internal/webui/static/ and embedded into the binary via embed.FS (//go:embed static) — no external files required at runtime (single self-contained binary).
  4. htmx v4 SSE Pattern: Extension registers globally on script load (no hx-ext="sse" needed). Connect with hx-sse:connect="/url". Unnamed SSE messages (no event: line in server response) auto-swap the element's content via hx-swap. Named events are dispatched as DOM events, not auto-swapped.
  5. SSE Push Interval: 1 second. On client disconnect the server goroutine exits via r.Context().Done().
  6. Static Asset Serving: http.FileServerFS(staticFiles) (Go 1.22+) serves the embedded FS under /static/. No fs.Sub or http.StripPrefix is needed: the embedded tree keeps its static/ prefix, so a request for /static/x resolves directly to static/x in the FS.
  7. ProxyMetrics Interface: Defined in internal/webui/server.go (consumer side — idiomatic Go "accept interfaces, return structs"). *proxy.Proxy satisfies it via three new methods: RequestCount() uint64, RateLimitedCount() int, PendingCount() int.
  8. Cert Download Safety: /download-cert re-encodes *x509.Certificate to PEM from memory (never reads a file). This guarantees the endpoint never accidentally serves a combined cert+key file even if the proxy was started with one.
  9. Merged Status Block: The two previously separate dashboard blocks ("Status" and "Live Stats") are merged into a single "Status" block. The "● Running" badge is removed: the WebUI is not detachable from the proxy process, so if the page is reachable the proxy is running, and the badge would always be true and carry no information. The StatsFragment template is renamed StatusFragment and extended to include live uptime as the first row of the <dl> alongside the request counters. The SSE push interval is 1 second for responsive uptime and counter updates. The SSE endpoint pushes the full status (uptime + counters) on every tick, keeping the entire block live-updated. The #live-stats SSE target element is unchanged; only the fragment content expands.
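The tick-and-push SSE pattern from the decisions above can be sketched as follows. This is a minimal illustration, not the project's handler: formatSSE and streamHandler are hypothetical names, and the render callback stands in for the templ fragment. Messages are unnamed (no event: line), so per the htmx v4 note above they would be swapped into the target element.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
	"time"
)

// formatSSE renders an unnamed SSE message: one "data:" line per payload
// line, terminated by a blank line.
func formatSSE(payload string) string {
	var b strings.Builder
	for _, line := range strings.Split(payload, "\n") {
		b.WriteString("data: ")
		b.WriteString(line)
		b.WriteString("\n")
	}
	b.WriteString("\n")
	return b.String()
}

// streamHandler pushes render() immediately and then once per interval,
// until the client disconnects.
func streamHandler(interval time.Duration, render func() string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")

		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			fmt.Fprint(w, formatSSE(render()))
			flusher.Flush()
			select {
			case <-r.Context().Done():
				return // client disconnected; the handler goroutine exits
			case <-ticker.C:
			}
		}
	}
}

func main() {
	fmt.Print(formatSSE("<dt>Uptime</dt>\n<dd>1m</dd>"))
	// Constructing the handler only; wiring it to a mux is left out here.
	_ = streamHandler(time.Second, func() string { return "<dd>ok</dd>" })
}
```

Splitting the payload into one data: line per source line matters because a raw newline inside a single data: field would end the field early.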

Phase 6 – Auth (Step 2: Login/Logout)

  1. Login Form: Password Only: The /login page renders a single <input type="password"> field with label "Admin password" and a submit button. No username field. Page title: "Login".
  2. 1-Second Minimum Response Time: Every POST /login handler records start := time.Now() as its first statement. Before writing any response it calls time.Sleep(time.Until(start.Add(time.Second))). Applies to all outcomes: wrong password, correct password, disabled auth. Because every outcome takes at least one second, response time no longer serves as a practical timing oracle.
  3. Constant-Time Secret Comparison: Password compared using crypto/subtle.ConstantTimeCompare([]byte(submitted), []byte(secret)) == 1. Length differences that would short-circuit ConstantTimeCompare are mitigated by the hard 1-second floor (Decision 38).
  4. Session Token Format: 32 bytes from crypto/rand, encoded as lowercase hex (64 chars). Stored in cookie named aiproxy_session with flags: HttpOnly, SameSite=Strict, Path=/. Secure flag is not set (WebUI is container-local, typically accessed over plain HTTP). MaxAge = 86400 (24 hours).
  5. Single-Session Store: Only one session may be active at any time. SessionStore holds a single *Session (nil when no session) protected by sync.RWMutex. Create() atomically replaces any existing session — a new login immediately invalidates the previous session. Validate() returns a ValidateResult enum with three values: SessionValid, SessionInvalid (expired, no session, or server restart), SessionKicked (a different non-expired session is currently active — intrusion signal). SessionKicked is only returned when the existing session is still active but has a different token, allowing the UI to display "Session expired or logged out from another location." Sessions are not persisted — a restart invalidates all sessions. No background goroutines or cleanup needed. API: Create() (string, error), Validate(token string) ValidateResult, Delete(token string).
  6. Auth Middleware: authMiddleware(store *SessionStore, next http.Handler) http.Handler. Reads aiproxy_session cookie; calls store.Validate(token). On SessionValid: call next handler. On SessionKicked: redirect to /login?msg=kicked (intrusion signal — a different session is active). On SessionInvalid: redirect to /login. No ?next= parameter — successful login always redirects to /.
  7. Public vs Protected Routes:
    • Public (no auth): GET /, GET /api/dashboard/stream, GET /download-cert, GET /static/, GET /login, POST /login
    • Protected (auth required): GET /logout, GET /pending, GET /api/pending/stream
  8. Empty Admin Secret Behaviour: When --admin-secret is empty, POST /login always returns failure after the mandatory 1-second delay with message "Authentication disabled — no admin secret configured". GET /login shows a notice: "Admin access is disabled. Start the proxy with --admin-secret to enable login."
  9. Login Redirect: Successful POST /login always redirects to /. No ?next parameter — the redirect target is unconditional.
  10. Logout: GET /logout (wrapped by authMiddleware) removes token from session store, clears cookie (MaxAge=-1), redirects to /.
  11. Navigation Bar: Base layout signature changes from Base(title string) to Base(title string, nav NavData) where NavData is struct { IsAuthenticated bool; AuthEnabled bool } (AuthEnabled = --admin-secret != ""). Nav links in <nav> inside <header>: Dashboard (→ /, always shown), Pending (→ /pending, always shown), Login (→ /login, shown when !IsAuthenticated && AuthEnabled), Logout (→ /logout, shown when IsAuthenticated).
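The single-session store described in decision 5 can be sketched as below. This is an illustration under stated assumptions: NewSessionStore and the sessionTTL constant are hypothetical (the document only names Create, Validate, and Delete), the 24-hour lifetime is taken from the cookie MaxAge, and treating an empty token as SessionInvalid is a guess about how a missing cookie is handled.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
	"sync"
	"time"
)

type ValidateResult int

const (
	SessionInvalid ValidateResult = iota
	SessionValid
	SessionKicked
)

// sessionTTL mirrors the documented cookie MaxAge of 86400 seconds (assumption).
const sessionTTL = 24 * time.Hour

type Session struct {
	Token   string
	Expires time.Time
}

// SessionStore holds at most one active session.
type SessionStore struct {
	mu      sync.RWMutex
	current *Session
}

func NewSessionStore() *SessionStore { return &SessionStore{} }

// Create replaces any existing session with a fresh 32-byte random token
// encoded as 64 lowercase hex characters.
func (s *SessionStore) Create() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	token := hex.EncodeToString(buf)
	s.mu.Lock()
	s.current = &Session{Token: token, Expires: time.Now().Add(sessionTTL)}
	s.mu.Unlock()
	return token, nil
}

// Validate classifies a token against the single active session.
func (s *SessionStore) Validate(token string) ValidateResult {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if token == "" || s.current == nil || time.Now().After(s.current.Expires) {
		return SessionInvalid
	}
	if subtle.ConstantTimeCompare([]byte(token), []byte(s.current.Token)) == 1 {
		return SessionValid
	}
	return SessionKicked // a different, still-active session exists
}

// Delete clears the session only if the token matches the active one.
func (s *SessionStore) Delete(token string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.current != nil && s.current.Token == token {
		s.current = nil
	}
}

func main() {
	store := NewSessionStore()
	first, err := store.Create()
	if err != nil {
		panic(err)
	}
	second, err := store.Create() // a new login replaces the first session
	if err != nil {
		panic(err)
	}
	fmt.Println(store.Validate(first) == SessionKicked, store.Validate(second) == SessionValid)
}
```

Holding a single pointer instead of a map is what removes the need for background cleanup: an expired session is simply reported as invalid and overwritten by the next login.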

Phase 6 – Pending View (Step 3)

  1. Pending Page Route: GET /pending (auth required). Full page using Base layout, title "Pending Requests". Contains a <table> with <thead> and <tbody> as the SSE live-update target.
  2. PendingSource Interface: Defined in internal/webui/handlers/pending.go (consumer side). handlers imports internal/pending directly; no import cycle exists (pending → nothing; proxy → pending; handlers → pending; main → proxy, webui). The interface uses *pending.Entry directly, eliminating a duplicate DTO type:
    import "github.com/cloudcopper/aiproxy/internal/pending"
    
    type PendingSource interface {
        PendingItems() []*pending.Entry
    }
    *proxy.Proxy satisfies this interface via its PendingItems() []*pending.Entry method. No intermediate DTO struct or adapter is needed.
  3. Pending Table Columns: Method | URL | Waiters | Elapsed | Remaining. Elapsed = time.Since(item.Since) rounded to seconds. Remaining = item.Timeout - time.Since(item.Since), show "expired" when ≤ 0. No action buttons in this phase — read-only display.
  4. Pending SSE Pattern: GET /api/pending/stream (auth required). Same tick-and-push pattern as dashboard SSE: 1-second interval, push full <tbody> inner HTML as an unnamed SSE event. Target: <tbody id="pending-rows" hx-sse:connect="/api/pending/stream" hx-swap="innerHTML">. When PendingItems() returns empty: renders single <tr><td colspan="5">No pending requests</td></tr>.
  5. New Files for Auth and Pending View:
    • internal/webui/auth/session.go - SessionStore and its methods
    • internal/webui/handlers/login.go - LoginHandler, LogoutHandler, authMiddleware
    • internal/webui/handlers/pending.go - PendingPageHandler, PendingSSEHandler
    • internal/webui/templates/login.templ - login page template
    • internal/webui/templates/pending.templ - pending page + tbody fragment templates
    • internal/webui/templates/layout.templ - updated to accept NavData parameter
    • internal/webui/server.go - updated to wire new routes, auth middleware, AdminSecret and PendingSource added to ServerConfig
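The Elapsed and Remaining columns from the pending table decision can be computed roughly as follows. This is a sketch: elapsedAndRemaining and demo are hypothetical helpers, the clock is passed in explicitly so the result is reproducible, and the output relies on time.Duration's default string form, which may differ from what the real template renders.

```go
package main

import (
	"fmt"
	"time"
)

// elapsedAndRemaining formats the two time columns for one pending entry.
// Remaining is reported as "expired" once the timeout has been reached.
func elapsedAndRemaining(since time.Time, timeout time.Duration, now time.Time) (string, string) {
	elapsed := now.Sub(since)
	remaining := timeout - elapsed
	if remaining <= 0 {
		return elapsed.Round(time.Second).String(), "expired"
	}
	return elapsed.Round(time.Second).String(), remaining.Round(time.Second).String()
}

// demo builds an entry that started elapsedSec seconds ago.
func demo(elapsedSec, timeoutSec int) (string, string) {
	now := time.Now()
	since := now.Add(-time.Duration(elapsedSec) * time.Second)
	return elapsedAndRemaining(since, time.Duration(timeoutSec)*time.Second, now)
}

func main() {
	fmt.Println(demo(45, 120))  // 45s 1m15s
	fmt.Println(demo(130, 120)) // 2m10s expired
}
```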

Phase 3 – Pending Queue

  1. Phase 3 Scope (Timeout-Only, No WebUI, No Persistence): Phase 3 implements the hold-and-timeout path only. Admin approval/denial (adding to whitelist2/blacklist2), SSE notifications, and persistence to pending.json are deferred to later phases. Unknown requests — those not matched by the blacklist and not matched by the whitelist — are held silently in memory until --pending-timeout (default 120s) expires, then rejected with HTTP 403. The pending queue is always created regardless of whitelist configuration or --pending-timeout value. --pending-timeout 0 means immediate rejection (zero wait) — NOT pass-through. There is no configuration that makes the proxy pass unclassified requests to the upstream; unclassified always means pending → rejected. State is lost on proxy restart (acceptable for Phase 3).

  2. internal/pending Package API:

    • EntryStatus enum starts at 1 (zero value = invalid/unknown): StatusPending = 1, StatusExpired = 2
    • Entry struct exported fields: ID string, Method string, URL string, Since time.Time, Timeout time.Duration; unexported: done chan struct{} (closed when resolved). No ClientIP — the entry represents N deduplicated callers from potentially N different IPs; storing any one IP would be arbitrary and misleading.
    • Queue struct: mu sync.Mutex (protects active), timeout time.Duration, active map[string]*Entry (key = dedup key), nextID atomic.Uint64
    • NewQueue(timeout time.Duration) *Queue — creates an empty in-memory queue; no I/O
    • Queue.Hold(ctx context.Context, method, url string) bool — blocks until entry is resolved or client disconnects; always returns false; caller must issue 403
    • Queue.ActiveCount() int — length of active map, thread-safe
    • Queue.ActiveEntries() []*Entry — snapshot copy of active entries (for Phase 6 WebUI)
  3. Deduplication Key: method + " " + url (single space, exact match, no URL normalization). Two requests are considered identical when their method and URL string are equal.

  4. Hold() Concurrency Model:

    • Acquire q.mu; look up entry by key (method + " " + url); if absent, create a new Entry (allocate done channel, set Since = time.Now(), generate ID, add to active map, start timeout goroutine — goroutine started while holding the lock so it cannot fire before the entry is published); release q.mu
    • Caller goroutine blocks: select { case <-entry.done: return false; case <-ctx.Done(): return false }
    • On ctx.Done() (client disconnected): simply return false; the timeout goroutine continues running — the entry stays alive until its deadline, allowing subsequent identical requests to join the same entry
    • Timeout goroutine uses time.NewTimer(q.timeout), never time.After (avoids timer leak)
    • On timeout: acquire q.mu; delete(q.active, key); release lock; then close entry.done (outside the lock — never hold mutex while unblocking waiters)
    • All waiters on the same entry unblock simultaneously when entry.done is closed
  5. Entry ID Format: "pnd_N" where N is a sequential uint64 from q.nextID.Add(1), consistent with the RequestID "req_N" style. ID assigned once when entry is first created.

  6. Proxy Integration:

    • Add queue *pending.Queue to Proxy struct (nil when queue disabled)
    • Add PendingTimeout time.Duration to proxy.Config (no file path — no persistence this phase)
    • NewProxy() queue-and-whitelist wiring:
      1. Always create queue: p.queue = pending.NewQueue(cfg.PendingTimeout). PendingTimeout == 0 means immediate rejection, not disabled.
      2. If whitelist.Count() > 0: register allowWhitelist handler (match → forward, no match → holdPending).
      3. If whitelist.Count() == 0: register holdPending directly as a catch-all — every non-blacklisted request is unclassified and goes straight to the pending queue. There is no code path that passes an unclassified request to the upstream. "not_whitelisted" reason string is removed; all pending-queue rejections use "blacklisted".
    • New file internal/proxy/pending.go: holdPending(req *http.Request, ctx *goproxy.ProxyCtx) (*http.Request, *http.Response) calls p.queue.Hold(req.Context(), req.Method, req.URL.String()) (blocks), then logs at WARN with reason: "pending_timeout" slog attribute and returns errorResponse(req, 403, "forbidden", "blacklisted", id), which produces an error body identical to the blacklist rejection
    • Proxy.PendingCount(): returns p.queue.ActiveCount() when p.queue != nil, else 0 (replaces stub)
    • Proxy.PendingItems(): returns p.queue.ActiveEntries() as []*pending.Entry when p.queue != nil, else nil (replaces stub); no intermediate DTO type is involved
  7. main.go Wiring: pass cfg.PendingTimeout into proxy.Config.PendingTimeout. No new CLI flags or config fields needed — --pending-timeout / AIPROXY_PENDING_TIMEOUT already exists in config.Config.

  8. Timeout Error Response: HTTP 403 with JSON body {"error": "forbidden", "reason": "blacklisted", "request_id": "req_N"}, identical to the blacklist rejection response. From the client's perspective, an expired pending request is indistinguishable from a blacklisted one. The slog WARN log entry uses "reason": "pending_timeout" as a structured attribute so operators can distinguish the two cases in logs.
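The Hold() concurrency model from item 4 can be sketched as follows. This is a simplified illustration of the Phase 3 behavior only (timeout and disconnect, no approval path); the demo helper and its timings are hypothetical, and a real test would use testing/synctest rather than sleeping.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

type Entry struct {
	ID     string
	Method string
	URL    string
	Since  time.Time
	done   chan struct{} // closed when the entry is resolved
}

type Queue struct {
	mu      sync.Mutex // protects active
	timeout time.Duration
	active  map[string]*Entry // key = method + " " + url
	nextID  atomic.Uint64
}

func NewQueue(timeout time.Duration) *Queue {
	return &Queue{timeout: timeout, active: make(map[string]*Entry)}
}

// Hold blocks until the entry times out or ctx is cancelled.
// It always returns false; the caller issues the 403.
func (q *Queue) Hold(ctx context.Context, method, url string) bool {
	key := method + " " + url
	q.mu.Lock()
	e, ok := q.active[key]
	if !ok {
		e = &Entry{
			ID:     fmt.Sprintf("pnd_%d", q.nextID.Add(1)),
			Method: method, URL: url, Since: time.Now(),
			done: make(chan struct{}),
		}
		q.active[key] = e
		go q.runTimeout(key, e) // started under the lock, after publishing
	}
	q.mu.Unlock()

	select {
	case <-e.done:
	case <-ctx.Done(): // the entry stays alive for later identical requests
	}
	return false
}

func (q *Queue) runTimeout(key string, e *Entry) {
	t := time.NewTimer(q.timeout)
	defer t.Stop()
	<-t.C
	q.mu.Lock()
	delete(q.active, key)
	q.mu.Unlock()
	close(e.done) // outside the lock; unblocks every waiter at once
}

func (q *Queue) ActiveCount() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.active)
}

// demo holds n identical requests and reports the entry count while they
// are blocked and after all of them have returned.
func demo(n, timeoutMs int) (during, after int) {
	q := NewQueue(time.Duration(timeoutMs) * time.Millisecond)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			q.Hold(context.Background(), "GET", "https://example.com/")
		}()
	}
	time.Sleep(time.Duration(timeoutMs/2) * time.Millisecond) // let callers register
	during = q.ActiveCount()
	wg.Wait()
	return during, q.ActiveCount()
}

func main() {
	fmt.Println(demo(3, 200))
}
```

Closing the channel rather than sending on it is what lets any number of deduplicated waiters wake up from a single timeout.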

Phase 3 – WaitersCount & WebUI Wiring (Step 4)

  1. WaitersCount Tracking in Entry: Add unexported waiters atomic.Int64 to Entry. Export via Entry.Waiters() int (returns int(e.waiters.Load())). In Hold(), immediately after the lock is released and before the blocking select: call entry.waiters.Add(1) and defer entry.waiters.Add(-1). The decrement fires atomically when Hold() returns for any reason (timeout or context cancellation). This gives an accurate real-time count of goroutines currently blocked in Hold() for a given entry.

  2. Proxy.PendingItems() Method: *proxy.Proxy exposes PendingItems() []*pending.Entry in internal/proxy/proxy.go alongside the existing PendingCount() method. Returns p.queue.ActiveEntries() when p.queue != nil, else nil. Return type is []*pending.Entry — a type the proxy package already owns — so proxy has no dependency on webui packages. *proxy.Proxy structurally satisfies handlers.PendingSource without any adapter.

  3. Direct wiring in main.go: webuiCfg.Pending = proxyServer. *proxy.Proxy directly satisfies handlers.PendingSource via structural typing. main.go does not import internal/webui/handlers. The assignment itself acts as a compile-time interface satisfaction check: the build fails there if *proxy.Proxy no longer implements the interface. No pendingAdapter type.

  4. Tests for WaitersCount: Add to internal/pending/queue_hold_test.go:

    • TestHold_waitersCount: single goroutine → Waiters() is 1 while blocked, two goroutines → Waiters() is 2; both drop to 0 after timeout.
    • TestHold_waitersCountDecrement_onContextCancel: two goroutines on same entry, cancel one context → Waiters() drops from 2 to 1; entry stays alive; after timeout → 0. Both tests use synctest.Test for deterministic timer behavior.
  5. Testing:

    • Unit tests in internal/pending/ (same package, white-box):
      • queue_hold_test.go: Hold() returns false after timeout; Hold() returns false on context cancel; dedup — two concurrent goroutines with identical method+URL share one Entry and both unblock simultaneously on timeout
    • Use testing/synctest (Go 1.25 synctest.Test) for deterministic timer-based tests
    • Use goleak.VerifyTestMain to detect goroutine leaks in the pending package
    • Integration test internal/integration_tests/proxy_pending_test.go (build tag integration): start real proxy with a whitelist rule that does NOT match the test request; verify response is HTTP 403 with {"error":"forbidden","reason":"blacklisted",...} after the configured timeout; verify dedup (two concurrent clients, same request, share one queue entry, both receive 403)
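The waiter counting from item 1 of this step can be sketched in isolation. This is a reduced illustration: the wait method stands in for the blocking tail of Hold(), and demo and waitFor are hypothetical helpers that poll instead of using testing/synctest.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

type Entry struct {
	done    chan struct{}
	waiters atomic.Int64
}

// Waiters reports how many goroutines are currently blocked on this entry.
func (e *Entry) Waiters() int { return int(e.waiters.Load()) }

// wait stands in for the blocking part of Hold(): increment, block, and
// decrement on every return path via defer.
func (e *Entry) wait(ctx context.Context) {
	e.waiters.Add(1)
	defer e.waiters.Add(-1)
	select {
	case <-e.done:
	case <-ctx.Done():
	}
}

// waitFor polls until the waiter count reaches n.
func waitFor(e *Entry, n int) {
	for e.Waiters() != n {
		time.Sleep(time.Millisecond)
	}
}

// demo returns the waiter count with two blocked callers, after one of
// them is cancelled, and after the entry is resolved.
func demo() (during, afterCancel, afterClose int) {
	e := &Entry{done: make(chan struct{})}
	ctx, cancel := context.WithCancel(context.Background())
	finished := make(chan struct{}, 2)

	go func() { e.wait(ctx); finished <- struct{}{} }()
	go func() { e.wait(context.Background()); finished <- struct{}{} }()

	waitFor(e, 2)
	during = e.Waiters()

	cancel() // only the first caller observes this
	<-finished
	afterCancel = e.Waiters()

	close(e.done) // resolves the entry for the remaining caller
	<-finished
	afterClose = e.Waiters()
	return during, afterCancel, afterClose
}

func main() {
	fmt.Println(demo()) // 2 1 0
}
```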

Pending Queue Re-evaluation on Rule Change

D-REEVALUATE-1. Problem: When a rule is added to the whitelist or blacklist at runtime (via WebUI or p.Whitelist().Add() / p.Blacklist().Add()), pending requests that match the new rule continue waiting in the queue until their timeout fires. This is wrong: a whitelist addition should immediately forward matching pending requests; a blacklist addition should immediately reject them.

D-REEVALUATE-2. Resolution type in pending package: Hold() return type changes from bool (always false) to Resolution, a named type with four values:

  • ResolutionTimeout — timer fired (zero value)
  • ResolutionApproved — resolved via whitelist match
  • ResolutionDenied — resolved via blacklist match
  • ResolutionDisconnected — client context cancelled

D-REEVALUATE-3. Entry.resolution atomic.Int32 field: Set before closing entry.done. runTimeout stores ResolutionTimeout; Resolve() stores the caller-provided value. Hold() reads it after <-entry.done and returns it.

D-REEVALUATE-4. Queue.Resolve(method, url string, r Resolution) bool: Acquires lock, finds entry by key, deletes from active map, releases lock, stores resolution, closes entry.done. Returns false if entry not found (already resolved or never existed). Race-safe: runTimeout also deletes from the active map under the lock first — whichever of Resolve or runTimeout runs first wins the deletion; the other finds the key absent and becomes a no-op. The done channel is never closed twice.

D-REEVALUATE-5. runTimeout race fix: Before closing done, runTimeout must verify the entry is still in the active map (i.e., has not been resolved by Resolve). Pattern:

q.mu.Lock()
_, stillActive := q.active[key]
if stillActive {
    delete(q.active, key)
}
q.mu.Unlock()
if stillActive {
    entry.resolution.Store(int32(ResolutionTimeout))
    close(entry.done)
}
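A self-contained sketch of how Resolve and runTimeout can share the same "delete under the lock, then close" step so that done is closed exactly once. The finish helper is a hypothetical refactoring for illustration; the document describes the two paths separately. The demo runs Resolve before the timer fires and then lets the timeout run as a no-op.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

type Resolution int32

const (
	ResolutionTimeout Resolution = iota // zero value
	ResolutionApproved
	ResolutionDenied
	ResolutionDisconnected
)

type Entry struct {
	done       chan struct{}
	resolution atomic.Int32
}

type Queue struct {
	mu     sync.Mutex
	active map[string]*Entry
}

// finish removes key from the active map and, only if it was still present,
// records r and closes done. Whoever deletes the key first wins.
func (q *Queue) finish(key string, r Resolution) bool {
	q.mu.Lock()
	e, ok := q.active[key]
	if ok {
		delete(q.active, key)
	}
	q.mu.Unlock()
	if !ok {
		return false
	}
	e.resolution.Store(int32(r)) // set before close so waiters can read it
	close(e.done)
	return true
}

// Resolve resolves an entry by method and URL; false means not found.
func (q *Queue) Resolve(method, url string, r Resolution) bool {
	return q.finish(method+" "+url, r)
}

func (q *Queue) runTimeout(key string, d time.Duration) {
	t := time.NewTimer(d)
	defer t.Stop()
	<-t.C
	q.finish(key, ResolutionTimeout)
}

// demo resolves an entry twice and then lets its timeout fire.
func demo() (first, second bool, got Resolution) {
	q := &Queue{active: map[string]*Entry{}}
	e := &Entry{done: make(chan struct{})}
	key := "GET https://example.com/"
	q.active[key] = e

	timedOut := make(chan struct{})
	go func() { q.runTimeout(key, 20*time.Millisecond); close(timedOut) }()

	first = q.Resolve("GET", "https://example.com/", ResolutionApproved)
	second = q.Resolve("GET", "https://example.com/", ResolutionDenied)
	<-timedOut // a double close here would panic
	<-e.done
	return first, second, Resolution(e.resolution.Load())
}

func main() {
	fmt.Println(demo())
}
```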

D-REEVALUATE-6. holdPending change: After q.Hold() returns, switch on the Resolution:

  • ResolutionApproved → log INFO "pending request approved", return (req, nil) (goproxy forwards to upstream)
  • ResolutionDenied or ResolutionTimeout → log WARN, return (req, errorResponse(403, "forbidden", "blacklisted", id))
  • ResolutionDisconnected → log DEBUG "pending request client disconnected", return (req, errorResponse(403, "forbidden", "blacklisted", id))

D-REEVALUATE-7. Proxy.ReevaluatePending() method: Iterates p.queue.ActiveEntries(), checks each entry against the live blacklist (first) and whitelist (second), calls p.queue.Resolve() for entries that match. Entries matched by blacklist → ResolutionDenied; entries matched by whitelist → ResolutionApproved; unmatched entries → left alone. This method is called by the WebUI after adding a rule to either list. To construct an *http.Request for matching, use http.NewRequest(e.Method, e.URL, nil) (ignore the error — URL was already parsed when the original request arrived).
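The re-evaluation loop might look roughly like this. It is a sketch with hypothetical stand-ins: matcher replaces the live blacklist and whitelist stores, PendingEntry replaces *pending.Entry, and the resolve callback replaces p.queue.Resolve. Only the ordering (blacklist first, whitelist second, unmatched left alone) is taken from the text.

```go
package main

import (
	"fmt"
	"strings"
)

type Resolution int

const (
	ResolutionTimeout Resolution = iota
	ResolutionApproved
	ResolutionDenied
)

// PendingEntry is a hypothetical stand-in for *pending.Entry.
type PendingEntry struct{ Method, URL string }

// matcher is a hypothetical stand-in for "does any rule in this list match".
type matcher func(method, url string) bool

// reevaluate checks each entry against the blacklist first and the whitelist
// second, and resolves the entries that match. It returns how many entries
// were actually resolved.
func reevaluate(entries []PendingEntry, black, white matcher,
	resolve func(method, url string, r Resolution) bool) int {
	resolved := 0
	for _, e := range entries {
		switch {
		case black(e.Method, e.URL):
			if resolve(e.Method, e.URL, ResolutionDenied) {
				resolved++
			}
		case white(e.Method, e.URL):
			if resolve(e.Method, e.URL, ResolutionApproved) {
				resolved++
			}
		}
		// Unmatched entries are left alone and keep waiting for their timeout.
	}
	return resolved
}

// demo runs reevaluate over three entries and records each resolution.
func demo() map[string]Resolution {
	entries := []PendingEntry{
		{"GET", "https://api.example.com/v1/models"},
		{"POST", "https://evil.example.net/upload"},
		{"GET", "https://unknown.example.org/"},
	}
	black := func(m, u string) bool { return strings.HasPrefix(u, "https://evil.example.net/") }
	white := func(m, u string) bool {
		return m == "GET" && strings.HasPrefix(u, "https://api.example.com/")
	}
	out := map[string]Resolution{}
	reevaluate(entries, black, white, func(m, u string, r Resolution) bool {
		out[m+" "+u] = r
		return true
	})
	return out
}

func main() {
	fmt.Println(len(demo()), "entries resolved")
}
```

Since Resolve returns false for an entry that has already timed out, counting only successful resolutions keeps the result honest when re-evaluation races with a timeout.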

D-REEVALUATE-8. Caller responsibility: Proxy.ReevaluatePending() must be called after any rule addition (whitelist or blacklist). In the WebUI, the add-rule HTTP handler calls p.ReevaluatePending() after store.Add(rule). In integration tests, the test calls it directly after adding a rule to the store. There is no automatic/background re-evaluation loop — re-evaluation is explicitly triggered.

D-REEVALUATE-9. Unit tests in internal/pending/: Add test TestResolve_approved and TestResolve_denied (using synctest.Test) to verify that Resolve unblocks waiters immediately and Hold returns the correct resolution. Add TestResolve_notFound verifying Resolve returns false for an unknown key. Update all existing tests that check Hold() == false to compare against ResolutionTimeout (or keep as-is if the Resolution type has a helper method).

D-REEVALUATE-10. Integration tests in internal/integration_tests/proxy_pending_reevaluate_test.go:

  • TestProxy_PendingApprovedByWhitelist: no-rules proxy, client request held in pending, add whitelist rule + call ReevaluatePending(), request completes with HTTP 200, pending queue is empty.
  • TestProxy_PendingDeniedByBlacklist: no-rules proxy, client request held in pending, add blacklist rule + call ReevaluatePending(), request completes with HTTP 403 (reason: "blacklisted"), pending queue is empty.

Phase 6 – Rules UI

  1. RulesSource Interface: Defined in internal/webui/handlers/rules.go (consumer side). Returns the live merged stores directly so handlers can share a single implementation for both whitelist and blacklist:

    type RulesSource interface {
        Whitelist() *reqrules.ReqRules
        Blacklist() *reqrules.ReqRules
    }

    *proxy.Proxy satisfies this via two new methods Whitelist() and Blacklist() that return p.whitelist and p.blacklist. Defined in internal/proxy/proxy.go. webui.ServerConfig gains a Rules RulesSource field (nil → nullRulesSource{} stub with empty stores).

  2. No Persistence (in-memory only): Add/delete operations mutate the live *reqrules.ReqRules in memory only. Changes are lost on restart. Persistence (rules.Save) is deferred to TODO. Static rule files (whitelist.json, blacklist.json) are never modified by the WebUI.

  3. Single /rules Page, Two Sections: One page with two <article> blocks rendered in processing order — Blacklist Rules first, Whitelist Rules second — matching the proxy request flow (blacklist checked before whitelist). Routes: GET /rules (full page, auth required), POST /api/rules/whitelist, DELETE /api/rules/whitelist/{id}, POST /api/rules/blacklist, DELETE /api/rules/blacklist/{id} (all auth required).

  4. htmx Partial Tbody Swaps (No SSE): Rules change only on explicit admin action — no SSE needed. Each table section has a <tbody id="{listType}-rows">. Add and delete handlers return a fresh <tbody> fragment (hx-swap="outerHTML"). The fragment always includes all current rule rows plus the empty add-form row.

  5. User-Provided Rule ID (Mandatory): The add-form row has an id text input that starts empty. Validation: non-empty, passes reqrules.Rule.Validate(), no duplicate ID in the store. Server returns HTTP 422 with the tbody fragment (form row pre-populated with submitted values + inline error message) when validation fails.

  6. Add-Form Row Inside <tbody>: The add-row <tr id="{listType}-add-row"> lives inside the <tbody>. It contains the add form with fields: id (text, required), method (select: blank + GET/POST/PUT/DELETE/PATCH/HEAD/OPTIONS), scheme (select: blank + http/https), host (text), path (text), comment (text). Blank selects = omit field (match any). Submit button label: "Add". The form posts to /api/rules/{listType} with hx-target="#{listType}-rows" hx-swap="outerHTML".

  7. Vanilla JS Row Repositioning: A single <script> block on the rules page attaches a delegated input listener to document. When the id input inside an add-row changes, the listener moves the <tr> to its correct sorted position among rows with data-rule-id attributes. If the typed ID is empty, the add-row stays at the bottom. After htmx swaps the tbody (add/delete), the add-row is reset (included fresh in the returned fragment) and the listener automatically applies to the new DOM because it is delegated on document.

  8. Delete with hx-confirm: Runtime rule rows (Rule.Runtime == true) render a Delete button with hx-delete="/api/rules/{listType}/{id}" hx-confirm="Delete rule '{id}'?" hx-target="#{listType}-rows" hx-swap="outerHTML". Static rule rows (Rule.Runtime == false) render a read-only badge ("static") instead of a delete button — they cannot be deleted via WebUI.

  9. Rules Nav Link: <li><a href="/rules">Rules</a></li> added to the nav bar between "Pending" and the Login/Logout block. Always shown (same as Dashboard and Pending). Clicking while unauthenticated redirects to /login; consistent with the auth middleware decision, no ?next= parameter is used, so a successful login lands on /.

  10. New Files for Rules UI:

    • internal/webui/handlers/rules.go - RulesSource interface, RulesConfig, NewRulesPageHandler, NewRulesAddHandler, NewRulesDeleteHandler; shared rulesTableBodyData(store *reqrules.ReqRules) []RuleRowData helper used by both whitelist and blacklist handlers
    • internal/webui/templates/rules.templ - RulesPage(data RulesPageData), RulesSectionFragment(section RulesSectionData) (returns <tbody>), RuleRow(r RuleRowData, listType string), RulesAddRow(listType string, vals RuleFormValues, errMsg string)
    • internal/webui/templates/layout.templ - add Rules nav link
    • internal/webui/server.go - add Rules RulesSource to ServerConfig; register 5 new routes; wire nullRulesSource{} fallback
    • internal/proxy/proxy.go - add Whitelist() *reqrules.ReqRules and Blacklist() *reqrules.ReqRules methods
    • cmd/aiproxy/main.go - webuiCfg.Rules = proxyServer
  11. Rule.Priority int Field: Added to reqrules.Rule with json:"priority,omitempty". Default 0. Validation: priority >= 0 (negative values rejected). Sort order in ReqRules changes from (id ASC) to (priority ASC, id ASC) — lower number = higher priority = checked first; ties broken by ID lexicographic order. Backward compatible: existing JSON files without priority field get 0, preserving current ID-only ordering for all rules that do not set it. The field appears in the add-rule form (number input, min 0, default 0) and in the rule table as a column. The inline JS repositioning for the add-row updates to sort by (data-rule-priority ASC, data-rule-id ASC) and fires on changes to either the ID input or the priority input (both carry data-reposition).

  12. Edit Rule UI — in-place JS transform: Runtime rule rows support in-place editing with no /edit endpoint — only PUT /api/rules/{listType}/{id}. Each runtime <tr> carries all field values as data-rule-* attributes (data-rule-priority, data-rule-method, data-rule-scheme, data-rule-host, data-rule-path, data-rule-comment, data-list-type). Entering edit mode: startEdit(btn) caches the actions cell HTML in tr.dataset.savedActions, replaces each <td> content with an appropriate input/select pre-filled from data-*, and adds class .editing. Mutual exclusion: startEdit cancels any open .editing row first. Cancelling: cancelEdit(btn) restores cells from data-* attributes, restores actions cell from savedActions, calls htmx.process(cells[7]) to re-register delete-button htmx attributes. Saving: saveEdit(btn) gathers all named inputs via URLSearchParams, calls fetch() PUT. 200 OK → server returns full <tbody> (correct sort order); JS does tbody.outerHTML = html then htmx.process(newTbody). 422 → server returns plain-text error; JS appends inline <small class="edit-error"> to the ID cell (always in viewport). All edit JS (delegated click listener, startEdit, cancelEdit, saveEdit, helpers sel, esc) lives in one <script> block in RulesPage. ID is read-only during edit. New routes: PUT /api/rules/whitelist/{id} and PUT /api/rules/blacklist/{id} (auth required).
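The (priority ASC, id ASC) ordering from the Rule.Priority decision can be sketched as below. The Rule type is reduced to the two sort keys, and sortRules and sortedIDs are hypothetical helpers; the real ReqRules store presumably re-sorts inside Add().

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Rule is reduced to the two fields that determine evaluation order.
type Rule struct {
	ID       string
	Priority int // lower number = checked first; default 0
}

// sortRules orders rules by priority ascending, then by ID lexicographically.
func sortRules(rules []Rule) {
	sort.Slice(rules, func(i, j int) bool {
		if rules[i].Priority != rules[j].Priority {
			return rules[i].Priority < rules[j].Priority
		}
		return rules[i].ID < rules[j].ID
	})
}

// sortedIDs returns the IDs in evaluation order without mutating the input.
func sortedIDs(rules []Rule) string {
	sorted := append([]Rule(nil), rules...)
	sortRules(sorted)
	ids := make([]string, len(sorted))
	for i, r := range sorted {
		ids[i] = r.ID
	}
	return strings.Join(ids, ",")
}

func main() {
	rules := []Rule{
		{ID: "zeta"},
		{ID: "alpha", Priority: 5},
		{ID: "beta"},
		{ID: "aaa", Priority: 1},
	}
	fmt.Println(sortedIDs(rules)) // beta,zeta,aaa,alpha
}
```

Rules that leave priority unset all share 0 and therefore keep the earlier ID-only order among themselves, which is the backward-compatibility property described above.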

Streaming & Memory

  1. Request/Response Processing: Streaming with io.Copy, minimal buffering
  2. Size Limits: None (rely on timeouts for protection, not artificial limits)

Files at Startup

  1. Missing Files: Not critical - proxy continues without user-managed JSON files (secure default: deny all)
  2. Auto-Generation: Only certificate files auto-generated if missing

CLI Configuration

  1. Library Choice: go-flags over Cobra+Viper for minimal dependencies
  2. Rationale: See BEST_PRACTICES.md "CLI Configuration Library Choice" section for detailed rationale and trade-offs
  3. --version flag: Handled in init() in cmd/aiproxy/version.go before go-flags runs. init() scans os.Args for --version, prints version info, and calls os.Exit(0). A dummy Version bool field with long:"version" is added to the Config struct solely so go-flags includes --version in --help output — it is never actually read. Build-time values (Version, Commit, BuildDate) are injected via -ldflags by make build; defaults are "dev"/"unknown" for plain go build.
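A minimal sketch of the early --version check, assuming an exact match on the argument. hasVersionFlag is a hypothetical helper, the output format is invented for illustration, and the mapping of the "dev"/"unknown" defaults onto the three variables is a guess.

```go
package main

import (
	"fmt"
	"os"
)

// Build-time values, normally injected via -ldflags. The defaults below are
// an assumed reading of the documented "dev"/"unknown" fallbacks.
var (
	Version   = "dev"
	Commit    = "unknown"
	BuildDate = "unknown"
)

// hasVersionFlag reports whether --version appears anywhere in args.
func hasVersionFlag(args []string) bool {
	for _, a := range args {
		if a == "--version" {
			return true
		}
	}
	return false
}

// init runs before main (and therefore before any flag parsing), so
// --version works even when other required flags are missing.
func init() {
	if hasVersionFlag(os.Args[1:]) {
		fmt.Printf("aiproxy %s (commit %s, built %s)\n", Version, Commit, BuildDate)
		os.Exit(0)
	}
}

func main() {
	fmt.Println("normal startup would parse flags here")
}
```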

Feature #2 – Runtime Rules Load & Merge

D1. Rule.Runtime bool: Exported field, tagged json:"-" (never serialized to file). false on static rules; true on runtime rules. Used by the WebUI to render editable vs read-only rows — caller iterates via Range and reads r.Runtime directly. No new methods needed on ReqRules for this purpose.

D2. Single Load function: LoadWhitelist and LoadBlacklist are identical implementations. Both are replaced by a single rules.Load(filePath string, opts ...LoadOption) (*reqrules.ReqRules, error). Call-site intent is expressed via the file path and options, not the function name. Matches the Save function signature pattern.

D3. WithRuntime() load option: Sets Runtime=true on every rule loaded from the file. Used to load whitelist2.json / blacklist2.json. Static rule loads pass no options (default Runtime=false).

D4. Single merged store: Static and runtime rules are merged into the same *reqrules.ReqRules instance. ReqRules.Add() already re-sorts by ID on every insertion, so lexicographic order across both sources is preserved automatically. The proxy's existing allowWhitelist and blockBlacklist handlers need no changes.

D5. Merge order — RO always wins: Static (read-only) rules are loaded after runtime rules. Because ReqRules.Add() is last-writer-wins by ID, static rules silently override any runtime rule with the same ID. This is the security guarantee: a compromised or misconfigured runtime file can never shadow a hardened static rule. It also supports the normal admin workflow where a runtime rule is copied (and optionally edited) into the static file — the static version takes effect on the next restart without needing to remove the rule from the runtime file first. An INFO is logged for each override so the admin knows a rule has been promoted and the runtime copy can be cleaned up.
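The merge order can be illustrated with a small last-writer-wins store. This is a sketch: Store, Get, and merge are hypothetical stand-ins for ReqRules and the load sequence, and the returned list of overridden IDs stands in for the INFO log entries.

```go
package main

import "fmt"

type Rule struct {
	ID      string
	Host    string
	Runtime bool
}

// Store is a hypothetical stand-in for ReqRules, keyed by rule ID.
type Store struct{ byID map[string]Rule }

func NewStore() *Store { return &Store{byID: make(map[string]Rule)} }

// Add is last-writer-wins by ID and reports whether it replaced a rule.
func (s *Store) Add(r Rule) (replaced bool) {
	_, replaced = s.byID[r.ID]
	s.byID[r.ID] = r
	return replaced
}

func (s *Store) Get(id string) (Rule, bool) {
	r, ok := s.byID[id]
	return r, ok
}

// merge loads runtime rules first and static rules second, so a static rule
// always overrides a runtime rule with the same ID. It returns the IDs that
// were overridden (the real code would log an INFO line for each).
func merge(static, runtime []Rule) (*Store, []string) {
	s := NewStore()
	for _, r := range runtime {
		r.Runtime = true
		s.Add(r)
	}
	var overridden []string
	for _, r := range static {
		r.Runtime = false
		if s.Add(r) {
			overridden = append(overridden, r.ID)
		}
	}
	return s, overridden
}

func main() {
	static := []Rule{{ID: "api", Host: "api.example.com"}}
	runtime := []Rule{
		{ID: "api", Host: "evil.example.net"},
		{ID: "tmp", Host: "tmp.example.org"},
	}
	store, overridden := merge(static, runtime)
	api, _ := store.Get("api")
	fmt.Println(api.Host, api.Runtime, overridden)
}
```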

D6. Save function: rules.Save(store *reqrules.ReqRules, filePath string, opts ...SaveOption) error. Collects rules from the store via Range, applies options, writes atomically (temp file in same directory + os.Rename). The directory must already exist (per DONT.md: never auto-create operational directories). This signature is superseded by D-PERSIST-3.

D7. WithRuntimeOnly() save option: Filters the Range output to only rules where r.Runtime == true. Used to persist runtime rule changes back to whitelist2.json / blacklist2.json without including static rules. Removed by D-PERSIST-3, where Save always writes only runtime rules.

D8. Missing runtime file: Not an error — Load returns an empty store. Same policy as static rule files.

D9. Invalid runtime file: Fatal startup error — Load returns an error, caller logs and exits with code 1. Same policy as static rule files.

D10. No ReqRules method additions: Del is used directly for runtime rule deletion (the WebUI handler guards editability by checking rule.Runtime before calling Del — business logic stays in the handler, not in ReqRules). No DeleteRuntime, no RuntimeRules methods added to ReqRules.

D11. No proxy changes: proxy.Config and proxy.Proxy are unchanged. The merged *reqrules.ReqRules is passed to NewProxy exactly as before. Runtime rule management (add/delete at runtime) is wired to the WebUI in feature #3.

D12. Load2 helper: Repeated load-static + load-runtime + merge pattern in main.go is extracted into rules.Load2(staticPath, rtPath string) (*reqrules.ReqRules, error). Load order: RT rules first, static rules second. Because ReqRules.Add() replaces by ID, static rules loaded second always win over any RT rule with the same ID (D5). An INFO is logged for each override. main.go no longer imports reqrules directly.
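
The load-and-merge decisions above (D3, D4, D5, D12) can be illustrated with a small self-contained sketch. `Store`, `loadInto`, and the `ID` field are hypothetical stand-ins for `reqrules.ReqRules` and `rules.Load`; the real code reads JSON files, while this sketch takes already-parsed rules so it stays runnable on its own.

```go
package main

import (
	"fmt"
	"sort"
)

// Rule is a simplified stand-in for the real rule type.
type Rule struct {
	ID      string
	Pattern string
	Runtime bool // tagged json:"-" in the real type (D1)
}

// Store is a hypothetical stand-in for reqrules.ReqRules.
type Store struct {
	rules []Rule
}

// Add is last-writer-wins by ID and keeps rules sorted by ID.
// It reports whether an existing rule was replaced.
func (s *Store) Add(r Rule) bool {
	for i := range s.rules {
		if s.rules[i].ID == r.ID {
			s.rules[i] = r
			return true
		}
	}
	s.rules = append(s.rules, r)
	sort.Slice(s.rules, func(i, j int) bool { return s.rules[i].ID < s.rules[j].ID })
	return false
}

type loadConfig struct{ runtime bool }

// LoadOption follows the functional-options pattern named in D2/D3.
type LoadOption func(*loadConfig)

// WithRuntime marks every loaded rule as a runtime rule.
func WithRuntime() LoadOption { return func(c *loadConfig) { c.runtime = true } }

// loadInto adds parsed rules to s and returns the IDs it overrode.
func loadInto(s *Store, parsed []Rule, opts ...LoadOption) []string {
	var cfg loadConfig
	for _, o := range opts {
		o(&cfg)
	}
	var overridden []string
	for _, r := range parsed {
		r.Runtime = cfg.runtime
		if s.Add(r) {
			overridden = append(overridden, r.ID)
		}
	}
	return overridden
}

func main() {
	rtRules := []Rule{{ID: "a", Pattern: "https://rt.example.com/**"}, {ID: "c", Pattern: "https://c.example.com/**"}}
	staticRules := []Rule{{ID: "a", Pattern: "https://static.example.com/**"}, {ID: "b", Pattern: "https://b.example.com/**"}}

	s := &Store{}
	loadInto(s, rtRules, WithRuntime()) // runtime first
	for _, id := range loadInto(s, staticRules) { // static second, so it wins
		fmt.Println("static rule overrides runtime rule:", id)
	}
	for _, r := range s.rules {
		fmt.Printf("%s runtime=%v %s\n", r.ID, r.Runtime, r.Pattern)
	}
}
```

Loading the runtime file first is what turns the plain replace-by-ID behavior of `Add` into the "read-only always wins" guarantee; no extra precedence logic is needed.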

Feature #2 – Runtime Rules Persistence (Write-back)

D-PERSIST-1. Filename on ReqRules: reqrules.ReqRules gains an unexported filename string field and two methods: SetFilename(path string) and Filename() string. These carry the backing file path as metadata — no file I/O lives in reqrules (the package stays pure in-memory; all I/O remains in the rules package, consistent with how loading already works). The mutex already present on ReqRules protects filename for safe concurrent reads via Filename().

D-PERSIST-2. Load2 sets the filename: After merging static and runtime rules, rules.Load2 calls store.SetFilename(rtPath) before returning. From that point on any caller with access to the store can persist runtime rules without knowing the file path — the path travels with the store, not through config structs.

D-PERSIST-3. Simplified Save signature: rules.Save(store *reqrules.ReqRules) error replaces the previous Save(store, filePath, opts...) form. The function reads store.Filename() internally and returns nil (no-op) when the filename is empty (e.g., stores created with reqrules.New() in tests). It always writes only Runtime==true rules — the dedicated write target is always the runtime file, never the static file. SaveOption, WithRuntimeOnly(), and saveConfig are removed.

D-PERSIST-4. Handler call sites: The three WebUI rule mutation handlers (Add, Edit, Delete in internal/webui/handlers/rules.go) call rules.Save(store) immediately after mutating the in-memory store. A save failure is non-fatal: the in-memory change is already live and the proxy continues operating correctly; the error is logged at ERROR level with slog so operators are aware persistence was lost.

Certificate Management

  1. Algorithm Choice: ECDSA P-256 over RSA 2048/4096
  2. Rationale: Faster handshakes, smaller keys, equivalent security to RSA 3072, modern standard
  3. Validity Period: 10 years for operational simplicity in containerized environments
  4. Validation Strategy: Strict by default (secure by default), optional --insecure-certs for relaxed mode
  5. File Permissions: 0600 for private key (critical security), 0644 for certificate (public data)

Configuration Examples

Command-Line Usage with Flags

# Basic usage with defaults
./aiproxy --admin-secret "my-secure-secret-123"

# Without admin secret (WebUI login disabled, but certificate download still works)
./aiproxy

# Custom configuration
./aiproxy \
  --admin-secret "my-secure-secret-123" \
  --listen ":9090" \
  --webui-listen ":9091" \
  --blacklist-rules "/etc/aiproxy/blacklist.json" \
  --whitelist-rules "/etc/aiproxy/whitelist.json" \
  --rt-blacklist-rules "/var/lib/aiproxy/blacklist2.json" \
  --rt-whitelist-rules "/var/lib/aiproxy/whitelist2.json" \
  --log-level "debug" \
  --log-file "/var/log/aiproxy/aiproxy.log" \
  --log-max-size 50 \
  --log-max-backups 5 \
  --global-rate-limit 100 \
  --pending-timeout "180s"

# Using combined certificate and key file
./aiproxy \
  --admin-secret "my-secure-secret-123" \
  --tls-cert "./certs/combined.pem" \
  --tls-key "./certs/combined.pem"

Environment Variables Configuration

# Set configuration via environment variables
export AIPROXY_ADMIN_SECRET="my-secure-secret-123"
export AIPROXY_LISTEN=":8080"
export AIPROXY_WEBUI_LISTEN=":8081"
export AIPROXY_BLACKLIST_RULES="/etc/aiproxy/rules/blacklist.json"
export AIPROXY_WHITELIST_RULES="/etc/aiproxy/rules/whitelist.json"
export AIPROXY_RT_BLACKLIST_RULES="/var/lib/aiproxy/data/blacklist2.json"
export AIPROXY_RT_WHITELIST_RULES="/var/lib/aiproxy/data/whitelist2.json"
export AIPROXY_LOG_LEVEL="info"
export AIPROXY_LOG_FILE="/var/log/aiproxy/aiproxy.log"
export AIPROXY_LOG_MAX_SIZE="100"
export AIPROXY_LOG_MAX_BACKUPS="3"
export AIPROXY_GLOBAL_RATE_LIMIT="60"
export AIPROXY_PENDING_TIMEOUT="120s"

# Run with environment variables
./aiproxy

Container Usage (Environment Variables)

# Docker/Podman with environment variables and volume mounts
# Required mounts: rules, data. Certs directory depends on TLS cert/key paths.
podman run -d \
  -p 8080:8080 \
  -p 8081:8081 \
  -v ./rules:/etc/aiproxy/rules:Z \
  -v ./data:/var/lib/aiproxy/data:Z \
  -v ./certs:/certs:Z \
  -e AIPROXY_ADMIN_SECRET="my-secure-secret-123" \
  -e AIPROXY_LOG_LEVEL="info" \
  -e AIPROXY_GLOBAL_RATE_LIMIT="60" \
  aiproxy:latest

# Container with combined certificate file
podman run -d \
  -p 8080:8080 \
  -p 8081:8081 \
  -v ./rules:/etc/aiproxy/rules:Z \
  -v ./data:/var/lib/aiproxy/data:Z \
  -v ./certs:/certs:Z \
  -e AIPROXY_TLS_CERT="/certs/combined.pem" \
  -e AIPROXY_TLS_KEY="/certs/combined.pem" \
  -e AIPROXY_ADMIN_SECRET="my-secure-secret-123" \
  aiproxy:latest

Example whitelist.json

[
  {
    "method": "GET",
    "pattern": "https://api.openai.com/v1/**",
    "rpm": 10,
    "comment": "OpenAI API - limited to 10 req/min"
  },
  {
    "method": "POST",
    "pattern": "https://api.anthropic.com/v1/messages",
    "rpm": 5,
    "comment": "Anthropic Claude API"
  },
  {
    "method": "*",
    "pattern": "https://api.github.com/**",
    "comment": "GitHub API - uses global rate limit"
  }
]
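
A file in this shape could be decoded with a struct like the one below. This is a sketch: the type name `WhitelistRule`, the helper `parseWhitelist`, and the validation rule (method and pattern required) are assumptions, and treating a missing `rpm` as "use the global limit" is inferred from the comment on the GitHub rule.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WhitelistRule mirrors the fields shown in the example file.
type WhitelistRule struct {
	Method  string `json:"method"`
	Pattern string `json:"pattern"`
	RPM     int    `json:"rpm,omitempty"` // 0 means no per-rule limit (assumption)
	Comment string `json:"comment,omitempty"`
}

// parseWhitelist decodes a JSON array of rules and applies minimal checks.
func parseWhitelist(data []byte) ([]WhitelistRule, error) {
	var rules []WhitelistRule
	if err := json.Unmarshal(data, &rules); err != nil {
		return nil, fmt.Errorf("parse whitelist: %w", err)
	}
	for i, r := range rules {
		if r.Method == "" || r.Pattern == "" {
			return nil, fmt.Errorf("rule %d: method and pattern are required", i)
		}
	}
	return rules, nil
}

func main() {
	data := []byte(`[
	  {"method": "GET", "pattern": "https://api.openai.com/v1/**", "rpm": 10},
	  {"method": "*", "pattern": "https://api.github.com/**"}
	]`)
	rules, err := parseWhitelist(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range rules {
		fmt.Printf("%s %s rpm=%d\n", r.Method, r.Pattern, r.RPM)
	}
}
```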

Example blacklist.json

[
  "https://malicious.example.com/**",
  "POST https://*/admin/**"
]

Example access.log format

2026-03-27T10:15:30.123Z 10.0.0.5 GET https://api.openai.com/v1/chat/completions 200 1250ms allowed whitelist[0]
2026-03-27T10:15:31.456Z 10.0.0.5 POST https://malicious.example.com/api 403 2ms blocked_blacklist blacklist[0]
2026-03-27T10:15:35.789Z 10.0.0.6 GET https://unknown-api.com/endpoint 403 120015ms blocked_timeout pending[timeout]
2026-03-27T10:16:42.012Z 10.0.0.5 GET https://api.openai.com/v1/models 200 350ms rate_limited:5.5s whitelist[0]

WebUI Endpoints

Authentication

  • GET /login - Login page
  • POST /login - Submit admin secret
  • GET /logout - Logout

Dashboard

  • GET / - Main dashboard (public)
  • GET /api/dashboard/stream - SSE live stats stream (public)
  • GET /api/stats - JSON stats summary

Pending Requests

  • GET /pending - Pending requests page (requires auth)
  • GET /api/pending/stream - SSE stream for real-time updates
  • POST /api/pending/:id/approve - Approve pending request (add to whitelist2)
  • POST /api/pending/:id/deny - Deny pending request (add to blacklist2)

Rate-Limited Requests

  • GET /ratelimit - Rate-limited requests viewer page (requires auth)
  • GET /api/ratelimit/stream - SSE stream for real-time updates

Rules Management

  • GET /rules - Rules management page (requires auth)
  • GET /api/rules/whitelist - Get all whitelist rules (merged whitelist + whitelist2)
  • GET /api/rules/blacklist - Get all blacklist rules (merged blacklist + blacklist2)
  • POST /api/rules/whitelist - Add rule to whitelist2
  • DELETE /api/rules/whitelist/:id - Delete rule from whitelist2 (cannot delete from whitelist)
  • POST /api/rules/blacklist - Add rule to blacklist2
  • DELETE /api/rules/blacklist/:id - Delete rule from blacklist2 (cannot delete from blacklist)

Logs & Certificates

  • GET /logs - Access log viewer page (requires auth)
  • GET /api/logs - Access log content (with offset/limit)
  • GET /download-cert - Download CA certificate (public, no auth)

Error Response Format

All proxy errors return JSON:

{
  "error": "forbidden",
  "reason": "not in whitelist",
  "request_id": "req_abc123def456"
}

Error types:

  • connect_blocked - CONNECT method not allowed (anti-tunneling protection)
  • localhost_blocked - Request targets localhost IP (SSRF protection)
  • forbidden - Blacklisted or pending timeout
  • timeout - Request timeout exceeded
  • internal_error - Proxy internal error
  • bad_gateway - Upstream connection/certificate error

HTTP Status Codes:

  • 200 - Success
  • 403 - Forbidden (localhost blocked, blacklisted, or still not whitelisted when the pending timeout expires)
  • 500 - Internal proxy error
  • 502 - Bad Gateway (upstream connection error, invalid/expired upstream certificate)
  • 504 - Gateway Timeout

Implementation Notes

Critical Design Constraints

  1. No Daemon Mode: Application runs in foreground (container-first design)
  2. Idiomatic Go: Clean, readable code over premature optimization
  3. Streaming-First: Use io.Copy, never buffer entire request/response bodies
  4. Simple Interval Rate Limiting: No complex token bucket, just sleep(remaining_interval)
  5. SSE for Real-Time: Use Server-Sent Events, not WebSockets or polling
  6. Glob Patterns Only: No regex support in v1 (glob covers most use cases)
  7. Per-Rule Stats: Track statistics at rule level, not per unique URL
  8. Text Access Logs: Fixed Apache-style format, not JSON (easier to grep)
  9. Wrapper Mode: Command execution uses os/exec.CommandContext for clean cancellation; all parent environment variables passed through plus proxy-specific vars; stdin/stdout/stderr forwarded directly; proxy shutdown via context cancellation when command exits

Concurrency Considerations

  • Each pending request holds a goroutine (acceptable for hundreds of requests)
  • Rate limiting uses simple mutexes per rule (not distributed, single instance only)
  • Stats updates are synchronized (mutex or atomic operations)
  • SSE connections: one goroutine per WebUI client (expected: 1-5 clients max)
  • Localhost IP resolution is per-request (no shared state, thread-safe via Go's net package)
  • Proxy listener synchronization: Channel-based (no mutex) - Start() assigns listener then closes listenerReady channel; Addr() blocks on channel receive, guaranteeing safe read via Go memory model happens-before semantics

Security Assumptions

  • Network Isolation: Proxy runs in isolated container, WebUI not exposed to public internet
  • Trusted Admin: Single admin user, no RBAC needed
  • AI Agent Clients: No client authentication (rely on network isolation)
  • CA Certificate Trust: Admin manually installs CA cert on AI agent machines

Performance Expectations (v1)

  • Concurrent Connections: Hundreds (not thousands)
  • Request Throughput: ~100-500 req/sec (sufficient for AI agent workloads)
  • Memory: ~50-200MB typical usage (no memory pooling)
  • Pending Queue: Unlimited until OOM (acceptable for small installations)

Known Limitations (v1)

  1. No graceful shutdown - connections may be dropped on container stop
  2. No distributed state - single instance only
  3. Rate limit state resets on restart
  4. No request/response body inspection (streaming means we don't buffer bodies)
  5. No fine-grained URL normalization (track per exact URL matched by rule)
  6. No client authentication (rely on network isolation)
  7. No concurrent pending request limit (could OOM with thousands of pendings)

Future Compatibility Notes

When implementing v2+ features from TODO.md:

  • Stats schema is extensible (add new fields without breaking existing)
  • Rule format supports adding new fields (e.g., priority, expires_at)
  • Access log format is append-only (new fields are added only at the end of a line, so existing parsing tools keep working)
  • WebUI API can be versioned later (e.g., by adding /api/v2/ endpoints alongside the current /api/ paths)