Add API benchmark suite #697

Open
alexgduarte wants to merge 1 commit into SecureBananaLabs:main from alexgduarte:codex/api-benchmarks-30

@alexgduarte

/claim #30

Summary

  • Added a dependency-light benchmark suite under benchmarks/ using Node's built-in fetch.
  • Covered /health plus all 20 mounted /api/* routes from apps/api/src/app.js.
  • Added root commands: npm run benchmark and npm run benchmark:smoke.
  • Added JSON and Markdown outputs under benchmarks/results/.
  • Added reviewable thresholds in benchmarks/thresholds.json.
  • Added CI smoke workflow at .github/workflows/benchmark-smoke.yml.
  • Fixed the existing API test script so npm test runs src/tests/*.test.js instead of trying to execute the test directory.
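For reviewers who want a feel for what timing a request with Node's built-in fetch can look like, here is a minimal sketch. It is not copied from benchmarks/run-benchmarks.mjs: the `timeRequest` name, its return shape, and the `fetchImpl` parameter are assumptions made for illustration. Because `fetch` resolves once response headers are available, the first timestamp only approximates time to first byte.

```javascript
// Hypothetical helper, not taken from benchmarks/run-benchmarks.mjs.
// fetchImpl defaults to Node's global fetch and can be replaced with a stub.
async function timeRequest(url, init = {}, fetchImpl = fetch) {
  const start = performance.now();

  // fetch() resolves when headers arrive, so this approximates TTFB.
  const response = await fetchImpl(url, init);
  const ttfbMs = performance.now() - start;

  // Drain the body so the total covers the complete response.
  await response.arrayBuffer();
  const totalMs = performance.now() - start;

  return { status: response.status, ok: response.ok, ttfbMs, totalMs };
}
```

Calling `await timeRequest("http://127.0.0.1:<port>/health")` against a locally started server would return one latency sample; the port is whatever the ephemeral server was bound to.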

Validation

  • node --check benchmarks/run-benchmarks.mjs
  • npm test
  • npm run benchmark:smoke
  • npm run benchmark
  • git diff --cached --check

Benchmark Environment

Hardware

  • CPU model & core count: AMD Ryzen 7 9800X3D 8-Core Processor, 16 logical cores reported by Node.js
  • RAM (total & available during benchmark): 61.59 GB total, 32.53 GB free at full benchmark start
  • Storage type (SSD / NVMe / HDD): local workstation storage, exact type not exposed in this sandbox
  • Network interface (Ethernet / WiFi / loopback): loopback for local benchmark target
  • Machine type (local workstation / cloud VM / CI runner - include instance type if cloud): local workstation
  • OS & version: Windows win32 10.0.26200 x64

Runtime

  • Node.js version (or relevant runtime): v22.18.0, npm 11.7.0
  • Any resource limits applied (Docker memory cap, cgroup limits, etc.): none intentionally applied
  • Other significant processes running during benchmark (yes / no - if yes, describe): yes, normal Codex desktop and shell background activity

If submitted by or with an AI agent

  • Agent or tool name (e.g. Claude Code, Devin, Copilot Workspace, AutoGPT): OpenAI Codex desktop
  • Underlying model and version (e.g. claude-sonnet-4-5, gpt-4o - if known): GPT-5-based Codex model
  • Inference provider (e.g. Anthropic, OpenAI, Azure, self-hosted): OpenAI
  • Orchestration framework if any (e.g. LangChain, AutoGen, custom): none beyond Codex desktop shell/GitHub tooling
  • Execution mode (fully autonomous / human-supervised / human-initiated per step): human-initiated, agent-executed
  • Did the agent have shell/tool access during execution (yes / no): yes
  • Did the agent have internet access during execution (yes / no): yes, through approved GitHub/network commands
  • Were benchmark commands run by the agent directly or handed off to the human to run: run directly by the agent
  • Any known agent constraints or sandboxing that may have affected execution: local loopback benchmark only; external network and git operations required approval/escalation

Markdown Benchmark Summary

API Benchmark Report (full)

Generated: 2026-05-25T10:21:20.778Z
Target: local ephemeral Express server
Routes covered: 20 /api routes plus /health
Requests per endpoint: 5
Concurrency: 2
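The request count and concurrency settings above can be pictured as a small worker pool. The sketch below is an assumption about how such a runner might be structured, not the suite's actual code: `runWithConcurrency` and the `task` callback are hypothetical names, and computing sustained RPS as total requests divided by wall-clock seconds is one plausible definition that the PR does not confirm.

```javascript
// Hypothetical runner: `task` is any async function that performs one request.
async function runWithConcurrency(totalRequests, concurrency, task) {
  const durationsMs = [];
  let errors = 0;
  let next = 0;

  async function worker() {
    while (next < totalRequests) {
      next += 1; // claim a slot before awaiting so the total is never exceeded
      const start = performance.now();
      try {
        await task();
      } catch {
        errors += 1; // a rejected task counts as a failed request
      }
      durationsMs.push(performance.now() - start);
    }
  }

  const wallStart = performance.now();
  const workerCount = Math.min(concurrency, totalRequests);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  const wallSeconds = (performance.now() - wallStart) / 1000;

  return {
    requests: durationsMs.length,
    durationsMs,
    errorPercent: (errors / totalRequests) * 100,
    sustainedRps: totalRequests / wallSeconds, // assumed definition
  };
}
```

With `runWithConcurrency(5, 2, task)`, at most two requests are in flight at any moment, which mirrors the settings reported for this run.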

Environment

  • OS: win32 10.0.26200 x64
  • CPU: AMD Ryzen 7 9800X3D 8-Core Processor (16 logical cores)
  • Memory: 61.59 GB total, 32.53 GB free at start
  • Node: v22.18.0
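An environment block like the one above can be produced from Node's `os` module. The following is a sketch under stated assumptions: `describeEnvironment` is a hypothetical name, and dividing by 1024^3 to get GB is a guess, since the PR does not say how the memory figures were converted. It uses CommonJS `require` so it runs as a plain script; in an .mjs file the equivalent would be `import os from "node:os"`.

```javascript
const os = require("node:os");

// Hypothetical helper that formats host details in the style of the report.
function describeEnvironment() {
  const cpus = os.cpus(); // one entry per logical core
  const toGb = (bytes) => (bytes / 1024 ** 3).toFixed(2); // assumed conversion

  return [
    `OS: ${os.platform()} ${os.release()} ${os.arch()}`,
    `CPU: ${cpus[0]?.model ?? "unknown"} (${cpus.length} logical cores)`,
    `Memory: ${toGb(os.totalmem())} GB total, ${toGb(os.freemem())} GB free at start`,
    `Node: ${process.version}`,
  ];
}

console.log(describeEnvironment().join("\n"));
```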

Results

| Endpoint | Requests | p50 ms | p95 ms | p99 ms | p95 TTFB ms | Sustained RPS | Peak RPS | Error % |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| GET /health | 5 | 3.56 | 5.3 | 5.3 | 5.12 | 495.31 | 20 | 0 |
| POST /api/auth/register | 5 | 4.99 | 5.26 | 5.26 | 5.02 | 429.55 | 20 | 0 |
| POST /api/auth/login | 5 | 3.1 | 3.61 | 3.61 | 3.53 | 634.68 | 20 | 0 |
| GET /api/auth/oauth/github/callback | 5 | 1.55 | 2.12 | 2.12 | 2.05 | 1150.8 | 20 | 0 |
| POST /api/auth/refresh | 5 | 1.84 | 2.02 | 2.02 | 1.96 | 1014.45 | 20 | 0 |
| GET /api/users | 5 | 1.04 | 1.31 | 1.31 | 1.26 | 1690.79 | 20 | 0 |
| POST /api/users | 5 | 1.6 | 1.61 | 1.61 | 1.56 | 1277.11 | 20 | 0 |
| GET /api/jobs | 5 | 1.1 | 1.22 | 1.22 | 1.16 | 1673.81 | 20 | 0 |
| POST /api/jobs | 5 | 1.58 | 2.1 | 2.1 | 2.04 | 1157.73 | 20 | 0 |
| GET /api/proposals | 5 | 1.21 | 1.35 | 1.35 | 1.29 | 1477.93 | 20 | 0 |
| POST /api/proposals | 5 | 1.39 | 1.64 | 1.64 | 1.59 | 1155.19 | 20 | 0 |
| POST /api/payments | 5 | 1.53 | 1.63 | 1.63 | 1.57 | 1260.05 | 20 | 0 |
| GET /api/reviews | 5 | 1.11 | 1.22 | 1.22 | 1.15 | 1714.62 | 20 | 0 |
| POST /api/reviews | 5 | 1.38 | 1.61 | 1.61 | 1.55 | 1378.28 | 20 | 0 |
| GET /api/messages | 5 | 1.04 | 1.15 | 1.15 | 1.1 | 1821.23 | 20 | 0 |
| POST /api/messages | 5 | 1.36 | 1.65 | 1.65 | 1.59 | 1364.96 | 20 | 0 |
| GET /api/notifications | 5 | 1.03 | 1.14 | 1.14 | 1.07 | 1741.61 | 20 | 0 |
| POST /api/notifications | 5 | 1.36 | 1.61 | 1.61 | 1.56 | 1380.8 | 20 | 0 |
| POST /api/uploads | 5 | 2.21 | 6.08 | 6.08 | 6.03 | 499.95 | 20 | 0 |
| GET /api/search | 5 | 1.2 | 1.83 | 1.83 | 1.78 | 1346.98 | 20 | 0 |
| GET /api/admin/metrics | 5 | 1.85 | 2.58 | 2.58 | 2.53 | 941.6 | 20 | 0 |
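A note on reading the percentile columns: with only five samples per endpoint, a nearest-rank percentile selects the largest sample for both p95 and p99, which is consistent with those two columns being identical in every row. The sketch below illustrates that calculation. Whether the suite actually uses nearest-rank is an assumption, `percentile` is a hypothetical helper, and the sample latencies are invented for the example.

```javascript
// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(samples, p) {
  if (samples.length === 0) throw new RangeError("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Illustrative latencies in ms, not data from this run.
const latencies = [1.2, 0.9, 1.0, 1.1, 3.4];

console.log(percentile(latencies, 50)); // third of the five sorted values
console.log(percentile(latencies, 95) === percentile(latencies, 99)); // both pick the maximum
```

Under this method, p99 only separates from p95 once an endpoint has more than 20 samples, so the full run's tail figures are best read as per-endpoint maxima.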

Thresholds

All configured benchmark thresholds passed.
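For context, a threshold check of this kind can be a simple comparison of each result row against configured limits. The sketch below is hypothetical: the real schema of benchmarks/thresholds.json is not shown in this PR, so the `default` key, the `p95Ms` and `errorPercent` field names, and the limit values are all assumptions. The two sample rows reuse p95 and error figures from the table above.

```javascript
// Hypothetical threshold shape; the real benchmarks/thresholds.json may differ.
const thresholds = {
  default: { p95Ms: 50, errorPercent: 1 }, // assumed limits, for illustration only
};

// Returns one human-readable message per violated limit.
function checkThresholds(results, limits) {
  const failures = [];
  for (const row of results) {
    // Per-endpoint overrides, if present, take precedence over the defaults.
    const limit = { ...limits.default, ...(limits[row.endpoint] ?? {}) };
    if (row.p95Ms > limit.p95Ms) {
      failures.push(`${row.endpoint}: p95 ${row.p95Ms} ms exceeds ${limit.p95Ms} ms`);
    }
    if (row.errorPercent > limit.errorPercent) {
      failures.push(`${row.endpoint}: error rate ${row.errorPercent}% exceeds ${limit.errorPercent}%`);
    }
  }
  return failures;
}

const rows = [
  { endpoint: "GET /health", p95Ms: 5.3, errorPercent: 0 },
  { endpoint: "POST /api/uploads", p95Ms: 6.08, errorPercent: 0 },
];

const failures = checkThresholds(rows, thresholds);
console.log(failures.length === 0 ? "All thresholds passed." : failures.join("\n"));
```

A CI step could exit non-zero whenever `failures` is non-empty, which is one way a smoke workflow might gate merges.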

Demo Evidence

The benchmark JSON and Markdown reports are committed in benchmarks/results/. I could not attach a short video because this local environment does not have ffmpeg or ImageMagick available.
