Implement richer Prometheus metrics with latency and error counters
Description
GET /api/v1/metrics in src/index.ts exposes four gauges (services, api keys, outstanding usage, paused), hand-built as text lines. There are no request counters, latency histograms, or error counters, even though the request-timer middleware already measures durationMs. This issue expands the metrics so operators can actually observe traffic and latency.
Requirements and context
- Repository scope: Agentpay-Org/Agentpay-backend only.
- Add agentpay_http_requests_total{method,route,status} counters and an agentpay_http_request_duration_seconds histogram fed by the existing timer middleware.
- Add an agentpay_http_errors_total counter incremented in the final error handler.
- Keep the existing gauges and the text/plain; version=0.0.4 exposition format; consider using prom-client for correctness of histogram buckets.
- Ensure route labels use the matched route pattern (e.g. /api/v1/usage/:agent/:serviceId), not the raw path, to bound cardinality.
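The counters and histogram listed above could be sketched roughly as follows. This is a dependency-free illustration so that it runs standalone; prom-client would normally handle this bookkeeping. The names HttpMetrics, observeRequest, recordError, and BUCKETS_SECONDS are hypothetical, the bucket boundaries are an assumed starting point, and the histogram is left unlabeled to keep the sketch short.

```typescript
// Hypothetical sketch: HttpMetrics and BUCKETS_SECONDS are illustrative names,
// not part of the AgentPay codebase. prom-client offers equivalents.

// Assumed bucket upper bounds in seconds; tune these to real traffic.
const BUCKETS_SECONDS: number[] = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10];

// Escape backslash, double quote, and newline as the text format requires.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

class HttpMetrics {
  // Request counts keyed by the rendered label set.
  private requests: Record<string, number> = {};
  // Cumulative bucket counts: a sample increments every bucket whose bound it fits under.
  private buckets = BUCKETS_SECONDS.map((le) => ({ le: le, count: 0 }));
  private durationSum = 0;
  private durationCount = 0; // also serves as the +Inf bucket
  private errors = 0;

  /** Record one finished request; durationMs is the value the timer middleware measured. */
  observeRequest(method: string, route: string, status: number, durationMs: number): void {
    const key =
      `method="${escapeLabelValue(method)}",` +
      `route="${escapeLabelValue(route)}",` +
      `status="${status}"`;
    this.requests[key] = (this.requests[key] ?? 0) + 1;

    const seconds = durationMs / 1000; // the metric name promises seconds
    this.buckets.forEach((b) => {
      if (seconds <= b.le) b.count += 1;
    });
    this.durationSum += seconds;
    this.durationCount += 1;
  }

  /** Call from the final error handler. */
  recordError(): void {
    this.errors += 1;
  }

  /** Render the new metrics in the text exposition format (version 0.0.4). */
  render(): string {
    const name = "agentpay_http_request_duration_seconds";
    const lines: string[] = [
      "# HELP agentpay_http_requests_total Total HTTP requests handled.",
      "# TYPE agentpay_http_requests_total counter",
    ];
    Object.keys(this.requests).sort().forEach((key) => {
      lines.push(`agentpay_http_requests_total{${key}} ${this.requests[key] ?? 0}`);
    });
    lines.push(`# HELP ${name} HTTP request latency in seconds.`);
    lines.push(`# TYPE ${name} histogram`);
    this.buckets.forEach((b) => {
      lines.push(`${name}_bucket{le="${b.le}"} ${b.count}`);
    });
    lines.push(`${name}_bucket{le="+Inf"} ${this.durationCount}`);
    lines.push(`${name}_sum ${this.durationSum}`);
    lines.push(`${name}_count ${this.durationCount}`);
    lines.push("# HELP agentpay_http_errors_total Errors reaching the final error handler.");
    lines.push("# TYPE agentpay_http_errors_total counter");
    lines.push(`agentpay_http_errors_total ${this.errors}`);
    return lines.join("\n") + "\n";
  }
}
```

The rendered text would be appended to the existing gauge lines in the /api/v1/metrics response. Storing cumulative bucket counts at observation time keeps render() trivial, at the cost of one pass over the buckets per request.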
Suggested execution
- Fork the repo and create a branch:
  git checkout -b feature/observability-08-prometheus-histograms
- Implement changes:
  - Write code in: the metrics endpoint and timer/error middleware in src/index.ts, optionally a src/metrics.ts.
  - Write comprehensive tests in: new src/metrics.test.ts (counter increments, histogram presence, format validity).
  - Add documentation: document the metric names in docs/metrics.md.
  - Add TSDoc on any metrics helpers.
  - Validate security assumptions: no high-cardinality labels (no raw agent ids in labels).
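One way to keep raw agent ids out of the route label is to derive it from the matched route rather than from the URL. The sketch below assumes an Express-style request, where req.route.path holds the matched pattern and req.baseUrl holds the router's mount prefix; routeLabel, RoutedRequest, and the "unmatched" fallback value are hypothetical names chosen for illustration. It also assumes routers are mounted on static prefixes, since baseUrl reflects the actual URL segment and would leak parameter values otherwise.

```typescript
// Minimal structural type covering only the request fields used here,
// so the sketch compiles without the express typings.
interface RoutedRequest {
  baseUrl?: string;
  route?: { path?: unknown };
}

/**
 * Hypothetical helper: returns a bounded-cardinality route label.
 * Requests that matched no route (e.g. 404s), or whose pattern is not a
 * plain string, collapse into the single label "unmatched".
 */
function routeLabel(req: RoutedRequest): string {
  const pattern = req.route ? req.route.path : undefined;
  if (typeof pattern !== "string") {
    return "unmatched";
  }
  return (req.baseUrl ?? "") + pattern;
}
```

With this helper, a request to a concrete usage URL would be labeled /api/v1/usage/:agent/:serviceId regardless of which agent or service it names, so the number of distinct label values is bounded by the number of registered routes plus one.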
Test and commit
- Run npm run build, npm test, and npm run lint.
- Cover edge cases: error path increments error counter, route normalization, exposition format parses.
- Include the full npm test output in the PR description.
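For the "exposition format parses" edge case, a test could run the response body through a rough line-level check like the one below. This is a sketch covering a subset of the text format grammar (metric name, optional labels, numeric value, optional timestamp); it does not verify TYPE consistency or histogram completeness, and invalidExpositionLines is a hypothetical helper name. A stricter test might instead feed the body to a real Prometheus parser.

```typescript
// One label pair: name="value", where the value may contain escaped characters.
const LABEL = '[a-zA-Z_][a-zA-Z0-9_]*="(?:[^"\\\\]|\\\\.)*"';
// A float, with optional exponent, or one of the special values Inf and NaN.
const VALUE = "(?:[+-]?(?:\\d+(?:\\.\\d*)?|\\.\\d+)(?:[eE][+-]?\\d+)?|[+-]?Inf|NaN)";
// metric_name{labels} value [timestamp]
const SAMPLE = new RegExp(
  "^[a-zA-Z_:][a-zA-Z0-9_:]*" +
    "(?:\\{(?:" + LABEL + "(?:," + LABEL + ")*)?,?\\})?" +
    " +" + VALUE + "(?: +-?\\d+)?$"
);

/**
 * Hypothetical test helper: returns every line that is neither blank,
 * a comment (# HELP / # TYPE), nor a well-formed sample line.
 */
function invalidExpositionLines(body: string): string[] {
  return body.split("\n").filter((line) => {
    if (line === "" || line.charAt(0) === "#") return false;
    return !SAMPLE.test(line);
  });
}
```

A test in src/metrics.test.ts could then assert that invalidExpositionLines(responseBody) is empty, which gives a readable failure message listing the offending lines.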
Example commit message
feat: expand prometheus metrics with latency histogram and error counters
Guidelines
- Minimum 95 percent test coverage for impacted modules.
- Clear, reviewer-focused documentation.
- Timeframe: 96 hours.
Community & contribution rewards
- 💬 Join the AgentPay community on Discord for questions, reviews, and faster merges: https://discord.gg/eXvRKkgcv
- ⭐ This is a GrantFox OSS / Official Campaign task and may be rewarded. When your PR is merged you'll be prompted to rate the project — if this issue and the maintainers helped you ship, we'd be grateful for a 5-star rating. Clear questions in Discord and tidy, well-tested PRs are the fastest path to a merge and a reward.