Commit 517c08e

feat(statapi): add /wait0 stats endpoint and refresh metrics
- add authenticated GET /wait0 and /wait0/ control endpoint with stats:read scope
- add 5s snapshot-cached stats payload for cache, memory, sitemap, and refresh durations
- instrument revalidation durations and expose min/avg/max as refresh_duration_ms
- extend cache metadata snapshots for efficient aggregate calculations
- update debug config, docs, architecture notes, and tests for new endpoint behavior
1 parent b379223 commit 517c08e

24 files changed, +905 -35 lines changed
AGENTS.md

Lines changed: 5 additions & 0 deletions
@@ -63,6 +63,11 @@ wait0 is an ultra-fast cache-first HTTP reverse proxy written in Go that serves
 | API Endpoints | docs/api-endpoints.md | Endpoint and response reference |
 | Docker Hub notes | DOCKERHUB.md | Alias to README for Docker Hub presentation |

+## AI Context Files
+| File | Description |
+|------|-------------|
+| .ai-factory/ARCHITECTURE.md | Architecture decisions and guidelines |
+
 ## Build & Development Commands
 This project uses a `Makefile` for build automation.

README.md

Lines changed: 11 additions & 0 deletions
@@ -84,6 +84,17 @@ curl -i \
 Returns `202 Accepted` and processes invalidation asynchronously.
 Authorization is scope-based (`invalidation:write`) to support least-privilege tokens and future API growth without changing token model.

+## Stats API Example
+
+```bash
+curl -i \
+  "http://localhost:8082/wait0" \
+  -H "Authorization: Bearer ${WAIT0_STATS_TOKEN}"
+```
+
+Returns `200 OK` with cache/memory/refresh/sitemap metrics snapshot.
+Authorization is scope-based (`stats:read`).
+
 ## Documentation

 | Guide | Description |
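
For callers that would rather poll the endpoint from Go than from curl, the README request could be sketched roughly as below. This is an illustration and not code from this commit: `statsPayload`, `minAvgMax`, `parseStats`, and `fetchStats` are hypothetical names, and the field layout is inferred from the JSON example in docs/api-endpoints.md.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// minAvgMax mirrors the {min, avg, max} objects in the documented payload.
type minAvgMax struct {
	Min int64 `json:"min"`
	Avg int64 `json:"avg"`
	Max int64 `json:"max"`
}

// statsPayload is a hypothetical client-side type; the JSON tags follow the
// documented response example, the Go names are made up for this sketch.
type statsPayload struct {
	GeneratedAt        time.Time `json:"generated_at"`
	SnapshotTTLSeconds int       `json:"snapshot_ttl_seconds"`
	Cache              struct {
		URLsTotal               int64     `json:"urls_total"`
		ResponsesSizeBytesTotal int64     `json:"responses_size_bytes_total"`
		ResponseSizeBytes       minAvgMax `json:"response_size_bytes"`
	} `json:"cache"`
	Memory struct {
		RSSBytes     int64 `json:"rss_bytes"`
		GoAllocBytes int64 `json:"go_alloc_bytes"`
	} `json:"memory"`
	RefreshDurationMs minAvgMax `json:"refresh_duration_ms"`
	Sitemap           struct {
		DiscoveredURLs  int64   `json:"discovered_urls"`
		CrawledURLs     int64   `json:"crawled_urls"`
		CrawlPercentage float64 `json:"crawl_percentage"`
	} `json:"sitemap"`
}

// parseStats decodes a /wait0 response body.
func parseStats(body []byte) (*statsPayload, error) {
	var p statsPayload
	if err := json.Unmarshal(body, &p); err != nil {
		return nil, err
	}
	return &p, nil
}

// fetchStats sends GET <baseURL>/wait0 with a bearer token and decodes the reply.
func fetchStats(ctx context.Context, client *http.Client, baseURL, token string) (*statsPayload, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, baseURL+"/wait0", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseStats(body)
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	stats, err := fetchStats(ctx, http.DefaultClient, "http://localhost:8082", os.Getenv("WAIT0_STATS_TOKEN"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "stats request failed:", err)
		return
	}
	fmt.Printf("cached URLs: %d, sitemap crawl: %.1f%%\n", stats.Cache.URLsTotal, stats.Sitemap.CrawlPercentage)
}
```

Since the server caches its snapshot for 5 seconds, a polling loop built on this sketch gains little from calling more often than that.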

debug/wait0.yaml

Lines changed: 2 additions & 1 deletion
@@ -19,9 +19,10 @@ server:
 auth:
 tokens:
 - id: debug-backoffice
-token: debug-invalidation-token
+token: debug-token
 scopes:
 - invalidation:write
+- stats:read

 urlsDiscover:
 initalDelay: '20s'

docs/api-endpoints.md

Lines changed: 96 additions & 1 deletion
@@ -10,6 +10,7 @@ Complete HTTP endpoint reference for `wait0`.

 - A reverse-proxy data path for regular client requests.
 - A control endpoint for asynchronous cache invalidation.
+- A control endpoint for read-only runtime/cache statistics.

 Base URL examples:

@@ -62,7 +63,101 @@ curl -i "http://localhost:8082/"
 Typical first request: `X-Wait0: miss`.
 Subsequent request: `X-Wait0: hit`.

-## 2) Invalidation API
+## 2) Stats API
+
+## Route
+
+- `GET /wait0`
+- `GET /wait0/`
+
+## Auth
+
+- `Authorization: Bearer <token>` required.
+- Token must map to scope: `stats:read`.
+
+## Behavior
+
+- Returns a cached metrics snapshot (recomputed at most once every 5 seconds).
+- Designed for dashboard/backoffice polling with bounded server overhead.
+- `refresh_duration_ms` is calculated from observed revalidation execution durations (min/avg/max), not cache entry age.
+- Duration aggregates are process-lifetime metrics (since current process start).
+
+## Successful response
+
+Status: `200 OK`
+
+```json
+{
+  "generated_at": "2026-03-05T10:00:00Z",
+  "snapshot_ttl_seconds": 5,
+  "cache": {
+    "urls_total": 123,
+    "responses_size_bytes_total": 456789,
+    "response_size_bytes": {
+      "min": 128,
+      "avg": 1024,
+      "max": 4096
+    }
+  },
+  "memory": {
+    "rss_bytes": 12345678,
+    "go_alloc_bytes": 2345678
+  },
+  "refresh_duration_ms": {
+    "min": 19,
+    "avg": 66,
+    "max": 119
+  },
+  "sitemap": {
+    "discovered_urls": 80,
+    "crawled_urls": 60,
+    "crawl_percentage": 75
+  }
+}
+```
+
+## Response field reference
+
+The table below explains each field in the stats payload, including what it means and how it is computed.
+
+| Field | Type | Meaning | Calculation | Update behavior / notes |
+|------|------|---------|-------------|-------------------------|
+| `generated_at` | RFC3339Nano string | UTC timestamp when this snapshot was generated. | `time.Now().UTC()` at snapshot build time. | New value only when snapshot is recomputed. |
+| `snapshot_ttl_seconds` | integer | Snapshot cache TTL used by `/wait0`. | Fixed constant `5`. | Endpoint may return identical payload for calls within this TTL. |
+| `cache.urls_total` | integer | Total number of unique cached keys currently known to wait0. Includes active + inactive entries. | Unique union of RAM keys and disk keys. | Recomputed per snapshot. |
+| `cache.responses_size_bytes_total` | integer (bytes) | Total logical size of cached responses for all unique keys. | Sum over unique keys of per-entry logical size (`headers + body` bytes). | Recomputed per snapshot. |
+| `cache.response_size_bytes.min` | integer (bytes) | Smallest logical response size among unique cached keys. | Min of per-key logical response size. | Recomputed per snapshot; `0` when no keys. |
+| `cache.response_size_bytes.avg` | integer (bytes) | Average logical response size among unique cached keys. | `responses_size_bytes_total / urls_total` (integer division). | Recomputed per snapshot; `0` when no keys. |
+| `cache.response_size_bytes.max` | integer (bytes) | Largest logical response size among unique cached keys. | Max of per-key logical response size. | Recomputed per snapshot; `0` when no keys. |
+| `memory.rss_bytes` | integer (bytes) | Current process resident memory (RSS) as seen by OS probes. | `ProcessRSSBytes()`; `0` when unavailable on platform/runtime. | Recomputed per snapshot. |
+| `memory.go_alloc_bytes` | integer (bytes) | Current heap bytes allocated by Go runtime. | `runtime.ReadMemStats(&ms); ms.Alloc`. | Recomputed per snapshot. |
+| `refresh_duration_ms.min` | integer (ms) | Fastest observed revalidation execution time. | Min of observed `revalidation.Once(...)` durations, converted to milliseconds. | Process-lifetime aggregate since current process start. |
+| `refresh_duration_ms.avg` | integer (ms) | Average observed revalidation execution time. | Sum of all observed revalidation durations / count, converted to ms (integer division). | Process-lifetime aggregate since current process start. |
+| `refresh_duration_ms.max` | integer (ms) | Slowest observed revalidation execution time. | Max of observed `revalidation.Once(...)` durations, converted to milliseconds. | Process-lifetime aggregate since current process start. |
+| `sitemap.discovered_urls` | integer | Number of unique cached keys whose discovery source is sitemap. | Count of unique keys where `discovered_by == "sitemap"` (case-insensitive). | Recomputed per snapshot. |
+| `sitemap.crawled_urls` | integer | Number of sitemap-discovered keys that are currently active (not inactive seed entries). | Count of sitemap keys where `inactive == false`. | Recomputed per snapshot. |
+| `sitemap.crawl_percentage` | float | Share of sitemap-discovered keys currently crawled/active. | `crawled_urls * 100 / discovered_urls`; `0` if `discovered_urls == 0`. | Recomputed per snapshot. |
+
+### Additional interpretation notes
+
+- Snapshot caching: `/wait0` returns cached stats for up to `snapshot_ttl_seconds`; polling faster than TTL will often return unchanged values.
+- Lifetime vs point-in-time:
+  - `refresh_duration_ms.*` is lifetime cumulative for this process (does not reset per warmup batch).
+  - `cache.*`, `memory.*`, `sitemap.*` are point-in-time values at snapshot generation.
+- Duplicate keys across RAM and disk are deduplicated as one logical cached URL in all `cache.*` and `sitemap.*` counts.
+- Size units:
+  - `*_bytes` fields are raw bytes.
+  - `refresh_duration_ms` is milliseconds.
+
+## Error responses
+
+| HTTP | Body `error` | Cause |
+|------|--------------|-------|
+| `401` | `unauthorized` | Missing/invalid bearer token |
+| `403` | `forbidden` | Token exists but lacks `stats:read` scope |
+| `405` | `method not allowed` | Non-GET request |
+
+## 3) Invalidation API

 ## Route
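
The "recomputed at most once every 5 seconds" behavior documented above can be illustrated with a small TTL-guarded cache. The snapshot code itself is not among the files shown here, so this is only a sketch under assumptions: `snapshotCache`, `manualClock`, and `snapshotTTL` are hypothetical names, and the injectable clock exists purely so that expiry can be demonstrated without sleeping.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// snapshotTTL matches the documented snapshot_ttl_seconds value of 5.
const snapshotTTL = 5 * time.Second

// manualClock is a hand-advanced clock used only for the demonstration.
type manualClock struct{ t time.Time }

func (m *manualClock) Now() time.Time          { return m.t }
func (m *manualClock) Advance(d time.Duration) { m.t = m.t.Add(d) }

// snapshotCache keeps the last built value and rebuilds it only after ttl has elapsed.
type snapshotCache[T any] struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time
	build   func() T
	value   T
	builtAt time.Time
	valid   bool
}

func newSnapshotCache[T any](ttl time.Duration, now func() time.Time, build func() T) *snapshotCache[T] {
	return &snapshotCache[T]{ttl: ttl, now: now, build: build}
}

// Get returns the cached snapshot, rebuilding it when none exists or it is at least ttl old.
// The mutex also ensures that concurrent callers trigger only one rebuild.
func (c *snapshotCache[T]) Get() T {
	c.mu.Lock()
	defer c.mu.Unlock()
	t := c.now()
	if !c.valid || t.Sub(c.builtAt) >= c.ttl {
		c.value = c.build()
		c.builtAt = t
		c.valid = true
	}
	return c.value
}

func main() {
	clk := &manualClock{}
	builds := 0
	cache := newSnapshotCache(snapshotTTL, clk.Now, func() int {
		builds++ // stands in for the expensive stats computation
		return builds
	})
	cache.Get()
	cache.Get()              // within the TTL: served from the cached snapshot
	clk.Advance(snapshotTTL) // TTL elapsed: the next Get rebuilds
	cache.Get()
	fmt.Println("builds:", builds) // prints "builds: 2"
}
```

Holding the lock during the rebuild keeps the sketch short; whether the real endpoint serializes rebuilds this way is not visible in this diff.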

docs/for-developers.md

Lines changed: 2 additions & 0 deletions
@@ -104,6 +104,8 @@ Validation rules: all numeric values above must be `> 0`.

 For invalidation API, at least one token must have scope `invalidation:write` when invalidation is enabled.

+For stats API (`GET /wait0`), tokens need scope `stats:read`.
+
 ## `rules[]`

 | Field | Required | Notes |
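
The scope rule above, combined with the error table in docs/api-endpoints.md (`401`, `403`, `405`), suggests a decision function along the following lines. The handler is not among the files shown here, so treat this as a sketch: `apiToken` and `authorize` are hypothetical names, the order of checks (method, then token, then scope) is an assumption, and the constant-time comparison is a general precaution rather than something the source confirms.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"strings"
)

// apiToken mirrors one auth.tokens entry from the config (id, token, scopes).
type apiToken struct {
	ID     string
	Token  string
	Scopes []string
}

// authorize returns the HTTP status a scope-protected GET endpoint would answer with.
// Assumed order of checks: method first, then token validity, then scope.
func authorize(method, authHeader string, tokens []apiToken, scope string) int {
	if method != http.MethodGet {
		return http.StatusMethodNotAllowed
	}
	const prefix = "Bearer "
	if !strings.HasPrefix(authHeader, prefix) {
		return http.StatusUnauthorized
	}
	presented := []byte(strings.TrimPrefix(authHeader, prefix))
	for _, t := range tokens {
		// Constant-time comparison avoids leaking token contents through timing.
		if subtle.ConstantTimeCompare(presented, []byte(t.Token)) != 1 {
			continue
		}
		for _, s := range t.Scopes {
			if s == scope {
				return http.StatusOK
			}
		}
		return http.StatusForbidden // token is known but lacks the scope
	}
	return http.StatusUnauthorized // no configured token matched
}

func main() {
	tokens := []apiToken{
		{ID: "debug-backoffice", Token: "debug-token", Scopes: []string{"invalidation:write", "stats:read"}},
	}
	fmt.Println(authorize(http.MethodGet, "Bearer debug-token", tokens, "stats:read"))
	fmt.Println(authorize(http.MethodGet, "Bearer wrong-token", tokens, "stats:read"))
}
```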

internal/wait0/cache/disk.go

Lines changed: 39 additions & 2 deletions
@@ -13,8 +13,12 @@ import (
 )

 type diskMeta struct {
-	Size       int64
-	LastAccess int64
+	Size         int64
+	LastAccess   int64
+	StatsSize    int64
+	Inactive     bool
+	DiscoveredBy string
+	LastRefresh  int64
 }

 type diskOp struct {
@@ -75,6 +79,30 @@ func (d *Disk) SnapshotAccessTimes() map[string]int64 {
 	return out
 }

+func (d *Disk) MetaSnapshot() map[string]EntryMeta {
+	d.mu.Lock()
+	defer d.mu.Unlock()
+	out := make(map[string]EntryMeta, len(d.index))
+	for k, m := range d.index {
+		lastRefresh := m.LastRefresh
+		if lastRefresh <= 0 {
+			// Backward compatibility for metadata written before LastRefresh existed.
+			lastRefresh = m.LastAccess * int64(time.Second)
+		}
+		size := m.StatsSize
+		if size <= 0 {
+			size = m.Size
+		}
+		out[k] = EntryMeta{
+			Size:                size,
+			Inactive:            m.Inactive,
+			DiscoveredBy:        m.DiscoveredBy,
+			LastRefreshUnixNano: lastRefresh,
+		}
+	}
+	return out
+}
+
 func (d *Disk) TotalSize() int64 {
 	d.mu.Lock()
 	defer d.mu.Unlock()
@@ -207,6 +235,11 @@ func (d *Disk) applyPutOrTouch(key string, ent *Entry) {
 		return
 	}
 	size := int64(len(b))
+	statsSize := EntryLogicalSize(*ent)
+	lastRefresh := ent.RevalidatedAt
+	if lastRefresh <= 0 && ent.StoredAt > 0 {
+		lastRefresh = ent.StoredAt * int64(time.Second)
+	}

 	d.mu.Lock()
 	old := d.index[key]
@@ -215,6 +248,10 @@ func (d *Disk) applyPutOrTouch(key string, ent *Entry) {
 	}
 	meta.Size = size
 	meta.LastAccess = now
+	meta.StatsSize = statsSize
+	meta.Inactive = ent.Inactive
+	meta.DiscoveredBy = ent.DiscoveredBy
+	meta.LastRefresh = lastRefresh
 	d.index[key] = meta
 	d.totalSize += size
 	total := d.totalSize

internal/wait0/cache/ram.go

Lines changed: 24 additions & 1 deletion
@@ -13,6 +13,7 @@ type ramItem struct {
 	key        string
 	ent        Entry
 	size       int64
+	statsSize  int64
 	lastAccess int64
 	prev       *ramItem
 	next       *ramItem
@@ -91,6 +92,7 @@ func (c *RAM) Put(key string, ent Entry, disk *Disk, overflowLog Logger) {
 		return
 	}
 	sz := int64(len(b))
+	statsSize := EntryLogicalSize(ent)

 	if c.maxBytes > 0 && sz > c.maxBytes {
 		if disk != nil {
@@ -107,6 +109,7 @@ func (c *RAM) Put(key string, ent Entry, disk *Disk, overflowLog Logger) {
 		c.total -= it.size
 		it.ent = ent
 		it.size = sz
+		it.statsSize = statsSize
 		it.lastAccess = now
 		c.total += sz
 		c.moveToFront(it)
@@ -126,7 +129,7 @@ func (c *RAM) Put(key string, ent Entry, disk *Disk, overflowLog Logger) {
 		}
 	}

-	it := &ramItem{key: key, ent: ent, size: sz, lastAccess: now}
+	it := &ramItem{key: key, ent: ent, size: sz, statsSize: statsSize, lastAccess: now}
 	c.items[key] = it
 	c.addToFront(it)
 	c.total += sz
@@ -142,6 +145,26 @@ func (c *RAM) SnapshotAccessTimes() map[string]int64 {
 	return out
 }

+func (c *RAM) MetaSnapshot() map[string]EntryMeta {
+	c.mu.Lock()
+	defer c.mu.Unlock()
+	out := make(map[string]EntryMeta, len(c.items))
+	for k, it := range c.items {
+		lastRefresh := it.ent.RevalidatedAt
+		if lastRefresh <= 0 && it.ent.StoredAt > 0 {
+			lastRefresh = it.ent.StoredAt * int64(time.Second)
+		}
+		out[k] = EntryMeta{
+			Size:                it.statsSize,
+			Inactive:            it.ent.Inactive,
+			DiscoveredBy:        it.ent.DiscoveredBy,
+			LastRefreshUnixNano: lastRefresh,
+			StoredAtUnix:        it.ent.StoredAt,
+		}
+	}
+	return out
+}
+
 func (c *RAM) SetLastAccessForTest(key string, ts int64) bool {
 	c.mu.Lock()
 	defer c.mu.Unlock()
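
With `MetaSnapshot()` available on both tiers, the `cache.*` and `sitemap.*` fields described in docs/api-endpoints.md can be derived from a deduplicated union of the two maps. The aggregation code is not among the files shown here, so the following is a sketch under assumptions: `entryMeta` is a trimmed local mirror of `cache.EntryMeta`, `cacheStats` and `aggregate` are hypothetical names, and letting the RAM entry win for a key present in both tiers is a guess (the docs only state that duplicates count once).

```go
package main

import (
	"fmt"
	"strings"
)

// entryMeta mirrors the cache.EntryMeta fields that the aggregates need.
type entryMeta struct {
	Size         int64
	Inactive     bool
	DiscoveredBy string
}

// cacheStats holds the point-in-time aggregates from the stats payload.
type cacheStats struct {
	URLsTotal         int64
	SizeTotal         int64
	SizeMin           int64
	SizeAvg           int64
	SizeMax           int64
	SitemapDiscovered int64
	SitemapCrawled    int64
	CrawlPercentage   float64
}

// aggregate merges RAM and disk metadata into one view keyed by cache key,
// so a key present in both tiers is counted once (RAM wins: an assumption).
func aggregate(ram, disk map[string]entryMeta) cacheStats {
	merged := make(map[string]entryMeta, len(ram)+len(disk))
	for k, m := range disk {
		merged[k] = m
	}
	for k, m := range ram {
		merged[k] = m
	}

	var s cacheStats
	first := true
	for _, m := range merged {
		s.URLsTotal++
		s.SizeTotal += m.Size
		if first || m.Size < s.SizeMin {
			s.SizeMin = m.Size
		}
		if first || m.Size > s.SizeMax {
			s.SizeMax = m.Size
		}
		first = false
		// The docs describe the discovery-source match as case-insensitive.
		if strings.EqualFold(m.DiscoveredBy, "sitemap") {
			s.SitemapDiscovered++
			if !m.Inactive {
				s.SitemapCrawled++
			}
		}
	}
	if s.URLsTotal > 0 {
		s.SizeAvg = s.SizeTotal / s.URLsTotal // integer division, as documented
	}
	if s.SitemapDiscovered > 0 {
		s.CrawlPercentage = float64(s.SitemapCrawled) * 100 / float64(s.SitemapDiscovered)
	}
	return s
}

func main() {
	ram := map[string]entryMeta{
		"/a": {Size: 128, DiscoveredBy: "sitemap"},
		"/b": {Size: 4096, DiscoveredBy: "user"}, // "user" is a made-up source value
	}
	disk := map[string]entryMeta{
		"/a": {Size: 128, DiscoveredBy: "sitemap"},
		"/c": {Size: 1024, Inactive: true, DiscoveredBy: "sitemap"},
	}
	fmt.Printf("%+v\n", aggregate(ram, disk))
}
```

With no keys at all, every field stays at its zero value, which matches the documented "`0` when no keys" behavior.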

internal/wait0/cache/types.go

Lines changed: 26 additions & 0 deletions
@@ -15,3 +15,29 @@ type Entry struct {
 	RevalidatedAt int64
 	RevalidatedBy string
 }
+
+type EntryMeta struct {
+	// Size is logical response size in bytes (headers + body).
+	Size int64
+
+	Inactive     bool
+	DiscoveredBy string
+
+	// LastRefreshUnixNano is unix nanos timestamp of latest refresh.
+	// May be zero for legacy entries.
+	LastRefreshUnixNano int64
+
+	// StoredAtUnix is unix seconds timestamp.
+	StoredAtUnix int64
+}
+
+func EntryLogicalSize(ent Entry) int64 {
+	total := int64(len(ent.Body))
+	for k, vals := range ent.Header {
+		total += int64(len(k))
+		for _, v := range vals {
+			total += int64(len(v))
+		}
+	}
+	return total
+}
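
To make the "headers + body" definition concrete, here is a small runnable example built around a copy of `EntryLogicalSize`. The `Entry` type below is a trimmed stand-in: only `Header` and `Body` are included, and their types (`http.Header`, `[]byte`) are assumptions inferred from how the function ranges over them.

```go
package main

import (
	"fmt"
	"net/http"
)

// Entry is a reduced stand-in for cache.Entry; field types are assumed.
type Entry struct {
	Header http.Header
	Body   []byte
}

// EntryLogicalSize is copied from the diff: body bytes plus, for each header,
// the name once and every value. Separators such as ": " and CRLF are not counted.
func EntryLogicalSize(ent Entry) int64 {
	total := int64(len(ent.Body))
	for k, vals := range ent.Header {
		total += int64(len(k))
		for _, v := range vals {
			total += int64(len(v))
		}
	}
	return total
}

func main() {
	h := http.Header{}
	h.Set("Content-Type", "text/html") // 12 + 9 bytes
	h.Add("Vary", "Accept")            // 4 + 6 bytes
	h.Add("Vary", "Cookie")            // + 6 bytes (name counted once)
	ent := Entry{Header: h, Body: []byte("<h1>hi</h1>")} // 11 bytes
	fmt.Println(EntryLogicalSize(ent))                   // prints 48
}
```

The result differs from the serialized size tracked in `size`/`Size`, which is why the commit stores it separately as `statsSize`/`StatsSize`.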

internal/wait0/cache_disk.go

Lines changed: 4 additions & 0 deletions
@@ -22,6 +22,10 @@ func (d *diskCache) SnapshotAccessTimes() map[string]int64 {
 	return d.inner.SnapshotAccessTimes()
 }

+func (d *diskCache) MetaSnapshot() map[string]cache.EntryMeta {
+	return d.inner.MetaSnapshot()
+}
+
 func (d *diskCache) TotalSize() int64 {
 	return d.inner.TotalSize()
 }

internal/wait0/cache_ram.go

Lines changed: 4 additions & 0 deletions
@@ -50,6 +50,10 @@ func (c *ramCache) SnapshotAccessTimes() map[string]int64 {
 	return c.inner.SnapshotAccessTimes()
 }

+func (c *ramCache) MetaSnapshot() map[string]cache.EntryMeta {
+	return c.inner.MetaSnapshot()
+}
+
 func (c *ramCache) setLastAccessForTest(key string, ts int64) bool {
 	return c.inner.setLastAccessForTest(key, ts)
 }
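
The commit message also mentions instrumenting revalidation durations and exposing min/avg/max as `refresh_duration_ms`; that code is not among the files shown above. A process-lifetime aggregate of that kind could look roughly like the sketch below. `durationStats`, `Observe`, and `SnapshotMs` are hypothetical names, and dividing the summed duration by the count before converting to milliseconds is one reading of the documented calculation.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// durationStats accumulates min, max, sum, and count of observed durations
// for the lifetime of the process. It never resets.
type durationStats struct {
	mu    sync.Mutex
	count int64
	sum   time.Duration
	min   time.Duration
	max   time.Duration
}

// Observe records one revalidation duration.
func (s *durationStats) Observe(d time.Duration) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.count == 0 || d < s.min {
		s.min = d
	}
	if s.count == 0 || d > s.max {
		s.max = d
	}
	s.count++
	s.sum += d
}

// SnapshotMs returns min, avg, and max in whole milliseconds,
// or zeros when nothing has been observed yet.
func (s *durationStats) SnapshotMs() (lo, avg, hi int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.count == 0 {
		return 0, 0, 0
	}
	mean := s.sum / time.Duration(s.count) // integer division on nanoseconds
	return s.min.Milliseconds(), mean.Milliseconds(), s.max.Milliseconds()
}

func main() {
	var stats durationStats
	for _, d := range []time.Duration{19 * time.Millisecond, 60 * time.Millisecond, 119 * time.Millisecond} {
		stats.Observe(d)
	}
	lo, avg, hi := stats.SnapshotMs()
	fmt.Println(lo, avg, hi) // prints "19 66 119"
}
```

Keeping only four scalars means memory use stays constant however many revalidations run, at the cost of not being able to report percentiles.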
