Skip to content

Commit cf31239

Browse files
etrclaude
andcommitted
Merge TASK-028: routing-semantics regression gate
Pins the v1 routing-test corpus against the new 3-tier table as the release-blocker gate (AR-003, §9 testing item 5). Adds test/unit/routing_regression_test.cpp covering all six v1 taxonomy rows (exact, parameterized single/multi-segment, prefix, regex, method-mismatched) plus pure-miss baseline, cache-tier hit, and cache invalidation on unregister. Documents the corpus in test/REGRESSION.md including the v2 lookup-canonicalisation fix in webserver.cpp (handle_request now strips trailing slash + query/fragment before lookup) and the one explicit divergence (custom regexes — pinned with FIXME(TASK-036-prereq)). Wired into make check via test/Makefile.am. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 parents 370093d + ab89977 commit cf31239

7 files changed

Lines changed: 945 additions & 12 deletions

File tree

specs/tasks/M5-routing-lifecycle/TASK-028.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -8,10 +8,10 @@
88
Run v1's full routing-test corpus against the new 3-tier table; treat any regression as a release-blocker.
99

1010
**Action Items:**
11-
- [ ] Inventory v1's existing routing tests (likely under `test/`); list every distinct routing pattern they cover (exact, parameterized with one segment, parameterized with multiple, prefix, regex, method-mismatched).
12-
- [ ] If any test was tightly coupled to v1's three-map internals, port it to the new public API; otherwise expect it to pass unchanged.
13-
- [ ] Run the full corpus against the new implementation and triage any failures: spec deviation (file ticket / fix architecture) vs. implementation bug (fix it).
14-
- [ ] Document the corpus as the v2.0 routing regression gate in `test/README` (or equivalent).
11+
- [x] Inventory v1's existing routing tests (likely under `test/`); list every distinct routing pattern they cover (exact, parameterized with one segment, parameterized with multiple, prefix, regex, method-mismatched).
12+
- [x] If any test was tightly coupled to v1's three-map internals, port it to the new public API; otherwise expect it to pass unchanged.
13+
- [x] Run the full corpus against the new implementation and triage any failures: spec deviation (file ticket / fix architecture) vs. implementation bug (fix it).
14+
- [x] Document the corpus as the v2.0 routing regression gate in `test/README` (or equivalent).
1515

1616
**Dependencies:**
1717
- Blocked by: TASK-027
@@ -27,4 +27,4 @@ Run v1's full routing-test corpus against the new 3-tier table; treat any regres
2727
**Related Requirements:** PRD-HDL-REQ-002, PRD-HDL-REQ-004
2828
**Related Decisions:** AR-003 (release-blocker risk), §9 testing item 5
2929

30-
**Status:** Not Started
30+
**Status:** Done

specs/tasks/_index.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -110,7 +110,7 @@ Nominally: **13 sequential tasks**, each S–XL. Most other tasks parallelize of
110110
| TASK-025 | Lambda handler entry points `on_*` | M4 | Done | TASK-005, TASK-009, TASK-014 |
111111
| TASK-026 | Generic `webserver::route(method, path, handler)` | M4 | Done | TASK-005, TASK-025 |
112112
| TASK-027 | 3-tier route table with LRU cache | M5 | Done | TASK-005, TASK-014, TASK-021, TASK-024, TASK-025, TASK-026 |
113-
| TASK-028 | Routing-semantics regression gate | M5 | Not Started | TASK-027 |
113+
| TASK-028 | Routing-semantics regression gate | M5 | Done | TASK-027 |
114114
| TASK-029 | Naming consistency — `stop_and_wait`, `block_ip`/`unblock_ip` | M5 | Not Started | TASK-014 |
115115
| TASK-030 | `_handler` suffix renames + `explicit` constructor | M5 | Not Started | TASK-014 |
116116
| TASK-031 | Handler error-propagation contract (DR-009) | M5 | Not Started | TASK-027, TASK-030 |

specs/unworked_review_issues/2026-05-11_000726_task-028.md

Lines changed: 131 additions & 0 deletions
Large diffs are not rendered by default.

src/webserver.cpp

Lines changed: 30 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1657,12 +1657,37 @@ void webserver_impl::invalidate_route_cache() {
16571657
// The method-set check (does the entry serve `method`?) lives at the
16581658
// dispatch site, NOT here, because the existing 405 + Allow: header
16591659
// path needs to see the entry even when no method bit matches.
1660+
// Canonicalize a lookup path the same way http_endpoint canonicalizes a
1661+
// registration path: strip a trailing '/' (unless the path IS just "/"),
1662+
// prepend '/' if missing. Registration stores keys under url_complete,
1663+
// which is produced by this same normalization (see http_endpoint.cpp
1664+
// ll. 60-67) — so lookup must canonicalize too or "/foo" and "/foo/"
1665+
// would never share an entry. Matches the v1 dispatch path, which
1666+
// constructs a non-registration http_endpoint at lookup time and so
1667+
// gets the same normalization for free.
1668+
static std::string canonicalize_lookup_path(const std::string& path) {
1669+
if (path.empty()) {
1670+
return "/";
1671+
}
1672+
std::string out = path;
1673+
if (out.front() != '/') {
1674+
out.insert(out.begin(), '/');
1675+
}
1676+
if (out.size() > 1 && out.back() == '/') {
1677+
out.pop_back();
1678+
}
1679+
return out;
1680+
}
1681+
16601682
webserver_impl::lookup_result
16611683
webserver_impl::lookup_v2(http_method method, const std::string& path) {
16621684
lookup_result result;
16631685

1664-
// Step 1: cache.
1665-
cache_key key{method, path};
1686+
const std::string lookup_path = canonicalize_lookup_path(path);
1687+
1688+
// Step 1: cache. Cache under the canonical key so /foo and /foo/
1689+
// share an entry.
1690+
cache_key key{method, lookup_path};
16661691
cache_value cached;
16671692
if (route_cache_v2.find(key, cached)) {
16681693
result.found = true;
@@ -1677,7 +1702,7 @@ webserver_impl::lookup_v2(http_method method, const std::string& path) {
16771702
std::shared_lock table_lock(route_table_mutex_);
16781703

16791704
// 2a. Exact tier — single hash probe.
1680-
auto exact_it = exact_routes_.find(path);
1705+
auto exact_it = exact_routes_.find(lookup_path);
16811706
if (exact_it != exact_routes_.end()) {
16821707
result.found = true;
16831708
result.tier = tier_hit::exact;
@@ -1688,7 +1713,7 @@ webserver_impl::lookup_v2(http_method method, const std::string& path) {
16881713
// 2b. Radix tier — segment-trie walk.
16891714
if (!result.found) {
16901715
radix_match<route_entry> rm;
1691-
if (param_and_prefix_routes_.find(path, rm) && rm.entry) {
1716+
if (param_and_prefix_routes_.find(lookup_path, rm) && rm.entry) {
16921717
result.found = true;
16931718
result.tier = tier_hit::radix;
16941719
result.entry = *rm.entry;
@@ -1701,7 +1726,7 @@ webserver_impl::lookup_v2(http_method method, const std::string& path) {
17011726
// and on_methods_), so no compilation cost is paid per lookup.
17021727
if (!result.found) {
17031728
for (const auto& rr : regex_routes_) {
1704-
if (std::regex_match(path, rr.compiled_re)) {
1729+
if (std::regex_match(lookup_path, rr.compiled_re)) {
17051730
result.found = true;
17061731
result.tier = tier_hit::regex;
17071732
result.entry = rr.entry;

test/Makefile.am

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,7 @@ LDADD += -lcurl
2626

2727
AM_CPPFLAGS = -I$(top_srcdir)/src -I$(top_srcdir)/src/httpserver/ -DHTTPSERVER_COMPILATION
2828
METASOURCES = AUTO
29-
check_PROGRAMS = basic file_upload http_utils threaded nodelay string_utilities http_endpoint ban_system ws_start_stop authentication deferred http_resource http_response create_webserver new_response_types daemon_info uri_log feature_unavailable header_hygiene_iovec header_hygiene iovec_entry http_method constants body http_response_sbo http_response_factories webserver_pimpl http_request_pimpl create_test_request http_request_arena http_request_const_getters http_request_tls_accessors webserver_register_smartptr webserver_register_path_prefix webserver_on_methods webserver_route route_table lookup_pipeline route_table_concurrency
29+
check_PROGRAMS = basic file_upload http_utils threaded nodelay string_utilities http_endpoint ban_system ws_start_stop authentication deferred http_resource http_response create_webserver new_response_types daemon_info uri_log feature_unavailable header_hygiene_iovec header_hygiene iovec_entry http_method constants body http_response_sbo http_response_factories webserver_pimpl http_request_pimpl create_test_request http_request_arena http_request_const_getters http_request_tls_accessors webserver_register_smartptr webserver_register_path_prefix webserver_on_methods webserver_route route_table lookup_pipeline route_table_concurrency routing_regression
3030

3131
MOSTLYCLEANFILES = *.gcda *.gcno *.gcov
3232

@@ -223,6 +223,18 @@ lookup_pipeline_LDADD = $(LDADD) -lmicrohttpd
223223
route_table_concurrency_SOURCES = integ/route_table_concurrency.cpp
224224
route_table_concurrency_LDADD = $(LDADD) -lmicrohttpd
225225

226+
# routing_regression: TASK-028. v1 routing-test corpus regression gate.
227+
# Drives the public webserver registration surface (register_path /
228+
# register_prefix / on_get / on_post / unregister_path / unregister_prefix)
229+
# and probes webserver_impl::lookup_v2 to assert that every routing
230+
# pattern in the v1 corpus resolves to the same handler / tier / captures
231+
# the v2 3-tier table is expected to produce after TASK-036 cuts dispatch
232+
# over. Documented as the v2.0 routing regression gate in
233+
# test/REGRESSION.md. Needs -lmicrohttpd because webserver's PIMPL impl
234+
# carries microhttpd-typed members.
235+
routing_regression_SOURCES = unit/routing_regression_test.cpp
236+
routing_regression_LDADD = $(LDADD) -lmicrohttpd
237+
226238
noinst_HEADERS = littletest.hpp
227239
AM_CXXFLAGS += -Wall -fPIC -Wno-overloaded-virtual
228240

test/REGRESSION.md

Lines changed: 187 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,187 @@
1+
# v2.0 Routing Regression Gate
2+
3+
Per architecture §9 (testing) item 5 and TASK-028 (AR-003 release-blocker
4+
risk): the v1 routing-test corpus must pass against the v2.0
5+
implementation without regression. Any divergence from v1 routing
6+
semantics is either documented here with rationale **or fixed**.
7+
8+
The gate is run automatically as part of `make check`. The bespoke v2
9+
parity TU lives at [`test/unit/routing_regression_test.cpp`](unit/routing_regression_test.cpp)
10+
and is wired in via [`test/Makefile.am`](Makefile.am) as
11+
`check_PROGRAMS += routing_regression`. Manual selective run:
12+
13+
```sh
14+
cd build/test && ./routing_regression
15+
```
16+
17+
## Why two surfaces
18+
19+
Today, dispatch in `webserver_impl::finalize_answer` still walks the v1
20+
registration maps (`registered_resources_str`, `registered_resources`,
21+
`registered_resources_regex`). The v2 3-tier table (TASK-027) is
22+
**shadow-populated** by every `register_path` / `register_prefix` /
23+
`on_methods_` / `unregister_*` call but does not yet drive requests —
24+
that cutover is TASK-036.
25+
26+
So this gate protects two distinct surfaces:
27+
28+
1. **End-to-end dispatch via v1 maps.** The full v1 routing corpus
29+
continues to round-trip through curl in `test/integ/basic.cpp` plus
30+
the new TASK-024/025/026 unit suites. These pass today and must
31+
continue to pass.
32+
2. **v2 3-tier table semantics, ahead of TASK-036's dispatch cutover.**
33+
`routing_regression_test.cpp` is the only thing pinning v2-lookup
34+
correctness before dispatch flips. A failure here is a release
35+
blocker even though no end-user request currently routes through
36+
the v2 table.
37+
38+
## Pattern taxonomy
39+
40+
Every routing pattern in the v1 corpus, mapped to the v2 file that hosts
41+
it (or its v2-API port). When you add a new routing pattern, add a row
42+
here AND a test in `routing_regression_test.cpp`.
43+
44+
| Pattern | v1 test that pins it | v2 hosting file | Status |
45+
|---|---|---|---|
46+
| Exact path | `basic_suite::two_endpoints`, `duplicate_endpoints`, `request_with_*` | `test/integ/basic.cpp` | API-ported |
47+
| Exact root `/` | `http_endpoint_suite::http_endpoint_root_only` | `test/unit/http_endpoint_test.cpp` | unchanged |
48+
| Single-segment param `/{arg}` | `basic_suite::regex_matching_arg`, `regex_matching_arg_with_url_pars`, `http_endpoint_*` unit tests | `test/integ/basic.cpp`, `test/unit/http_endpoint_test.cpp` | API-ported |
49+
| Multi-segment param `/{a}/{b}` | `http_endpoint_suite::http_endpoint_multiple_params` | `test/unit/http_endpoint_test.cpp` | unchanged |
50+
| Custom-regex param `/{arg|([0-9]+)}` | `basic_suite::regex_matching_arg_custom`, `http_endpoint_registration_arg_custom_regex` | `test/integ/basic.cpp`, `test/unit/http_endpoint_test.cpp` | API-ported (constraint enforced via v1 map); v2 radix tier does NOT enforce per-segment constraint — see "Documented divergences" below |
51+
| Prefix (family) | `basic_suite::family_endpoints`, `non_family_url_with_regex_like_pieces`, `single_resource_mode` | `test/integ/basic.cpp` | API-ported (`register_resource(..., true)``register_prefix`) |
52+
| Regex (anchored, no `{}`) | `basic_suite::regex_matching`, `regex_url_exact_match`, `url_with_regex_like_pieces` | `test/integ/basic.cpp` | API-ported |
53+
| Regex-checking disabled | `basic_suite::*` with `no_regex_checking()` | `test/integ/basic.cpp` | unchanged |
54+
| Method-mismatched (405 + Allow) | `basic_suite::method_not_allowed_header`, `head_request`, `options_request`, `trace_request`, `only_render_*`, `custom_method_not_allowed_handler` | `test/integ/basic.cpp` | API-ported |
55+
| Lambda-only registration (`on_get` / `on_post` / ...) | (new in v2) | `test/unit/webserver_on_methods_test.cpp` (TASK-025) | new |
56+
| Generic `route(http_method, ...)` / `route(method_set, ...)` | (new in v2) | `test/unit/webserver_route_test.cpp` (TASK-026) | new |
57+
| Register / unregister cycles | `basic_suite::register_unregister`, `unregister_then_404`, `unregister_path` (was `unregister_resource`) | `test/integ/basic.cpp` | API-ported (one rename) |
58+
| Overlapping (regex vs regex; exact vs prefix; exact vs regex) | `basic_suite::overlapping_endpoints` | `test/integ/basic.cpp` | API-ported; v2 changes the precedence story — see "Documented divergences" |
59+
60+
The v2 parity TU itself (`routing_regression_test.cpp`) carries one
61+
`LT_BEGIN_AUTO_TEST` per pattern row. Each test drives the public
62+
registration surface and probes `webserver_impl::lookup_v2()` (via the
63+
`webserver_test_access` friend hook gated on `HTTPSERVER_COMPILATION`)
64+
to pin tier classification, method_set composition, captures, and
65+
prefix flagging.
66+
67+
## Documented divergences
68+
69+
These are v2 semantics that deliberately differ from v1, with the
70+
rationale. The corresponding assertions in `routing_regression_test.cpp`
71+
are pinned to v2 behavior; if v2 ever changes, the test edit makes the
72+
new contract explicit rather than silent.
73+
74+
### 1. `*_nonexistent_method` tests removed (TASK-021)
75+
76+
v1 had two tests in `http_resource_test.cpp`
77+
`set_allowing_nonexistent_method` and `is_allowed_nonexistent_method`
78+
that exercised allowing/disallowing methods by string name including
79+
strings outside the `http_method` enum. v2 replaced the boolean
80+
per-method map with the `method_set` bitmask (DR-021). The bitmask
81+
makes "nonexistent method by name" structurally unreachable, so the
82+
tests were dropped. Two more were renamed:
83+
84+
- `resource_init_sets_all_methods``default_state_all_methods_set`
85+
- `get_allowed_methods_only_returns_true``get_allowed_methods_only_returns_set`
86+
87+
One new test was added: `set_allowing_count_sentinel_has_no_effect`.
88+
89+
No port required.
90+
91+
### 2. `unregister_resource``unregister_path` rename (TASK-024)
92+
93+
v2 splits resource registration into the public `register_path` /
94+
`register_prefix` pair (TASK-024). The symmetric removal API renames
95+
to match: `unregister_resource(path)``unregister_path(path)`, with
96+
a new `unregister_prefix(path)` for the prefix half. The old name is
97+
preserved as a `[[deprecated]]` forwarder for source compatibility.
98+
The v1 test was renamed in place; behavior unchanged.
99+
100+
### 3. Custom-regex parameter constraints NOT enforced by the radix tier
101+
102+
In v1, `/items/{id|([0-9]+)}` is enforced by the
103+
`registered_resources_regex` map: the per-segment `[0-9]+` constraint
104+
participates in `std::regex_match`, so `/items/abc` does not match the
105+
route.
106+
107+
In v2, the radix-tree tier treats `{id|([0-9]+)}` as a single wildcard
108+
segment with name `id|([0-9]+)` and does NOT consult the constraint
109+
during the wildcard descent. So `/items/abc` and `/items/42` both
110+
resolve to the same radix entry today.
111+
112+
**Why this is acceptable for the gate**: end-to-end dispatch is still
113+
served by v1 maps (TASK-036 has not landed), so user-visible 404
114+
behavior is unchanged. The v2 divergence is only visible to TASK-028's
115+
parity probes against `lookup_v2`. The pinned test
116+
(`parameterized_with_custom_regex_lands_in_radix_tier`) asserts the
117+
current v2 behavior so silent drift is impossible.
118+
119+
**Follow-up**: adding per-segment regex constraints to the radix tier
120+
is a discrete, scoped piece of work (PRD §3.7 retrieval semantics). It
121+
must land **before** TASK-036 cuts dispatch over to `lookup_v2`, or
122+
the cutover will introduce a user-visible regression. Tracked as
123+
follow-up scope on TASK-036.
124+
125+
### 4. Overlapping-routes precedence: v1 iteration-order accident → v2 deterministic structural precedence
126+
127+
v1's `basic_suite::overlapping_endpoints` documents:
128+
129+
```
130+
LT_CHECK_EQ(s, "2"); // Not sure why regex wins, but it does...
131+
```
132+
133+
— i.e., when two parameterized routes both match `/foo/bar/`, v1 picks
134+
the second-registered one due to `std::map` iteration order. This is
135+
explicitly called out in the v1 test comment as an unintended
136+
accident, not a designed behavior.
137+
138+
v2 gives a deterministic structural precedence: the tier order is
139+
exact → radix → regex, and within the radix tier the trie walk visits
140+
exact children before wildcard children. Two wildcard-rooted routes
141+
that both match are resolved by first-registration insertion order
142+
into the wildcard chain. The pinned test
143+
(`overlapping_two_regex_routes_deterministic_first_wins`) asserts that
144+
the resolved handler is one of the two registered `shared_ptr`s
145+
without depending on which one — both because the v2 precedence is the
146+
new canonical behavior and because asserting "v1's accidental winner"
147+
would re-encode an explicit v1 bug into v2.
148+
149+
The companion test (`later_exact_registration_shadows_earlier_regex`)
150+
pins the cleaner half of `basic_suite::overlapping_endpoints`: when an
151+
exact route is later registered on the same path, it shadows the
152+
earlier parameterized route. The v2 lookup pipeline gives this for
153+
free because exact tier is walked first.
154+
155+
### 5. Trailing-slash and leading-slash canonicalization in `lookup_v2`
156+
157+
v1 canonicalizes both the registered path and the incoming request
158+
path by constructing `http_endpoint(path, family, registration=false)`
159+
at dispatch time — strip trailing `/`, prepend `/` if absent, then
160+
match. So `/ok`, `/ok/`, `ok`, and `ok/` are all equivalent at
161+
dispatch.
162+
163+
The v2 `lookup_v2` initially took the raw path string and probed
164+
`exact_routes_` / radix / regex directly, which made `/ok` (stored
165+
canonical) miss against an `/ok/` request. TASK-028 fixed this in
166+
`webserver.cpp` by adding `canonicalize_lookup_path()` and applying
167+
it at the head of `lookup_v2`, including cache keying. This brings v2
168+
in line with v1 semantics; the test
169+
`exact_path_normalization_aliases` pins it.
170+
171+
## How to extend
172+
173+
When adding a new routing pattern to the public API:
174+
175+
1. Add an `LT_BEGIN_AUTO_TEST` to `test/unit/routing_regression_test.cpp`
176+
driving the public registration surface and probing `lookup_v2()`.
177+
2. Add a row to the **Pattern taxonomy** table above.
178+
3. If the new pattern intentionally diverges from v1 semantics, add a
179+
subsection under **Documented divergences** with the rationale.
180+
181+
## Cross-references
182+
183+
- [`specs/architecture/09-testing.md`](../specs/architecture/09-testing.md) item 5 — testing strategy.
184+
- [`specs/architecture/12-open-questions.md`](../specs/architecture/12-open-questions.md) AR-003 — release-blocker risk.
185+
- [`specs/tasks/M5-routing-lifecycle/TASK-028.md`](../specs/tasks/M5-routing-lifecycle/TASK-028.md) — this gate's source task.
186+
- [`specs/tasks/M5-routing-lifecycle/TASK-027.md`](../specs/tasks/M5-routing-lifecycle/TASK-027.md) — the 3-tier table this gate validates.
187+
- [`specs/tasks/M5-routing-lifecycle/TASK-036.md`](../specs/tasks/M5-routing-lifecycle/TASK-036.md) — downstream consumer (dispatch cutover).

0 commit comments

Comments
 (0)