Commit 7de451f
uri: Do not copy and normalize already-normalized URIs for uri_parser_rfc3986 (#21726)
When Uri\Rfc3986\Uri::parse() produces a URI already in canonical form
(the common case: http/https URLs with no uppercase host, no
percent-encoding in unreserved ranges, no ".." path segments),
get_normalized_uri() no longer deep-copies the parsed struct and runs
a full normalization pass. It calls uriNormalizeSyntaxMaskRequiredExA
once to compute the dirty mask; a zero mask means we alias the raw
uri. The struct caches the dirty mask, so multiple non-raw reads on
the same instance only run the scan once.
Fallback: when the mask is nonzero, we copy and normalize as before,
but only the flagged components are normalized: the call is now
uriNormalizeSyntaxExMmA(..., dirty_mask, ...) instead of
uriNormalizeSyntaxExMmA(..., -1, ...).
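The control flow described above can be sketched with a toy model. This is not the php-src code and it does not call uriparser: the types, flags, and functions (toy_uri, toy_holder, TOY_DIRTY_*, toy_get_normalized) are all hypothetical, and "dirty" is simplified to "contains an uppercase letter". It only illustrates the three ideas: scan once and cache the mask, alias the raw URI when the mask is zero, and otherwise copy and normalize only the flagged components.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical component flags for this toy model only. */
enum { TOY_DIRTY_SCHEME = 1u << 0, TOY_DIRTY_HOST = 1u << 1 };

typedef struct {
    const char *scheme;
    const char *host;
} toy_uri;

typedef struct {
    toy_uri raw;            /* as parsed; never modified */
    toy_uri normalized;     /* points at the owned copies below */
    char *owned_scheme;
    char *owned_host;
    unsigned int dirty_mask;
    bool mask_computed;     /* the scan result is cached after the first read */
    bool normalized_ready;
    int scans;              /* instrumentation: number of scans performed */
    int copies;             /* instrumentation: number of copy+normalize passes */
} toy_holder;

static bool has_upper(const char *s) {
    for (; *s != '\0'; s++) {
        if (isupper((unsigned char)*s)) return true;
    }
    return false;
}

/* Toy stand-in for the "which components need normalizing?" scan. */
static unsigned int toy_scan(const toy_uri *u) {
    unsigned int mask = 0;
    if (has_upper(u->scheme)) mask |= TOY_DIRTY_SCHEME;
    if (has_upper(u->host)) mask |= TOY_DIRTY_HOST;
    return mask;
}

/* Copy a string, lowercasing it only when the component was flagged. */
static char *toy_copy(const char *s, bool lower) {
    size_t len = strlen(s);
    char *out = malloc(len + 1);
    if (out == NULL) return NULL;
    for (size_t i = 0; i <= len; i++) {
        out[i] = lower ? (char)tolower((unsigned char)s[i]) : s[i];
    }
    return out;
}

/* Returns the raw URI itself when nothing is dirty; NULL on allocation failure. */
const toy_uri *toy_get_normalized(toy_holder *h) {
    assert(h != NULL);
    if (!h->mask_computed) {
        h->dirty_mask = toy_scan(&h->raw);
        h->mask_computed = true;
        h->scans++;
    }
    if (h->dirty_mask == 0) {
        return &h->raw;     /* fast path: alias, no copy */
    }
    if (!h->normalized_ready) {
        char *scheme = toy_copy(h->raw.scheme, (h->dirty_mask & TOY_DIRTY_SCHEME) != 0);
        char *host = toy_copy(h->raw.host, (h->dirty_mask & TOY_DIRTY_HOST) != 0);
        if (scheme == NULL || host == NULL) {
            free(scheme);
            free(host);
            return NULL;
        }
        h->owned_scheme = scheme;
        h->owned_host = host;
        h->normalized.scheme = scheme;
        h->normalized.host = host;
        h->normalized_ready = true;
        h->copies++;
    }
    return &h->normalized;
}

void toy_release(toy_holder *h) {
    free(h->owned_scheme);
    free(h->owned_host);
    h->owned_scheme = NULL;
    h->owned_host = NULL;
    h->normalized_ready = false;
}
```

In this sketch a holder whose raw URI is { "http", "example.com" } gets &holder.raw back from every call and never allocates, while { "http", "EXAMPLE.com" } takes the copy path once with only the host flagged.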
Measurements on a 17-URL mix with a realistic parse-and-read workload
(10 runs of 1.7M parses each, CPU pinned via taskset, same-session
stash-pop A/B so both builds share machine state):
                 baseline mean        optimized mean       delta
parse only       0.3992s (4.26M/s)    0.4083s (4.16M/s)    noise
parse + 1 read   0.6687s (2.54M/s)    0.5464s (3.11M/s)    -18.3%
parse + 7 reads  0.8510s (2.00M/s)    0.7305s (2.33M/s)    -14.2%
The "parse + 1 read" row isolates the first-read cost where this
change lands. The "parse + 7 reads" row shows the amortized effect
under a realistic user pattern: the first getter pays the reduced
normalization cost, and the remaining six getters hit the cached
normalized uri and cost the same as before.
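For readers checking the numbers: the delta column is consistent with the relative change in mean run time, and the figures in parentheses with 1.7M parses divided by the mean time. A small sketch of that arithmetic follows; the helper names are made up for illustration and are not part of the benchmark script.

```c
#include <assert.h>

/* Relative change in run time, in percent; negative means faster. */
double relative_delta_percent(double baseline_s, double optimized_s) {
    assert(baseline_s > 0.0);
    return (optimized_s - baseline_s) / baseline_s * 100.0;
}

/* Throughput in millions of parses per second. */
double throughput_millions_per_s(double parses_millions, double seconds) {
    assert(seconds > 0.0);
    return parses_millions / seconds;
}
```

With the means quoted above, relative_delta_percent(0.6687, 0.5464) comes out at roughly -18.3 and relative_delta_percent(0.8510, 0.7305) at roughly -14.2.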
hyperfine cross-check on the whole benchmark script, 15 runs each:
  baseline   20.471 s +/- 1.052 s  [19.535 .. 22.985]
  optimized  17.240 s +/- 0.540 s  [16.556 .. 18.190]
optimized runs 1.19 +/- 0.07 times faster.
All 309 tests in ext/uri/tests pass. I checked that URIs needing
normalization (http://EXAMPLE.com/A/%2e%2e/c resolving to /c) still
hit the full normalize path through the nonzero dirty mask.
Co-authored-by: Tim Düsterhus <tim@bastelstu.be>
1 parent 8172b7e commit 7de451f
2 files changed
Lines changed: 17 additions & 4 deletions