Regex nfa dfa benchmarks by sayrer · Pull Request #527 · timbray/quamina

sayrer · 2026-05-18T18:14:50Z

These are tests that subsume #492.

Previously patterns and events drew emojis from the same pool but independently, so events with random emoji pairs rarely matched any of the random pattern pairs (especially at patterns=32/64), tripping the per-iteration b.Fatalf assertion.

Track the (e1, e2) pairs used to build patterns; sample events from that same set so every event matches at least one pattern. The benchmark still measures NFA traversal cost on dense multi-byte UTF-8 input; the only thing that changed is correctness of the match-presence sanity check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds stable baseline benchmarks for representative match-time workloads:

- ExactString: 1 exact pattern
- SingleShellstyle: 1 wildcard pattern
- ManyOverlappingWildcards: N=8..128 overlapping wildcards
- RegexAlternation: 20 regex patterns with alternation
- LiteralInRegex: literal substring inside regex
- QuantifiedCharClass: regex with {n,m} quantifier
- ManyAnchoredRegex: 200 anchored regex patterns
- DeepEpsilonNest: regex with nested alternation/quantifiers
- CacheThrashing: adversarial input over wide state space
- ParallelMatchers: 8..64 goroutines via Copy()

Each warms the matcher with ~100 iterations before resetting the timer so first-call laziness does not pollute steady-state measurements.

These are intended as stable workload baselines: subsequent matcher optimization work can be evaluated by re-running these benchmarks unchanged and comparing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
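The warm-up-then-reset shape described above might look roughly like the following. This is a sketch only: `lazyMatcher` is a hypothetical stand-in that builds its state on first use, not Quamina's matcher, and `benchSteadyState` is an illustrative name. `testing.Benchmark` is used so the example runs as an ordinary program.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// lazyMatcher is a hypothetical stand-in for a matcher that builds
// internal state lazily on its first call.
type lazyMatcher struct {
	once  sync.Once
	table map[string]bool
}

// match reports whether event is a known key, building the table on first use.
func (m *lazyMatcher) match(event string) bool {
	m.once.Do(func() {
		m.table = make(map[string]bool, 1000)
		for i := 0; i < 1000; i++ {
			m.table[fmt.Sprintf("key-%d", i)] = true
		}
	})
	return m.table[event]
}

// benchSteadyState warms the matcher, then resets the timer so the
// one-time setup cost is excluded from the measurement.
func benchSteadyState(b *testing.B) {
	m := &lazyMatcher{}
	const warmup = 100 // mirrors the "~100 iterations" in the description
	for i := 0; i < warmup; i++ {
		if !m.match("key-7") {
			b.Fatal("expected a match during warm-up")
		}
	}
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if !m.match("key-7") {
			b.Fatal("expected a match")
		}
	}
}

func main() {
	res := testing.Benchmark(benchSteadyState)
	fmt.Println(res.String())
}
```

Keeping the sanity check inside the timed loop costs a branch per iteration, but it guards against a benchmark that silently measures the no-match path.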

timbray · 2026-05-29T23:57:52Z

+	for _, sp := range simplePatterns {
+		b.Run(sp.name, func(b *testing.B) {
+			q, _ := New()
+			pattern := fmt.Sprintf(`{"val": [{"shellstyle": %q}]}`, sp.shellstyle)


Not a blocker but let's use wildcard rather than shellstyle in the future.

Sorry to be an epic nitpicker, but there is a problem here. "shellstyle" is the description of the overall pattern, while "wildcard" is the specific production. I don't care how this distinction is resolved, but it's there.

timbray · 2026-05-30T03:38:36Z

On May 29, 2026 at 6:26:02 PM, RS ***@***.***> wrote: Sorry to be an epic nitpicker, but there is a problem here. "shellstyle" is the description of the overall pattern, while "wildcard" is the specific production. I don't care how this distinction is resolved, but it's there.

I yield to no one in the pickiness of my nits. What happened was, I implemented shellstyle but stupidly forgot to put in escaping for *, but then couldn't change the API, so wildcard is just shellstyle with escaping for * and then, of course, \. -T
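To make the distinction concrete, here is a hedged sketch of preparing a literal value for a wildcard pattern. It assumes backslash is the escape character, so that a literal * becomes `\*` and a literal backslash becomes `\\`, and it assumes the pattern has the same JSON shape as the shellstyle pattern in the diff above with the key swapped to "wildcard". The helpers `escapeWildcard` and `wildcardPattern` are hypothetical and not part of Quamina; check the project's pattern documentation before relying on the exact escape rules.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// escapeWildcard escapes backslash and * so the text is treated literally
// inside a wildcard value (assumed escape rules, see the note above).
func escapeWildcard(lit string) string {
	return strings.NewReplacer(`\`, `\\`, `*`, `\*`).Replace(lit)
}

// wildcardPattern builds {"<field>": [{"wildcard": "<value>"}]} as JSON,
// letting encoding/json handle the JSON-level backslash escaping.
func wildcardPattern(field, value string) (string, error) {
	p := map[string][]map[string]string{field: {{"wildcard": value}}}
	b, err := json.Marshal(p)
	return string(b), err
}

func main() {
	// Values that start with the literal text a*b\c, followed by anything.
	p, err := wildcardPattern("val", escapeWildcard(`a*b\c`)+"*")
	if err != nil {
		panic(err)
	}
	fmt.Println(p)
}
```

Note that there are two layers of escaping here: the wildcard-level escapes added by `escapeWildcard`, and the JSON string escaping that `json.Marshal` applies on top.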


sayrer · 2026-05-30T05:08:58Z

I am surprised by your strong opinion here, although I do not disagree with it. I'll try to knock out something to cover this issue tomorrow morning. Happy to do it your way, it's just confusing.

sayrer and others added 5 commits May 18, 2026 11:11

Add a benchmark targeting NFA DFA tradeoffs.

9a468fd

Fix comment.

8dba44d

Fix comment.

de515ee

sayrer mentioned this pull request May 18, 2026

NFA => DFA optimization #481

Open

Merge branch 'main' into regex-nfa-dfa-benchmarks

acc18e4

timbray approved these changes May 30, 2026

timbray merged commit 567d4d5 into timbray:main May 30, 2026
7 checks passed

timbray merged 6 commits into timbray:main from sayrer:regex-nfa-dfa-benchmarks
