Improve line break segmentation conformance and compatibility with ICU.
Enhancements
Replaces the regex-based segmentation engine with a single-pass DFA evaluator. Sentence breaking on a 4 KB unbroken sentence drops from ~9,200 ms to ~11 ms (~840× faster); word breaking on a 4 KB sentence drops from ~7,000 ms to ~12 ms (~580× faster). Running time now scales linearly with input length, O(N), instead of quadratically, O(N²).
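
To illustrate why a single-pass DFA evaluator scales linearly, here is a minimal sketch in Python. It is not this library's implementation: the character classes, the three states, and the names `classify`, `TRANSITIONS`, and `sentence_breaks` are hypothetical, and the rule set is a toy (break before the first non-space character that follows a terminator plus whitespace), not the Unicode segmentation rules. What it shows is that each character causes exactly one table lookup and the scan never backtracks.

```python
from typing import List

# Hypothetical character classes for a toy rule set (NOT the Unicode rules).
TERM, SPACE, OTHER = 0, 1, 2


def classify(ch: str) -> int:
    """Map a character to one of the toy character classes."""
    if ch in ".!?":
        return TERM
    if ch.isspace():
        return SPACE
    return OTHER


# States: 0 = inside a sentence
#         1 = just saw one or more terminators
#         2 = saw terminators followed by whitespace
# TRANSITIONS[state][char_class] -> (next_state, break_before_this_char)
TRANSITIONS = [
    # TERM        SPACE        OTHER
    [(1, False), (0, False), (0, False)],  # state 0
    [(1, False), (2, False), (0, False)],  # state 1
    [(1, True),  (2, False), (0, True)],   # state 2
]


def sentence_breaks(text: str) -> List[int]:
    """Return the offsets at which a new sentence starts (excluding offset 0).

    One table lookup per character and no backtracking, so the running
    time is O(N) in the length of the input.
    """
    breaks = []
    state = 0
    for i, ch in enumerate(text):
        state, emit = TRANSITIONS[state][classify(ch)]
        if emit:
            breaks.append(i)
    return breaks


print(sentence_breaks("Hi. Bye! Ok"))  # [4, 9]
```

A regex engine that retries its rules at each candidate position can rescan the same characters many times on a long unbroken input. The table-driven loop above touches each character once, which is the property the timings above reflect.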