fix(scanner): avoid GCC -O2 strict-aliasing miscompile in scan_start_tag#2
Open
swaits wants to merge 1 commit into
Open
fix(scanner): avoid GCC -O2 strict-aliasing miscompile in scan_start_tag#2swaits wants to merge 1 commit into
swaits wants to merge 1 commit into
Conversation
Parsing any document containing an HTML tag (even `<div></div>`) segfaults when the scanner is compiled with GCC at -O2 — the default for `tree-sitter build` and for the parsers editors such as Neovim ship. Debug (-O0) builds are unaffected, so test suites miss it. scan_start_tag() appends to the tag stack with `array_push(&state->html->tags, tag)`. array_push() (tree-sitter array.h) reallocates the backing buffer through the generic `Array *` type, then writes the new element back through the concrete `Array(Tag) *` type. Because the array is reached through the `state->html` pointer, GCC's -O2 strict-aliasing analysis treats the realloc's `contents` store and the element store as non-aliasing, elides the reload of `contents`, and writes the element through a stale NULL pointer. Do the grow + append explicitly, keeping every access on the concrete `Array(Tag)` type so there is no `Array *` type-pun for the optimizer to break. Applied to the canonical tree-sitter-htmlx scanner and its vendored copy in tree-sitter-svelte. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Context: This segfaults for me when loading in Nvim v0.12.2. I discovered it while trying to incorporate this into Arborist.nvim, for arborist-ts/arborist.nvim#19. I tossed it at Claude to debug the crash and, despite my protests, this is what it comes up with. Seems far fetched to me, but at the same time, the fix does resolve the segfault for me. /shrug
Symptom
Parsing any document that contains an HTML tag — even
<div></div>— segfaults when the scanner is compiled with GCC at-O2.-O2is the optimization leveltree-sitter builduses and the level the parsers shipped to editors are built at, so in practice every real.sveltefile hard-crashes the editor. (Found via Neovim: opening any.sveltefile → SIGSEGV.)Debug (
-O0) builds are unaffected, which is why test suites don't catch it.Minimal reproduction
mwe.c— parse<div></div>:Build it against
parser.c+scanner.c+ the tree-sitter runtime, varying only howscanner.cis compiled:Results — same
parser.c, same runtime, same input, onlyscanner.c's compile flags change:scanner.ccompiled with<div></div>gcc -O2gcc -O0(document (element (start_tag …) (end_tag …)))gcc -O2 -fno-strict-aliasing(document (element (start_tag …) (end_tag …)))gcc -O2with this PR(document (element (start_tag …) (end_tag …)))What the table proves
-O2); the only difference is-fno-strict-aliasing. Adding exactly the flag that disables type-based alias analysis removes the crash ⇒ the UB is specifically a strict-aliasing violation. (This is mechanical — the flag named for the cause toggles the crash.)-O2build correct.Root cause
scan_start_tag()appends to the HTML scanner's tag stack:array_push()(tree-sitterarray.h) does two things:Array *type —_array__grow/_array__reservetakeArray *, whosecontentsmember isvoid *;Array(Tag) *type —(self)->contents[(self)->size++] = element, whosecontentsmember isTag *.Arrayand theArray(Tag)anonymous struct are not compatible types. At-O2, GCC's strict-aliasing analysis concludes that the realloc's store to…->contents(viaArray *) and the element store's load of…->contents(viaArray(Tag) *) cannot alias. It keeps the pre-realloccontentsvalue in a register and never reloads it after the grow — so the first tag of the document is written through the staleNULLpointer the array was created with.Confirmed at the instruction level from a core dump: on the no-grow path the build reloads
contents; on the grow path it does not, and writes the element through a register still holdingNULL.This is the only
array_pushin the scanner whose array is reached through a pointer field (state->html->tags) rather than a direct struct member — consistent with it being the one call site that miscompiles.Fix
Do the grow + append explicitly in
scan_start_tag(), keeping every access on the concreteArray(Tag)type so there is noArray *type-pun for the optimizer to reason around. Applied to the canonicaltree-sitter-htmlxscanner and to its committed vendored copy undertree-sitter-svelte/src/htmlx/(the filetree-sitter buildactually compiles for the Svelte grammar).With the fix,
<div></div>and full Svelte 5 files (<script lang="ts">,{#if},{#each}, event handlers) parse cleanly at-O2.Unrelated side note:
tree-sitter buildalso warns that the external scanner exports non-statichelper functions (html_create,htmlx_create,html_scanner_scan,scan, …). Those can collide at load time when an editor loads this grammar next to another that vendors the same HTML scanner — worth markingstaticin a follow-up.