feat(decompiler): Rec 31 #31-3 RAII Stage 2B — xml_parse global_scan unique_ptr#52

Closed
CryptoJones wants to merge 1 commit into feat/rec-31-raii-stage2b-xml-lvalue from feat/rec-31-raii-stage2b-xml-globalscan

@CryptoJones
Owner

Stacked on #51: the base is feat/rec-31-raii-stage2b-xml-lvalue, not master. GitHub is expected to retarget this PR to master automatically when #51 lands. The diff visible here is only the global_scan change; the lvalue changes are part of #51's scope.

Converts xml_parse()'s pairing of global_scan = new XmlScan(i) / delete global_scan; to a scope-bound unique_ptr that holds ownership across the yyparse call:

auto scan = make_unique<XmlScan>(i);
global_scan = scan.get();    // raw observer pointer for grammar actions
...
int4 res = yyparse();
...
global_scan = (XmlScan *)0;  // null observer before scan goes out of scope
return res;

global_scan stays a static XmlScan * (a raw, non-owning observer) because it is accessed from yyparse, yylex, and the grammar semantic actions. Those see it through the symbol declared at file scope, and changing it to unique_ptr would require changing every accessor. Keeping it a raw observer keeps the surface area minimal: only xml_parse owns the scanner; everyone else just borrows it during the parse.
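A minimal, self-contained sketch of this ownership split. `Scanner`, `g_scan`, `fake_parse`, and `run_parse` are hypothetical stand-ins for `XmlScan`, `global_scan`, `yyparse`, and `xml_parse`; the real functions have different signatures, and the reentrancy assert is an addition of this sketch, not something the PR does.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for XmlScan: holds the input being scanned.
struct Scanner {
    explicit Scanner(std::string text) : input(std::move(text)) {}
    std::string input;
};

// File-scope raw observer, analogous to `static XmlScan *global_scan`.
// It never owns the scanner and is only valid while run_parse is active.
static Scanner *g_scan = nullptr;

// Stand-in for yyparse(): borrows the scanner through the observer.
static int fake_parse() {
    return g_scan != nullptr ? static_cast<int>(g_scan->input.size()) : -1;
}

// Stand-in for xml_parse(): the only owner of the scanner.
int run_parse(const std::string &text) {
    assert(g_scan == nullptr);                    // a single global observer is not reentrant
    auto scan = std::make_unique<Scanner>(text);  // ownership lives in this scope
    g_scan = scan.get();                          // lend a raw pointer to the parser
    int res = fake_parse();
    g_scan = nullptr;                             // stop lending before destruction
    return res;                                   // scan is destroyed here
}
```

Calling `run_parse("abc")` in this sketch returns the input length and leaves `g_scan` null afterwards, so a later stray access sees a null pointer rather than freed memory.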

The explicit reset of global_scan to null before return (written as (XmlScan *)0 in the code above) is defensive: if anything ever dereferences global_scan after xml_parse exits, a crash on a null pointer is far better than a use-after-free on a dangling one.
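The explicit reset only runs on the normal return path. If the parse could ever exit by exception (an assumption here; the PR does not say whether yyparse can throw), a small RAII guard would null the observer on every exit path. `ObserverGuard`, `g_scan`, and `parse_or_throw` below are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for XmlScan.
struct Scanner { int depth = 0; };

static Scanner *g_scan = nullptr;  // raw observer, never owns

// Sets the observer on construction and nulls it on destruction,
// including when the scope is left by an exception.
class ObserverGuard {
public:
    explicit ObserverGuard(Scanner *s) {
        assert(g_scan == nullptr);  // no parse already in progress
        g_scan = s;
    }
    ~ObserverGuard() { g_scan = nullptr; }
    ObserverGuard(const ObserverGuard &) = delete;
    ObserverGuard &operator=(const ObserverGuard &) = delete;
};

// Hypothetical parse entry point that may throw part-way through.
int parse_or_throw(bool fail) {
    auto scan = std::make_unique<Scanner>();
    ObserverGuard guard(scan.get());  // declared after scan, so destroyed before it
    if (fail) throw std::runtime_error("parse error");
    return g_scan->depth;
}
```

Because locals are destroyed in reverse declaration order, the guard clears `g_scan` before the `unique_ptr` frees the scanner, which preserves the "null before free" ordering the PR relies on.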

After this PR, the only raw `new` expressions remaining in the xml.y / xml.cc epilogue are:

  • line 538 new Element(cur) — parse-tree node allocation, owned by parent's children list; needs Element class field refactor.
  • line 624 new Document() — returned across xml_tree() API to XmlDecode; needs xml.hh + XmlDecode coordination.

Both are separate PRs. The raw `new` expressions in the bison semantic actions (xml.y:150, 153, 198, 200, 208) remain blocked on a %union redesign.
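One possible shape for those two follow-up refactors, sketched with hypothetical `Node` and `Tree` types rather than the real `Element` and `Document` classes (whose fields are not shown in this PR). The parent owns its children through a vector of `unique_ptr`, and the factory returns a `unique_ptr` so the ownership hand-off across the API is visible in the signature.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parse-tree node; the parent owns its children.
struct Node {
    explicit Node(Node *par, std::string nm = "")
        : parent(par), name(std::move(nm)) {}

    Node *parent;                                  // non-owning back-pointer
    std::string name;
    std::vector<std::unique_ptr<Node>> children;   // owning list

    // Allocates a child, stores it in the owning list, and returns a
    // raw observer so the caller can keep building beneath it.
    Node *add_child(std::string nm) {
        children.push_back(std::make_unique<Node>(this, std::move(nm)));
        return children.back().get();
    }
};

// Hypothetical document root.
struct Tree : Node {
    Tree() : Node(nullptr, "root") {}
};

// Returning unique_ptr makes the ownership transfer part of the signature.
std::unique_ptr<Tree> build_tree() {
    auto doc = std::make_unique<Tree>();
    Node *a = doc->add_child("a");
    a->add_child("b");
    return doc;
}
```

With this layout, destroying the root releases the whole tree without any manual `delete`; callers on the receiving side would need to accept a `unique_ptr`, which is the coordination cost the second bullet mentions.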

Proudly Made in Nebraska. Go Big Red! 🌽 https://xkcd.com/2347/

…unique_ptr

Stacked PR on top of the XmlScan::lvalue migration (PR #51). Same
hand-edit pattern: epilogue change in xml.y + matching parallel edit
in xml.cc, no bison regeneration required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CryptoJones added a commit that referenced this pull request May 26, 2026
…ed + std::span deviation (#54)

The Sprint 6 row for Rec 31 #31-3 (RAII Stage 2) + Rec 32 #32-4
(std::span adoption) was a single open checkbox that no longer
matched reality:

  - marshal.cc RAII landed via PR #46 (Stage 2A).
  - marshal std::span did NOT land — deviation from the documented
    "same files, same PR" plan in docs/decompiler/CPP20_ADOPTION.md.
    marshal's public API is [start, end) pointer-pair ranges, not the
    (T*, size_t) shape that std::span naturally replaces; whether
    there's a natural std::span site here at all is now an explicit
    open question rather than implicit slippage.
  - xml.y / xml.cc RAII is in flight as a multi-PR thread (PRs #51
    + #52 stacked); bigger pieces still pending.
  - xml std::span is open and best audited alongside the bison
    %union redesign needed for the semantic-action sites.

Replaced the single open checkbox with four bullets recording each
piece's current status. Surfaced during the 2026-05-26 self-audit.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CryptoJones added a commit that referenced this pull request May 26, 2026
…gration (#71)

Closes audit finding #8 from the 2026-05-26 self-audit by replacing
the "Stage 2C in flight" hand-wave with an honest scoping doc. The
original assumption was "regenerate xml.cc with bison-3.0.4 to apply
RAII to the semantic actions." Investigation showed:

  - bison-3.0.4 doesn't build on modern glibc (gnulib fseterr.c
    portability bug). bison-3.0.5 builds cleanly and produces an
    ~615-line mostly-cosmetic diff against the in-tree xml.cc.
  - The real blocker is not the bison version. The %union holds raw
    pointers (string*, Attributes*, NameValue*) and a C-style union
    can't hold non-trivially-destructible types like unique_ptr.
  - Migrating to RAII therefore requires switching xml.y to bison's
    C++ variant mode (`%define api.value.type variant`), which makes
    yyparse a method of an xml::parser class, changes the yylex
    contract, and wholesale-rewrites the generated xml.cc.

That's a strategic sprint, not a multi-PR thread.
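The %union limitation can be checked at compile time. In the sketch below, `RawValue`, `OwningValue`, and `VariantValue` are hypothetical types, and `std::variant` is used only to illustrate a tagged value type; bison's variant mode generates its own value type rather than using `std::variant`.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>
#include <variant>

// %union-style value: every member is trivial, so the union itself is
// trivially destructible and can be copied around freely.
union RawValue {
    int i;
    std::string *str;
};

// With a unique_ptr member the union's implicit destructor is deleted,
// because the compiler cannot know which member is active.
union OwningValue {
    int i;
    std::unique_ptr<std::string> str;
};

// A tagged variant tracks the active member and destroys it correctly.
using VariantValue = std::variant<int, std::unique_ptr<std::string>>;

static_assert(std::is_trivially_destructible_v<RawValue>);
static_assert(!std::is_destructible_v<OwningValue>);
static_assert(std::is_destructible_v<VariantValue>);
static_assert(!std::is_trivially_destructible_v<VariantValue>);

// Returns the length of the held string, or 0 for any other state.
std::size_t held_length(const VariantValue &v) {
    if (const auto *p = std::get_if<std::unique_ptr<std::string>>(&v)) {
        return *p ? (*p)->size() : 0;
    }
    return 0;
}
```

The static assertions capture the point made above: a plain union stops being usable as an ordinary value once it holds an owning type, so RAII on the value stack needs a tagged value type instead.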

This doc lays out:
  - what's left after Stage 2B (the seven specific raw-new sites
    and their exact line numbers in xml.y and xml.cc);
  - why hand-edit-parallel (the 2B technique) doesn't extend to
    these sites (they cross the %union boundary);
  - two architectural options:
      A. switch to `%define api.value.type variant` — clean RAII,
         wholesale rewrite of xml.cc, full xml_parse shim needed;
      B. keep %union of raw pointers, treat the five
         bison-value-stack sites as a documented exception in
         cppRaiiAudit, and clean up only the one obvious code-smell
         (a heap-allocated stack-scoped temporary in xml.y:208);
  - recommendation: B for the next PR (small, shipping-ready),
    A for a future strategic sprint;
  - a four-step plan ordering Stage 2C-min, the Element parse-tree
    ownership migration, the Document return-value migration, and
    the eventual Option A sprint.
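The "heap-allocated stack-scoped temporary" smell named in option B generally looks like the first function below. The actual code at xml.y:208 is not shown here, so `join_heap` and `join_stack` are hypothetical illustrations of the general cleanup, not the real change.

```cpp
#include <string>
#include <vector>

// Smell: the temporary never outlives the function, yet it is heap-allocated
// and must be deleted by hand; an exception before `delete` would leak it.
std::string join_heap(const std::vector<std::string> &parts) {
    std::string *tmp = new std::string();
    for (const auto &p : parts) *tmp += p;
    std::string result = *tmp;
    delete tmp;
    return result;
}

// Same behavior with automatic storage: no new, no delete, and the
// temporary is released on every exit path.
std::string join_stack(const std::vector<std::string> &parts) {
    std::string tmp;
    for (const auto &p : parts) tmp += p;
    return tmp;
}
```

This kind of change stays inside a single semantic action and never touches the value stack, which is why it fits the small, shippable scope of option B.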

Updates docs/decompiler/RAII_MIGRATION.md's #31-3 row to point at
the new design doc and to reflect what's shipped (#46 marshal) vs.
in-flight (#51, #52 xml epilogue) vs. scoped (the new doc).

No code change in this PR — design / scoping only.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CryptoJones added a commit that referenced this pull request May 26, 2026
…n work (#72)

CHANGELOG.md's [Unreleased] section wasn't updated as PRs landed
throughout the day. Adding a single dated section (2026-05-26)
that records the 28 PRs merged this session, grouped by Rec:

  - Rec 28: ignoreAudit Stage 2 strict, 17 author-declared-not-a-
    regression-test deletions, tracking-issue re-file, inventory
    honesty refresh.
  - Rec 31: cppRaiiAudit per-file gate (Stage 1), marshal RAII
    Stage 2A, Stage 2C design doc.
  - Rec 13/14: OSS-Fuzz primary_contact fill-in + in-tree/upstream
    sync + upstream PR (google/oss-fuzz#15545) submitted.
  - CI / housekeeping: sync-labels live mode, 26-branch sweep.
  - Doc sync: SprintPlanning marshal shipped + std::span deviation.

Also noted the three in-flight PRs (#50/#51/#52) that landed-as-CI
but didn't merge yet, so they appear as "queued" rather than as
shipped work.

Also fixes a stale "Work toward v26.1.10" header — v26.1.10
already shipped (per the Released section); [Unreleased] is now
toward v26.1.11.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@CryptoJones CryptoJones deleted the branch feat/rec-31-raii-stage2b-xml-lvalue May 26, 2026 08:15
@CryptoJones CryptoJones deleted the feat/rec-31-raii-stage2b-xml-globalscan branch May 26, 2026 08:17
CryptoJones added a commit that referenced this pull request May 26, 2026
PR #51's branch was inadvertently based on the in-flight PR #50
CodeQL-fix branch (not master), so PR #51's squash-merge included
PR #50's broken first commit alongside the intended lvalue RAII
change. Master at f41d8fc ended up with the broken CodeQL
config; PR #50 couldn't merge as-is; PR #52 (xml global_scan,
stacked on PR #51) also auto-closed when its base disappeared.

Mitigation:
  - PR #74 cherry-picks PR #50's second commit (binutils-dev
    fix) onto current master cleanly.
  - PR #73 cherry-picks PR #52's global_scan commit onto current
    master cleanly.

Adds an entry to Apologies.md at the top (per the log policy)
recording cause + downstream damage + mitigation.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CryptoJones added a commit that referenced this pull request May 26, 2026
…leased] (#76)

The catch-up changelog PR (#72) listed #50, #51, #52 as in-flight.
Now resolved:

  - #51 (lvalue) — merged (was the lone in-flight item that landed
    cleanly).
  - #50 (CodeQL fix) — superseded by #74 after the stacking mistake.
    #74 landed and Analyze (c-cpp) now passes on master.
  - #52 (global_scan) — superseded by #73 after the same stacking
    mistake. #73 landed.
  - #75 (Apologies) — landed alongside, recording the chain.

Removes the "in flight" footnote and replaces with a paragraph
explaining the chain of events so readers understand why #50 / #52
are absent from the merged ledger and #73 / #74 are present
covering the same scope.

Aaron's per-PR changelog feedback (feedback_changelog_per_pr.md)
applied: this PR ships its own changelog touch alongside the actual
state change, not as a catch-up.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>