v1: harden self-host codegen and string interpolation lowering #133

Merged: egecanakincioglu merged 4 commits into main from fix/codegen-large-parser-field-store-v1 (Jun 8, 2026)

@egecanakincioglu
Owner

Summary

Hardens SafeRegAlloc codegen against scratch-register corruption and fixes string interpolation integer detection. Full bootstrap chain reaches S4 with 0 unresolved labels.

Commits

  • 82c32fb fix(codegen): use LEA for ADD with immediate
    • ADD+IMM uses the LEA instruction (displacement encoded in the instruction bytes, no scratch register needed)
  • 5476604 fix(codegen): use immediate encodings for CMP and SUB
    • CMP/SUB with an immediate use the cmpRI/subRI direct encodings
  • f17db1d fix(ir): detect integer IDENT and FIELD in isIntegerExpr
    • String interpolation ${kind} now routes integers through i64_to_str
    • Fixes raw integers being passed as string pointers to strcat/strlen
  • 30b6491 fix(ir): simplify generateI64ToStr
    • Reduced from pre-counting + write-loop to a fixed-buffer approach
    • Fewer IR instructions, reduced SafeRegAlloc slot pressure

Validation

| Test | Result |
|---|---|
| S1 → S3 fresh | PASS |
| S3 --help | PASS |
| S3 hello compile+run | PASS (OK) |
| S3 → S4 fresh | PASS |
| Unresolved labels | 0 |
| S4 --help | PASS |
| while/NOT regression | PASS (OK) |
| S3 interpolation regression | PASS (1, 42) |
| S4 binary size | 642K (was 706K before LEA) |

Known Remaining Blocker

S4 hello still crashes during parsing with strlen(0). After f17db1d the crash changed from strlen(1) to strlen(0), which confirms that the raw integer→strcat bug is fixed. The remaining issue is SafeRegAlloc codegen corruption in the Parser__eat string interpolation call chain (an i64_to_str→strcat→strlen cascade in which one value becomes NULL). It only manifests in large functions with complex control flow.

Rules

  • No binary patching
  • No temporary IRLower split
  • No old RegAlloc validation
  • v1 core correctness work

🤖 Generated with Claude Code

…r corruption

In SafeRegAlloc BINOP path, ADD with IMM_INT operand used
safeLoadVal(IMM_INT) → movRI(scratch, imm) → addRR(dr, scratch).
This required loading the immediate into a scratch register, which
could be corrupted in large functions (suspect: scratch register
aliasing or slot collision for the immediate value).

Fix: use LEA instruction directly — encodes displacement in the
instruction bytes, no scratch register needed for the immediate.
Also more compact (LEA is 4-8 bytes vs MOV+IMM64+ADD which is 14).

Reduces S4 binary from 706K to 646K.
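
To illustrate why the LEA form needs no scratch register, here is a minimal sketch of how `lea dst, [src + imm]` can be encoded on x86-64. It is not the project's emitter: the function name `encode_lea_add` and the use of raw hardware register numbers (0 = RAX, 1 = RCX, 4 = RSP, 8 = R8, ...) are assumptions made for the example.

```python
import struct


def encode_lea_add(dst: int, src: int, imm: int) -> bytes:
    """Encode `lea dst, [src + imm]` for 64-bit registers numbered 0-15.

    Hypothetical helper for illustration; not the compiler's actual API.
    """
    if not -2**31 <= imm < 2**31:
        raise ValueError("immediate does not fit in a 32-bit displacement")
    # REX.W selects 64-bit operands; REX.R / REX.B extend dst / src to r8-r15.
    rex = 0x48 | ((dst >> 3) << 2) | (src >> 3)
    use_disp8 = -128 <= imm <= 127
    mod = 0b01 if use_disp8 else 0b10  # disp8 or disp32 addressing form
    out = bytearray([rex, 0x8D, (mod << 6) | ((dst & 7) << 3) | (src & 7)])
    if (src & 7) == 4:
        # RSP/R12 as a base register can only be expressed through a SIB byte.
        out.append(0x24)
    out += struct.pack("<b" if use_disp8 else "<i", imm)
    return bytes(out)


print(encode_lea_add(0, 1, 8).hex())       # lea rax, [rcx + 8]
print(encode_lea_add(0, 4, 0x1000).hex())  # lea rax, [rsp + 0x1000]
```

The immediate travels inside the instruction as a displacement, so only `dst` and `src` are named. One design caveat: unlike ADD, LEA does not update the flags, so this substitution is only safe where no later instruction depends on the flags of the addition.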
…h register

Extend LEA pattern: CMP with IMM_INT operand now uses cmpRI (direct
immediate encoding) instead of safeLoadVal(IMM)→scratch→cmpRR.
SUB with IMM_INT uses subRI similarly.

Eliminates scratch register usage for immediate values in comparison
and subtraction, preventing potential corruption in large functions.

S4 binary: 642K (was 646K after LEA, was 706K originally).
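
The direct immediate forms can be sketched the same way. The example below encodes `cmp r64, imm` and `sub r64, imm` using the standard x86-64 `83 /n ib` and `81 /n id` opcode group; the names `cmp_ri` and `sub_ri` are hypothetical stand-ins for cmpRI/subRI, whose real signatures are not shown in this PR.

```python
import struct


def _alu_ri(ext: int, reg: int, imm: int) -> bytes:
    """Encode a group-1 ALU op (`/ext`) on a 64-bit register with an immediate."""
    if not -2**31 <= imm < 2**31:
        raise ValueError("immediate does not fit in 32 bits")
    rex = 0x48 | (reg >> 3)                  # REX.W, plus REX.B for r8-r15
    modrm = 0xC0 | (ext << 3) | (reg & 7)    # mod=11: register-direct operand
    if -128 <= imm <= 127:
        # Short form: 8-bit immediate, sign-extended by the CPU.
        return bytes([rex, 0x83, modrm]) + struct.pack("<b", imm)
    return bytes([rex, 0x81, modrm]) + struct.pack("<i", imm)


def sub_ri(reg: int, imm: int) -> bytes:
    """`sub reg, imm` (opcode extension /5). Hypothetical name."""
    return _alu_ri(5, reg, imm)


def cmp_ri(reg: int, imm: int) -> bytes:
    """`cmp reg, imm` (opcode extension /7). Hypothetical name."""
    return _alu_ri(7, reg, imm)


print(cmp_ri(0, 1).hex())      # cmp rax, 1
print(sub_ri(4, 0x100).hex())  # sub rsp, 0x100
```

As with LEA, the only register operand is the one being compared or decremented, so there is nothing for a scratch-register collision to corrupt.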
…nterpolation

isIntegerExpr was missing IDENT and FIELD expression kinds,
causing integer variables (parameters, fields) in ${} string
interpolation to bypass __arimo_i64_to_str conversion.

Integer values were passed directly to __arimo_strcat as string
pointers, causing strlen(strcat_result) to crash on small integers
(e.g., kind=1 appearing as pointer 0x1).

Fix: check varClassOf for IDENT and inferClass for FIELD,
returning true for Integer and Boolean types.
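
A minimal sketch of the check described above, written in Python rather than the compiler's own language. The node classes, the two lookup tables standing in for varClassOf and inferClass, and the tuple returned by `lower_interp_part` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class IntLit:
    value: int


@dataclass
class StrLit:
    value: str


@dataclass
class Ident:
    name: str


@dataclass
class Field:
    obj: str
    name: str


INTEGER_LIKE = {"Integer", "Boolean"}


def is_integer_expr(expr, var_classes, field_classes) -> bool:
    """Return True when an interpolated expression needs i64_to_str."""
    if isinstance(expr, IntLit):
        return True
    if isinstance(expr, Ident):
        # Stand-in for varClassOf: class of a local variable or parameter.
        return var_classes.get(expr.name) in INTEGER_LIKE
    if isinstance(expr, Field):
        # Stand-in for inferClass: class of a field access.
        return field_classes.get((expr.obj, expr.name)) in INTEGER_LIKE
    return False


def lower_interp_part(expr, var_classes, field_classes):
    """Pick the lowering for one ${...} part of an interpolated string."""
    if is_integer_expr(expr, var_classes, field_classes):
        return ("call", "__arimo_i64_to_str", expr)
    return ("use_as_string", expr)


vars_ = {"kind": "Integer", "name": "String"}
print(lower_interp_part(Ident("kind"), vars_, {}))
print(lower_interp_part(Ident("name"), vars_, {}))
```

Without the Ident and Field branches, `${kind}` falls through to the string path and the integer value itself is handed to the concatenation routine as if it were a pointer.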
Replace the complex pre-counting + write-loop i64_to_str with a simpler
fixed-buffer approach. Writes digits right-to-left into a 32-byte
buffer and returns a pointer to the first digit. Eliminates the pre-count
loop and reduces the IR instruction count and frame slots.
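
The fixed-buffer approach can be sketched as follows. This is a Python model of the idea, not the generated IR: the buffer is a `bytearray`, the "pointer to the first digit" is an index into it, and the handling of negative values is an assumption, since the commit message does not say how the sign is treated.

```python
def i64_to_str(value: int) -> str:
    """Format a signed 64-bit integer by filling a 32-byte buffer from the end."""
    if not -2**63 <= value < 2**63:
        raise ValueError("value is outside the i64 range")
    buf = bytearray(32)  # 20 bytes suffice for any i64; 32 leaves headroom
    pos = len(buf)       # next write goes just before this index
    negative = value < 0
    magnitude = -value if negative else value
    while True:
        # Emit the lowest digit, then drop it; runs at least once so 0 -> "0".
        pos -= 1
        buf[pos] = ord("0") + magnitude % 10
        magnitude //= 10
        if magnitude == 0:
            break
    if negative:  # assumed sign handling, not confirmed by the source
        pos -= 1
        buf[pos] = ord("-")
    # buf[pos:] plays the role of the returned pointer to the first character.
    return buf[pos:].decode("ascii")


print(i64_to_str(42))
print(i64_to_str(-7))
```

Because the digits are written from the end of the buffer backwards, the length does not need to be known in advance, which is what removes the separate pre-count loop.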
egecanakincioglu merged commit 60ab60b into main on Jun 8, 2026
1 check failed
egecanakincioglu deleted the fix/codegen-large-parser-field-store-v1 branch on June 8, 2026 03:06