perf: right-size register tuple allocations #153

Merged: davydog187 merged 1 commit into main from perf/right-size-register-tuples on Feb 27, 2026

@davydog187
Contributor

Summary

  • Replace the hard-coded 256-register top-level allocation in vm.ex with proto.max_registers + 16
  • Replace the +64 over-allocation on every callee call in executor.ex with +16

This cuts register tuple waste significantly for recursive workloads. For fib(30) (recursion depth of about 30, with a fresh register tuple allocated on every call), this saves 48 wasted nil slots per call frame (a 75% reduction in padding, from +64 to +16).
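
The savings quoted above can be checked with a little arithmetic. The following is only an illustrative sketch: the module and function names are hypothetical and do not come from vm.ex or executor.ex; only the +64 and +16 padding values are taken from this PR.

```elixir
defmodule RegisterPaddingSketch do
  # Padding values taken from the PR description; everything else is illustrative.
  @old_padding 64
  @new_padding 16

  # Nil slots that are no longer allocated in each callee frame.
  def slots_saved_per_frame, do: @old_padding - @new_padding

  # Percentage reduction in padding (not in total tuple size).
  def padding_reduction_percent do
    div(slots_saved_per_frame() * 100, @old_padding)
  end

  # Padding slots saved across `frames` call frames.
  def slots_saved(frames) when is_integer(frames) and frames >= 0 do
    frames * slots_saved_per_frame()
  end
end
```

Calling `RegisterPaddingSketch.slots_saved_per_frame()` gives the 48 slots per frame, and `RegisterPaddingSketch.padding_reduction_percent()` gives the 75% figure. The percentage applies to the padding only; the reduction in total tuple size depends on each function's max_registers.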

The +16 buffer (vs the ideal +4) is necessary because the codegen's max_registers doesn't always track multi-return expansion result slots — call results can land beyond the stated max register. The vararg Lua 5.3 suite test confirmed +4 was insufficient.
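
To make the failure mode concrete, here is a rough sketch of how multi-return results can run past a tightly sized register tuple. It is not the VM's actual code: the module, the function names, and the example numbers (3 declared registers, 8 results landing at register 2) are all assumptions chosen for illustration.

```elixir
defmodule RegisterOverflowSketch do
  # Allocate a register tuple of `max_registers + padding` nil slots.
  def allocate(max_registers, padding) do
    Tuple.duplicate(nil, max_registers + padding)
  end

  # Write `results` into consecutive registers starting at index `base`.
  # Returns {:ok, registers} if they fit, {:error, :out_of_range} otherwise.
  # (An unchecked put_elem/3 past the end of the tuple would raise ArgumentError.)
  def write_results(registers, base, results) do
    if base + length(results) <= tuple_size(registers) do
      updated =
        results
        |> Enum.with_index(base)
        |> Enum.reduce(registers, fn {value, index}, acc ->
          put_elem(acc, index, value)
        end)

      {:ok, updated}
    else
      {:error, :out_of_range}
    end
  end
end

# Hypothetical scenario: 3 declared registers, a call returning 8 values at register 2.
results = Enum.to_list(1..8)
tight = RegisterOverflowSketch.allocate(3, 4)
roomy = RegisterOverflowSketch.allocate(3, 16)

{:error, :out_of_range} = RegisterOverflowSketch.write_results(tight, 2, results)
{:ok, _registers} = RegisterOverflowSketch.write_results(roomy, 2, results)
```

In this made-up scenario the results need slots 2 through 9, so a 7-slot tuple (+4) is too small while a 19-slot tuple (+16) has room. The real fix would be for the codegen to account for these slots in max_registers; the fixed buffer is a pragmatic margin, not a guarantee.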

Test plan

  • All 1273 existing tests pass with 0 failures

🤖 Generated with Claude Code

Replace the hard-coded 256-register top-level allocation and the
+64 over-allocation on every callee call with values derived from
the prototype's max_registers field.

- vm.ex: proto.max_registers + 16 (was 256)
- executor.ex: max(max_registers, param_count) + 16 (was + 64)

The +16 buffer accounts for multi-return expansion slots that the
codegen doesn't always track in max_registers. This cuts register
tuple waste significantly for recursive workloads like fib(N).
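
The two sizing rules listed above can be sketched as follows. This is a minimal illustration, not the code in vm.ex or executor.ex: the module and function names are hypothetical, and representing the registers as a tuple of nil slots is inferred from the PR description.

```elixir
defmodule RegisterSizingSketch do
  # Safety margin from this commit; covers result slots the codegen may not count.
  @padding 16

  # Top-level chunk: derived from the prototype's max_registers (previously a fixed 256).
  def top_level_size(max_registers), do: max_registers + @padding

  # Callee frame: must hold every declared register and every parameter,
  # plus the margin (previously + 64).
  def callee_size(max_registers, param_count) do
    max(max_registers, param_count) + @padding
  end

  # Build a register tuple of `size` nil slots.
  def new_registers(size), do: Tuple.duplicate(nil, size)
end
```

For example, `RegisterSizingSketch.callee_size(2, 3)` returns 19, because the parameter count exceeds max_registers and so determines the base size.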

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@davydog187 davydog187 merged commit 7269a91 into main Feb 27, 2026
2 checks passed
@davydog187 davydog187 deleted the perf/right-size-register-tuples branch February 27, 2026 01:12