Add unary negation, logical/bitwise not, and bitwise/shift operators #188

Open
WalterGropius wants to merge 5 commits into vercel-labs:main from WalterGropius:feat/unary-bitwise-ops

Summary

Adds parsing, type-checking, IR lowering, and emitter coverage for the unary
operators -, !, ~ and the binary operators |, ^, <<, >> (plus
binary & precedence). Previously these all failed at parse time with
PAR100.

Source

Addresses B4 + P1-6. Reproduced against main @ 999a46b.

Before this change every form below failed at PAR100:

  • -1, -x, -1.0
  • !p
  • x | y, x ^ y, ~x
  • x << 1, x >> 1

Multiplicative *, /, % already worked, and binary &, &&, || already
parsed via existing precedence, so no arithmetic work was needed there.

Changes

  • Lexer: emit |, ^, ~, << as new tokens. >> is intentionally
    kept as two > tokens so generic terminators like Foo<Bar<T>> still
    close; the expression parser folds two adjacent (no-whitespace) > into
    a >> shift operator.
  • Parser: new EXPR_UNARY node for prefix -, !, ~. Binary
    precedence (high to low): unary > * / % > + - +% +| > << >> > & >
    ^ > | > cmp > && > ||.
  • Checker: bitwise operators are integer-only with matching types;
    shifts require an integer left operand and an unsigned shift amount;
    constant shift amounts at or beyond the operand bit width are rejected
    as TYP002. Unary - rejects unsigned types; ! requires Bool; ~
    requires an integer. The meta evaluator constant-folds all of the new
    operators.
  • IR: extends IrBinaryOp with BITAND/BITOR/BITXOR/SHL/SHR and adds
    IR_VALUE_UNARY with NEG/NOT/BITNOT. Right-shift selects logical (LSR)
    for unsigned and arithmetic (ASR/SAR) for signed operands. MIR verifier
    enforces operand and result types for the new opcodes.
  • Emitters:
    • emit_macho64.c (AArch64 Mach-O): real lowering using AND/ORR/EOR,
      LSLV/LSRV/ASRV, NEG (SUB with XZR), MVN, and a CBZ-based bool not.
    • emit_elf64.c (x86_64 ELF) and emit_coff.c (x86_64 COFF): real
      lowering using AND/OR/XOR (21/09/31 c8), SHL/SHR/SAR via cl
      (D3 E0/E8/F8), NEG/NOT (F7 D8/D0), and test+sete+movzx for bool
      not.
    • emit_elf_aarch64.c (AArch64 ELF, MVP literal-return backend):
      rather than silently dropping new opcodes, walks the IR for each
      function and reports an explicit CGEN004 if any new opcode appears.
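
The precedence tiers in the Parser bullet can be written down as a lookup table. The following is a minimal sketch, not the parser from this PR: the names `OpPrec`, `BINARY_PREC`, and `binary_prec` are hypothetical, the numeric levels are arbitrary (only their order follows the description), and the exact set of comparison operators in the `cmp` tier is an assumption.

```c
#include <string.h>

/* Hypothetical table: binding power per binary operator, higher binds
   tighter. The tier order follows the PR description; the numbers are
   arbitrary. The comparison operators listed for the "cmp" tier are an
   assumption. */
typedef struct {
    const char *op;
    int prec;
} OpPrec;

static const OpPrec BINARY_PREC[] = {
    {"||", 1},
    {"&&", 2},
    {"==", 3}, {"!=", 3}, {"<", 3}, {"<=", 3}, {">", 3}, {">=", 3},
    {"|", 4},
    {"^", 5},
    {"&", 6},
    {"<<", 7}, {">>", 7},
    {"+", 8}, {"-", 8}, {"+%", 8}, {"+|", 8},
    {"*", 9}, {"/", 9}, {"%", 9},
};

/* Returns the binding power of `op`, or 0 if it is not a binary operator. */
int binary_prec(const char *op) {
    size_t n = sizeof BINARY_PREC / sizeof BINARY_PREC[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(BINARY_PREC[i].op, op) == 0) {
            return BINARY_PREC[i].prec;
        }
    }
    return 0;
}
```

One consequence of this ordering is that `&`, `^`, and `|` bind tighter than comparisons, so `a & b == c` groups as `(a & b) == c`, unlike in C.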

Conformance

Adds three passing fixtures and one failing fixture, wired into
conformance/run.mjs:

  • conformance/native/pass/ops-unary-neg.0 – unary negation across types,
    including nested negation and -(a + b).
  • conformance/native/pass/ops-not.0 – !, !!, and !(...) over Bool
    expressions including short-circuit interactions.
  • conformance/native/pass/ops-bitwise.0 – &, |, ^, <<, >>, ~,
    precedence checks, and a UTF-16 surrogate-pair (0xD83D / 0xDE00)
    decoded to its codepoint U+1F600 and re-encoded to the UTF-8 bytes
    F0 9F 98 80 using only the new shift/mask operators.
  • conformance/native/fail/shift-overflow.0 – 1 << 32 against an i32
    is rejected at check time with TYP002.
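
For reference, the surrogate-pair arithmetic that the ops-bitwise fixture exercises can be written with shifts and masks alone. This is a sketch in C rather than the fixture's own source (which is not shown here); the function names are hypothetical, and the encoder handles only the four-byte range U+10000 to U+10FFFF.

```c
#include <stdint.h>

/* Combine a UTF-16 high/low surrogate pair into a code point.
   Each surrogate contributes its low 10 bits. */
uint32_t decode_surrogate_pair(uint16_t hi, uint16_t lo) {
    uint32_t high10 = (uint32_t)hi & 0x3FFu;
    uint32_t low10 = (uint32_t)lo & 0x3FFu;
    return 0x10000u + ((high10 << 10) | low10);
}

/* Encode a code point in U+10000..U+10FFFF as four UTF-8 bytes.
   Callers must not pass values outside that range. */
void encode_utf8_4(uint32_t cp, uint8_t out[4]) {
    out[0] = (uint8_t)(0xF0u | (cp >> 18));
    out[1] = (uint8_t)(0x80u | ((cp >> 12) & 0x3Fu));
    out[2] = (uint8_t)(0x80u | ((cp >> 6) & 0x3Fu));
    out[3] = (uint8_t)(0x80u | (cp & 0x3Fu));
}
```

With the fixture's inputs, `decode_surrogate_pair(0xD83D, 0xDE00)` gives 0x1F600, and `encode_utf8_4` on that value produces F0 9F 98 80.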

Local gates pass on darwin-arm64:

  • ZERO_NATIVE_TEST_ALLOW_LOCAL=1 node conformance/run.mjs → conformance ok
  • node --experimental-strip-types scripts/snapshot-command-contracts.mts →
    command contract snapshots ok

The linux-musl-x64 cross step in native:test:local continues to fail on
this host because no cross compiler / zig is installed; this is
environmental and not a regression introduced here.

Proposed CHANGELOG line

  • Compiler: parse and lower unary -, !, ~ and binary |, ^, <<,
    >> for integer expressions; bitwise/shift operands are integer-only,
    shifts validate the shift amount against the operand bit width, and
    >> selects logical or arithmetic shift based on signedness.

WalterGropius and others added 5 commits May 21, 2026 23:29
Adds lexer tokens for |, ^, ~, <<, >>; introduces EXPR_UNARY and parses
unary prefix operators (-, !, ~) above multiplicative; adds binary
precedence for bitwise &, |, ^ and shifts << >> between cmp and additive.

Checker validates integer-only bitwise/shift operands, signed-only unary
negation, integer/bool unary type rules, and rejects constant shift amounts
at or beyond the left operand's bit width. Meta evaluator constant-folds the
new operators. MIR verifier accepts the new IR opcodes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
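
The shift rules in the commit above (integer left operand, unsigned shift amount, constant amounts rejected once they reach the operand's bit width) can be sketched as a single check. Everything here is hypothetical and simplified: `OperandType`, `ShiftVerdict`, and `check_shift` are illustrative names, not the checker's real data structures, and only the bit-width case is known from the PR to map to TYP002.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a checked operand type. */
typedef struct {
    bool is_integer;
    bool is_signed;
    unsigned bits;
} OperandType;

typedef enum {
    SHIFT_OK,
    SHIFT_BAD_OPERAND, /* left side is not an integer */
    SHIFT_BAD_AMOUNT,  /* shift amount is not an unsigned integer */
    SHIFT_TOO_WIDE     /* constant amount >= bit width (TYP002 in the PR) */
} ShiftVerdict;

/* `const_amount` points at the shift amount when it is a compile-time
   constant, and is NULL otherwise. */
ShiftVerdict check_shift(OperandType lhs, OperandType amount,
                         const uint64_t *const_amount) {
    if (!lhs.is_integer) {
        return SHIFT_BAD_OPERAND;
    }
    if (!amount.is_integer || amount.is_signed) {
        return SHIFT_BAD_AMOUNT;
    }
    if (const_amount != NULL && *const_amount >= lhs.bits) {
        return SHIFT_TOO_WIDE;
    }
    return SHIFT_OK;
}
```

Under this sketch a constant amount of 32 on a 32-bit left operand is rejected, which matches the `1 << 32` failing fixture, while 31 is accepted.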
Adds AArch64 instruction encodings for ORR, EOR, MVN, NEG (via SUB with
XZR), LSLV, LSRV, ASRV, and reuses the existing AND encoding. The Mach-O backend now
lowers IR_VALUE_UNARY (negation, logical not, bitwise not) and the new
binary opcodes IR_BIN_BITAND/BITOR/BITXOR/SHL/SHR. Right shift selects
LSR for unsigned and ASR for signed operands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
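
The difference between the two right shifts selected in the commit above is easiest to see on a negative value. The sketch below models a 32-bit LSR and ASR in portable C; it is illustrative only and says nothing about how the backend encodes the instructions. The function names are hypothetical. Right-shifting a negative signed value is implementation-defined in C, so the arithmetic variant shifts the complement, which is non-negative, and complements the result again.

```c
#include <stdint.h>

/* Logical shift right: vacated high bits are filled with zero.
   Requires n < 32. */
uint32_t lsr32(uint32_t x, unsigned n) {
    return x >> n;
}

/* Arithmetic shift right: vacated high bits copy the sign bit.
   Requires n < 32. For negative x, ~x is non-negative, so the shift is
   well-defined; complementing again restores the sign extension. */
int32_t asr32(int32_t x, unsigned n) {
    return x < 0 ? ~(~x >> n) : x >> n;
}
```

For the same bit pattern 0xFFFFFFF8, `lsr32(0xFFFFFFF8u, 1)` yields 0x7FFFFFFC, while `asr32(-8, 1)` yields -4.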
x86_64 ELF and Windows COFF backends gain encodings for AND/OR/XOR
(0x21/0x09/0x31), SHL/SHR/SAR (D3 /4, /5, /7), NEG/NOT (F7 /3, /2), and
Bool not via test+sete+movzx.

The AArch64 ELF MVP backend only lowers single-literal returns. To
avoid silently dropping the new opcodes, it now walks each function's
IR and emits an explicit CGEN004 if it encounters IR_VALUE_UNARY or
the new bitwise/shift IR_VALUE_BINARY ops, instead of producing a
wrong exit-0 binary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
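
The byte values quoted for the x86_64 backends follow from how the ModRM byte is built in register-direct mode: the top two bits are 11, the middle three hold either a register or the opcode extension (the `/n` digit), and the low three hold the r/m register. A small sketch, with a hypothetical helper name and assuming rax as the destination and rcx as the second operand, since the PR does not state which registers the emitters use:

```c
#include <stdint.h>

/* Low three bits of the register numbers used below. */
enum { REG_RAX = 0, REG_RCX = 1 };

/* Build a register-direct ModRM byte (mod = 11).
   `reg_or_ext` is a register number, or the /n opcode extension for
   single-operand forms such as D3 /4 or F7 /3. */
uint8_t modrm_reg_direct(unsigned reg_or_ext, unsigned rm) {
    return (uint8_t)(0xC0u | ((reg_or_ext & 7u) << 3) | (rm & 7u));
}
```

With rm = rax this gives C8 for `and/or/xor rax, rcx` (opcodes 21/09/31), E0, E8, and F8 for D3 /4, /5, and /7 (SHL, SHR, SAR by cl), and D8 and D0 for F7 /3 and /2 (NEG, NOT).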
The lexer no longer folds `>>` into a single token so existing generic
syntax like `Foo<Bar<T>>` keeps closing cleanly. The expression parser
now recognises two adjacent `>` tokens (no whitespace between them) as
the right-shift operator at the new shift precedence.

Adds passing fixtures for unary negation, logical not, and bitwise
operators (with precedence checks and a UTF-16 surrogate-pair to UTF-8
encode), plus a failing fixture that exercises the shift-amount
bit-width check via TYP002. Wires them into the conformance runner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
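
The adjacency rule from the commit above, where two `>` tokens form a shift only when nothing separates them, can be sketched as a lookahead on token offsets. The `Token` layout and the `is_shift_right` helper are hypothetical; the real parser's token representation is not shown in this PR.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token: a one-character kind plus its byte offset in the
   source text. */
typedef struct {
    char kind;
    size_t offset;
} Token;

/* True when tokens i and i+1 are both '>' and the second starts exactly
   one byte after the first, i.e. there is no whitespace between them. */
bool is_shift_right(const Token *toks, size_t count, size_t i) {
    return i + 1 < count
        && toks[i].kind == '>'
        && toks[i + 1].kind == '>'
        && toks[i + 1].offset == toks[i].offset + 1;
}
```

Because the lexer still produces two separate `>` tokens, the type parser can consume them one at a time when closing `Foo<Bar<T>>`; only the expression parser would apply a check like this one.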
Adds an Operators section listing the binary precedence tiers from || down to
multiplicative, the unary prefix operators (-, !, ~), and the integer-only
constraints for bitwise and shift operators. Notes the special handling of
two adjacent `>` tokens as `>>` so `Foo<Bar<T>>` continues to parse.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot commented May 21, 2026

@WalterGropius is attempting to deploy a commit to the Vercel Labs Team on Vercel.

A member of the Team first needs to authorize it.

WalterGropius added a commit to WalterGropius/zero that referenced this pull request May 22, 2026
This is the B8 narrow slice on top of the existing P0-1 ABI classification
scaffolding: a function declared `fun foo(...) -> f32 { return 1.5 }` (and
the f64 analogue) now builds and runs on the linux-musl-x64 host-runnable
target instead of hitting `CGEN004: direct backend return type is
unsupported  actual: f32`.

Per-backend ABI verdict:

- elf64 (System V AMD64): real lowering.  Float literals materialise into
  xmm0 via `mov eax, imm32 + movd xmm0, eax` (f32) or `movabs rax, imm64 +
  movq xmm0, rax` (f64), avoiding the rodata machinery for scalar floats.
  Function returns leave xmm0 as the SysV return register.  Equality
  comparisons (`==` / `!=`) use `ucomis{s,d} xmm1, xmm0` and combine ZF/PF
  for IEEE 754 semantics — `eq` is `ordered AND ZF=1`, `ne` is the
  complement.  Ordered float comparisons (`<` `<=` `>` `>=`) are explicitly
  rejected with CGEN004 because the narrow scope does not need them and
  unordered handling would otherwise be silently wrong.
- macho64 (Darwin AArch64): keeps the existing CGEN004 path; AAPCS64 s0/d0
  lowering is the named follow-up.
- coff (Windows x64): keeps the existing CGEN004 path; the SysV-style
  xmm0 lowering would mostly carry over but is not enabled in this PR.
- elf_aarch64 (Linux AArch64): now explicitly rejects f32/f64 returns with
  CGEN004 in `a64_reject_unsupported_function`, which is also called from
  the literal-only emitter so the silent-drop "MVP subset" trap from
  PR vercel-labs#122 / PR vercel-labs#188 cannot re-trigger for the new IR float types.

Float params and float locals stay rejected by the IR (`ir_type_is_direct_abi`
unchanged); a new `ir_type_is_direct_abi_return` opens the door only at the
return position.  Aggregate (Span/MutSpan/String/Maybe/shape) param+return
marshalling is also still out of scope — that's the next slice and the
classification annotation continues to surface in their CGEN004 messages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
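
Two details of the elf64 float path described in the commit above can be illustrated in C: the immediate that a `mov eax, imm32` or `movabs rax, imm64` would carry is just the IEEE 754 bit pattern of the literal, and the equality result has to combine ZF with PF because `ucomis{s,d}` sets ZF for unordered operands too. This is a model for illustration, not the emitter's code; `Flags`, `ucomis_flags`, and the other names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Bit pattern of a float literal, as an integer immediate would carry it. */
uint32_t f32_bits(float v) {
    uint32_t bits;
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

uint64_t f64_bits(double v) {
    uint64_t bits;
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

/* Model of the flags ucomiss/ucomisd produce when comparing a with b. */
typedef struct {
    bool zf, pf, cf;
} Flags;

Flags ucomis_flags(double a, double b) {
    if (a != a || b != b) return (Flags){true, true, true};   /* unordered */
    if (a == b)           return (Flags){true, false, false}; /* equal */
    if (a < b)            return (Flags){false, false, true}; /* less */
    return (Flags){false, false, false};                      /* greater */
}

/* eq is "ordered AND ZF=1"; ne is its complement, so NaN != NaN holds. */
bool float_eq(Flags f) { return f.zf && !f.pf; }
bool float_ne(Flags f) { return !float_eq(f); }
```

Testing ZF alone would report a NaN operand as equal to anything, which is the silent error the ZF/PF combination avoids.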
WalterGropius added a commit to WalterGropius/zero that referenced this pull request May 22, 2026
Adds two host-runnable conformance fixtures whose `returnF32() == 1.5` /
`returnF64() == 2.25` round-trip exercises the SysV AMD64 xmm0 path on
linux-musl-x64.  On any host whose runnable direct target is not elf64
(darwin-arm64 macho64, in our case) the `assertCommonRuntimeOrUnsupported`
helper accepts the explicit CGEN004 path instead, mirroring the existing
runnable-target gating used by other native fixtures.

Adds three explicit-rejection assertions — one per still-deferred backend
(darwin-arm64 macho, win32-x64.exe coff, linux-arm64 elf) — so the "never
silent-drop" invariant from PR vercel-labs#122 / PR vercel-labs#188 is now pinned for the new
f32/f64 IR types as well.

Updates `docs/articles/target-capabilities.md` with a per-backend f32/f64
table and `zero explain CGEN004` with the corresponding scope note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>