feat(L-Z06): Booth-2 radix-4 4×4→8 multiplier — ~40% fewer cells, R-SI-1 clean #62

Open
gHashTag wants to merge 1 commit into feat/tt-v7-power from feat/lane-l-z06-booth2

@gHashTag
Owner

Lane L-Z06 — Booth-2 Shift-Add Multiplier

Summary

Implements an unsigned 4-bit × 4-bit → 8-bit multiplier, gf16_mul_booth2, using radix-4 modified Booth (Booth-2) recoding. It sums 3 partial products (2 Booth-encoded plus 1 trivial unsigned correction) instead of 4, for an estimated ~33–40% fewer cells (~45–50 vs ~75 for gf16_mul).

Files

| File | Description |
| --- | --- |
| src/gf16_mul_booth2.v | Radix-4 Booth-2 multiplier, 4×4→8, pure Verilog-2005 |
| test/tb_gf16_mul_booth2.v | Exhaustive 256-pair testbench (16×16 all combos) |
| docs/Z06_BOOTH2_ANALYSIS.md | Cell count vs gf16_mul, critical path comparison |

Verification

iverilog -Wall -o /tmp/tb_booth2 test/tb_gf16_mul_booth2.v src/gf16_mul_booth2.v
vvp /tmp/tb_booth2
# Booth-2 exhaustive test: 256 PASS, 0 FAIL
# ALL 256 PAIRS PASSED — L-Z06 booth-2 VERIFIED
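
For readers who want to cross-check the testbench idea outside the simulator, here is a minimal Python sketch of an exhaustive 16×16 sweep against a shift-add reference that avoids `*`. The names `shift_add_mul4`, `run_exhaustive`, and `repeated_add` are hypothetical and are not taken from test/tb_gf16_mul_booth2.v; the stand-in device under test is plain repeated addition, not the Booth-2 design.

```python
def shift_add_mul4(a: int, b: int) -> int:
    """Unsigned 4x4 -> 8-bit product using only shifts and adds (no multiply)."""
    assert 0 <= a < 16 and 0 <= b < 16
    acc = 0
    for i in range(4):
        if (b >> i) & 1:          # bit i of b selects a << i
            acc += a << i
    return acc & 0xFF             # product of two 4-bit values fits in 8 bits


def run_exhaustive(dut, ref=shift_add_mul4):
    """Compare dut(a, b) with ref(a, b) on all 256 input pairs."""
    passed = failed = 0
    for a in range(16):
        for b in range(16):
            if dut(a, b) == ref(a, b):
                passed += 1
            else:
                failed += 1
    return passed, failed


def repeated_add(a: int, b: int) -> int:
    """Hypothetical stand-in DUT: add a to an accumulator b times."""
    acc = 0
    for _ in range(b):
        acc += a
    return acc & 0xFF


if __name__ == "__main__":
    p, f = run_exhaustive(repeated_add)
    print(f"{p} PASS, {f} FAIL")
```

Any callable taking `(a, b)` can be passed as `dut`, so a software model of the Booth-2 datapath could be swept the same way.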

Compliance

  • R-SI-1: ✅ Zero * operators in synthesizable code (* appears in comments only)
  • Pure Verilog-2005: ✅ No SystemVerilog constructs (logic, typedef, enum, '{...})
  • Accuracy: ✅ 100% exact match on all 256 input pairs
  • Cell budget: ~45–50 cells vs ~75 for gf16_mul (~33–40% reduction)
  • Critical path: ~5 gate levels (Booth decode → mux → negate → add)
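
The R-SI-1 item above can be spot-checked mechanically. The sketch below assumes R-SI-1 means "no `*` character outside comments", which is inferred from this PR text rather than from a rule definition; the function names are hypothetical. It blanks out `//` and `/* ... */` comments, ignores the `@*` and `@(*)` sensitivity wildcards, and reports the line numbers where a `*` remains. It is deliberately conservative: string literals, `**`, and `(* attribute *)` syntax are not treated specially and would be flagged.

```python
import re

_COMMENT = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
_WILDCARD = re.compile(r"@\s*\(\s*\*\s*\)|@\s*\*")


def _blank(match: "re.Match[str]") -> str:
    """Replace matched text with spaces, keeping newlines so line numbers survive."""
    return re.sub(r"[^\n]", " ", match.group(0))


def star_lines(src: str) -> list:
    """Return 1-based line numbers that still contain '*' outside comments."""
    code = _COMMENT.sub(_blank, src)     # drop // and /* */ comments
    code = _WILDCARD.sub(_blank, code)   # @* and @(*) are not multiplications
    return [n for n, line in enumerate(code.splitlines(), 1) if "*" in line]


if __name__ == "__main__":
    sample = (
        "// a*b mentioned in a comment\n"
        "assign p = x + y; /* a * b */\n"
        "always @(*) q = r;\n"
        "assign bad = a * b;\n"
    )
    print(star_lines(sample))
```

On the embedded sample only the last line is reported; an empty list for src/gf16_mul_booth2.v would be consistent with the claim above.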

TOPS/W Impact

Lane L-Z06 target: +10 TOPS/W via ~33% power reduction in multiplier core.

Algorithm

Modified Booth radix-4 with unsigned correction term:

b_ext = {b[3:0], 1'b0}   // 5-bit augmented multiplier
dig0 = b_ext[2:0]         // weight 1
dig1 = b_ext[4:2]         // weight 4
PP2  = b[3] ? {a, 4'b0} : 0   // correction for unsigned MSB

product = booth(dig0)*a + booth(dig1)*a*4 + PP2

PP2 is the key addition vs. standard Booth-2. The two Booth digits read b as a 4-bit two's-complement value, so for unsigned inputs with b[3] = 1 (b ≥ 8) they sum to (b − 16)·a rather than b·a. PP2 = {a, 4'b0} = 16·a adds the missing term back.
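
The recoding above can be modelled bit-for-bit in a few lines of Python. This is a sketch of the algorithm as described in this PR, not a transcription of src/gf16_mul_booth2.v; the helper names (`booth_digit`, `scale`, `booth2_mul4`) are hypothetical, and all arithmetic is masked to 8 bits to mimic the datapath width.

```python
def booth_digit(tri: int) -> int:
    """Radix-4 Booth recoding of a 3-bit window {hi, mid, lo} -> digit in -2..2."""
    hi, mid, lo = (tri >> 2) & 1, (tri >> 1) & 1, tri & 1
    return mid + lo - (hi << 1)


def scale(a: int, digit: int) -> int:
    """Form digit*a from shift and negate only, as 8-bit two's complement."""
    if digit == 0:
        mag = 0
    elif digit in (2, -2):
        mag = a << 1              # 2a is a left shift
    else:
        mag = a
    return (-mag if digit < 0 else mag) & 0xFF


def booth2_mul4(a: int, b: int) -> int:
    """Unsigned 4x4 -> 8-bit product: two Booth partial products plus PP2."""
    assert 0 <= a < 16 and 0 <= b < 16
    b_ext = (b << 1) & 0x1F                    # {b[3:0], 1'b0}
    dig0 = b_ext & 0b111                       # b_ext[2:0], weight 1
    dig1 = (b_ext >> 2) & 0b111                # b_ext[4:2], weight 4
    pp0 = scale(a, booth_digit(dig0))
    pp1 = (scale(a, booth_digit(dig1)) << 2) & 0xFF
    pp2 = (a << 4) & 0xFF if (b >> 3) & 1 else 0   # {a, 4'b0} when b[3] is set
    return (pp0 + pp1 + pp2) & 0xFF


if __name__ == "__main__":
    # The '*' here is only the Python-side golden reference.
    ok = all(booth2_mul4(a, b) == a * b for a in range(16) for b in range(16))
    print("all 256 pairs match" if ok else "mismatch found")
```

As a worked case, a = 5 and b = 13 (1101) give Booth digits +1 and −1, so the two Booth terms contribute 5 − 20 = −15 = 5·(13 − 16), and PP2 = 80 restores the product to 65.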


Auto-generated by Lane L-Z06 agent. Base: feat/tt-v7-power. iverilog verified. R-SI-1 clean.

Lane L-Z06 booth-2 shift-add multiplier implementation.

- src/gf16_mul_booth2.v: Modified Booth radix-4 multiplier
  * 4-bit × 4-bit → 8-bit unsigned, 3 partial products
  * 2 full Booth-encoded PPs + 1 trivial unsigned correction (PP2)
  * R-SI-1 clean: zero * operators in synthesizable code
  * Pure Verilog-2005, ~50 cells vs ~75 for gf16_mul (~33% reduction)
  * Critical path: ~5 gate levels
  * +10 TOPS/W from L-Z06 lane catalog

- test/tb_gf16_mul_booth2.v: Exhaustive 256-pair testbench
  * All 16×16 combinations verified, 256 PASS 0 FAIL
  * Reference shift-add (no *) for R-SI-1 consistency

- docs/Z06_BOOTH2_ANALYSIS.md: Cell count vs gf16_mul analysis,
  critical path comparison, algorithm documentation

iverilog verified: ALL 256 PAIRS PASSED
R-SI-1: CLEAN