- Replace Fermat's little theorem with extended Euclidean modular inverse, 2-3x faster for 256-bit operands
- Fixed-base windowed scalar multiplication (2^4-ary method) with precomputed generator table, cuts sign time substantially
- Skip A*pz^4 term in jacobian_double for secp256k1 (A=0); use 3*(px-pz^2)*(px+pz^2) shortcut for prime256v1 (A=-3)
- Cache curve.nBitLength to avoid recomputing per call

Benchmark (100 rounds):
  sign: 1.2ms -> 0.6ms
  verify: 0.9ms -> 0.9ms
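
To illustrate the switch from Fermat's little theorem to the extended Euclidean algorithm, here is a minimal Python sketch (not the project's code). The function names `inv_fermat` and `inv_euclid` are hypothetical, and `FIELD_P` is the secp256k1 field prime, used only as a sample modulus.

```python
FIELD_P = 2**256 - 2**32 - 977  # secp256k1 field prime, used as a sample modulus


def inv_fermat(x, p):
    """Inverse via Fermat's little theorem: x^(p-2) mod p (p must be prime)."""
    return pow(x, p - 2, p)


def inv_euclid(x, p):
    """Inverse via the extended Euclidean algorithm (hypothetical helper)."""
    low, high = x % p, p
    lm, hm = 1, 0  # invariant: lm*x == low and hm*x == high (mod p)
    while low > 1:
        q = high // low
        lm, hm = hm - lm * q, lm
        low, high = high - low * q, low
    if low != 1:
        raise ZeroDivisionError("x has no inverse modulo p")
    return lm % p
```

Both functions return the same value for any invertible input. The Euclidean version needs only a few hundred small division steps rather than a full 256-bit modular exponentiation, which is where a speedup would come from.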
When qz == 1 (affine input), skip computing qz*qz and simplify U1, S1, and nz, saving four field multiplications per add. This is the hot-path optimization used by multiplyGenerator, which feeds only affine operands.
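
The affine fast path can be sketched as follows in Python (an illustration, not the project's code). It uses the standard Jacobian addition formulas. The names `jacobian_add`, `jacobian_add_mixed`, `_finish` and `to_affine` are hypothetical, and the sketch assumes the two inputs are distinct, not inverses of each other, and not the point at infinity.

```python
P = 2**256 - 2**32 - 977  # secp256k1 field prime
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8


def _finish(u1, u2, s1, s2, z):
    """Shared tail of both additions (assumes u1 != u2)."""
    h = (u2 - u1) % P
    r = (s2 - s1) % P
    h2 = h * h % P
    h3 = h * h2 % P
    u1h2 = u1 * h2 % P
    nx = (r * r - h3 - 2 * u1h2) % P
    ny = (r * (u1h2 - nx) - s1 * h3) % P
    return nx, ny, h * z % P


def jacobian_add(p, q):
    """General case: both operands carry a z coordinate."""
    (px, py, pz), (qx, qy, qz) = p, q
    qz2 = qz * qz % P
    pz2 = pz * pz % P
    u1 = px * qz2 % P
    u2 = qx * pz2 % P
    s1 = py * qz2 * qz % P
    s2 = qy * pz2 * pz % P
    return _finish(u1, u2, s1, s2, pz * qz % P)


def jacobian_add_mixed(p, q_affine):
    """Fast path for qz == 1: U1 = px, S1 = py, and nz = H * pz."""
    (px, py, pz), (qx, qy) = p, q_affine
    pz2 = pz * pz % P
    u2 = qx * pz2 % P
    s2 = qy * pz2 * pz % P
    return _finish(px, u2, py, s2, pz)


def to_affine(pt):
    """Convert (x, y, z) back to affine (x / z^2, y / z^3)."""
    x, y, z = pt
    zi = pow(z, -1, P)
    return x * zi * zi % P, y * zi * zi * zi % P
```

Calling `jacobian_add_mixed(p, (qx, qy))` should agree with `jacobian_add(p, (qx, qy, 1))` after conversion to affine coordinates; the mixed version just never touches `qz`.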
Swap the 4-bit window table for an affine [G, 2G, 4G, ..., 2^nBitLength*G] table plus a bit-by-bit width-2 NAF loop. Each non-zero NAF digit costs one mixed add and no doublings, so the ~256 doublings of the windowed method disappear and a 256-bit scalar needs only ~86 adds on average (NAF density is about 1/3). The table is still cached in :persistent_term, keyed by curve name.
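
A minimal Python sketch of the table-plus-NAF idea follows (an illustration, not the project's code). For brevity it uses plain affine addition where the commit uses mixed Jacobian adds, and it rebuilds the table in memory instead of caching it. The names `naf`, `build_table`, `mul_generator` and `mul_reference` are hypothetical.

```python
P = 2**256 - 2**32 - 977  # secp256k1 field prime
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(p, q):
    """Affine addition on y^2 = x^3 + 7; None stands for infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        if (y1 + y2) % P == 0:
            return None
        s = 3 * x1 * x1 * pow(2 * y1, -1, P) % P  # tangent slope, A = 0
    else:
        s = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (s * s - x1 - x2) % P
    return x3, (s * (x1 - x3) - y1) % P


def naf(k):
    """Width-2 NAF digits of k, least significant first."""
    digits = []
    while k > 0:
        d = 2 - (k % 4) if k & 1 else 0  # odd k gives +1 or -1
        k -= d
        digits.append(d)
        k >>= 1
    return digits


def build_table(n_bits=256):
    """[G, 2G, 4G, ..., 2^n_bits * G]; NAF may need one digit past n_bits."""
    table = [G]
    for _ in range(n_bits):
        table.append(add(table[-1], table[-1]))
    return table


def mul_generator(k, table):
    """k*G using only additions: one per non-zero NAF digit."""
    acc = None
    for i, d in enumerate(naf(k)):
        if d:
            x, y = table[i]
            acc = add(acc, (x, y if d == 1 else P - y))
    return acc


def mul_reference(k, pt=G):
    """Plain double-and-add, used only to cross-check the result."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc
```

All doublings happen once, inside `build_table`; `mul_generator` itself performs none.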
Replace raw-binary Shamir for n1*p1 + n2*p2 with JSF (Joint Sparse
Form). JSF picks signed digits in {-1, 0, 1} so that on average ~l/2
digit pairs are non-zero (l being the scalar bit length), versus ~3l/4
for raw binary, cutting the expected number of adds in the simultaneous
double-and-add loop by roughly a third. Used only with public scalars
(verification).
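
As a rough illustration (not the project's code), the Python sketch below computes the JSF digits with the standard algorithm and runs the simultaneous double-and-add loop. To keep it short, integers modulo `m` stand in for curve points, so "doubling" is multiplication by 2 and "adding" is integer addition. The names `jsf`, `_digit` and `shamir_jsf` are hypothetical.

```python
def _digit(l, other):
    """One JSF digit for l, given the other scalar's current value."""
    if l % 2 == 0:
        return 0
    u = 2 - (l % 4)  # +1 or -1
    if l % 8 in (3, 5) and other % 4 == 2:
        u = -u
    return u


def jsf(k1, k2):
    """Joint Sparse Form of two non-negative integers, LSB first."""
    u1, u2 = [], []
    d1 = d2 = 0  # carries
    while k1 + d1 > 0 or k2 + d2 > 0:
        l1, l2 = k1 + d1, k2 + d2
        a, b = _digit(l1, l2), _digit(l2, l1)
        u1.append(a)
        u2.append(b)
        if 2 * d1 == 1 + a:
            d1 = 1 - d1
        if 2 * d2 == 1 + b:
            d2 = 1 - d2
        k1 >>= 1
        k2 >>= 1
    return u1, u2


def shamir_jsf(n1, a, n2, b, m):
    """n1*a + n2*b in the additive group Z/m (a stand-in for curve points)."""
    # One table lookup, hence one add, per non-zero digit pair.
    table = {(x, y): (x * a + y * b) % m
             for x in (-1, 0, 1) for y in (-1, 0, 1)}
    u1, u2 = jsf(n1, n2)
    acc = 0
    for x, y in zip(reversed(u1), reversed(u2)):
        acc = 2 * acc % m  # "double"
        if (x, y) != (0, 0):
            acc = (acc + table[(x, y)]) % m  # "add"
    return acc
```

On a real curve the table would hold the points ±p1, ±p2, ±(p1 + p2) and ±(p1 - p2), with negation obtained by flipping the y coordinate.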
Split each 256-bit scalar k into two ~128-bit scalars (k1, k2) with k = k1 + k2*lambda (mod N) via Babai rounding against the reduced basis, then run a 4-scalar simultaneous multi-exponentiation over (p1, phi(p1), p2, phi(p2)) with a 16-entry table of subset sums. Halves the loop length versus the plain Shamir path. GLV constants live on the curve struct under :glvParams; curves without endomorphism (prime256v1) transparently fall back to Shamir+JSF.
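
The scalar split can be sketched in Python as below (an illustration, not the project's code). It covers only the decomposition k = k1 + k2*lambda (mod N), not the endomorphism phi or the 4-scalar loop. Two assumptions are worth flagging: the reduced basis is derived on the fly with the extended Euclidean algorithm rather than read from stored constants, and lambda is taken as any non-trivial cube root of unity modulo N, whereas a real implementation must use the specific root that matches phi. The names `cube_root_of_unity`, `glv_basis`, `round_div` and `split_scalar` are hypothetical.

```python
from math import isqrt

# Order of the secp256k1 group.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def cube_root_of_unity(n):
    """Some lam != 1 with lam^3 == 1 (mod n); needs n prime, n % 3 == 1."""
    g = 2
    while True:
        lam = pow(g, (n - 1) // 3, n)
        if lam != 1:
            return lam
        g += 1


def glv_basis(n, lam):
    """Two short vectors (a, b) with a + b*lam == 0 (mod n)."""
    rows = [(n, 0), (lam, 1)]  # each row (r, t) satisfies r == t*lam (mod n)
    while rows[-1][0] != 0:
        (r0, t0), (r1, t1) = rows[-2], rows[-1]
        q = r0 // r1
        rows.append((r0 - q * r1, t0 - q * t1))
    root = isqrt(n)
    l = max(i for i, (r, _) in enumerate(rows) if r >= root)
    v1 = (rows[l + 1][0], -rows[l + 1][1])
    v2 = min(((r, -t) for r, t in (rows[l], rows[l + 2])),
             key=lambda v: v[0] ** 2 + v[1] ** 2)
    return v1, v2


def round_div(a, b):
    """Nearest integer to a / b, using integer arithmetic only."""
    if b < 0:
        a, b = -a, -b
    return (2 * a + b) // (2 * b)


def split_scalar(k, v1, v2):
    """Babai rounding: subtract the lattice vector closest to (k, 0)."""
    (a1, b1), (a2, b2) = v1, v2
    det = a1 * b2 - a2 * b1  # equals +n or -n
    c1 = round_div(b2 * k, det)
    c2 = round_div(-b1 * k, det)
    k1 = k - c1 * a1 - c2 * a2
    k2 = -c1 * b1 - c2 * b2
    return k1, k2  # either half may be negative
```

Because k1 and k2 can come out negative, a caller would typically negate the corresponding point and use the absolute value of the scalar.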
Summary
- fromJacobian infinity guard
- [G, 2G, 4G, ..., 2^n*G] table (zero doublings during signing), Shamir's trick with Joint Sparse Form, GLV endomorphism for secp256k1 (splits each 256-bit scalar into two ~128-bit halves for a 4-scalar simultaneous multi-exponentiation during verification)

Test plan
mix test)