secp256k1: Reduce scalar base mult copies.
Profiling shows that around 7.5% of the time in scalar base
multiplication is attributed to duffcopy.  Upon further examination,
this is the result of a combination of the range statement making copies
of the bytes and the need to construct a Jacobian point from the
individual field values stored in the in-memory byte points table.

This commit avoids those copies as follows:

- Perform the conversion to Jacobian once when the affine byte table is
  decompressed from the stored values
- Make use of those Jacobian points directly
- Use an indexed for loop instead of a range over the bytes
- Perform the calculation using the result variable directly instead of
  via a local variable that is copied to the result
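
The copy made by the range statement can be observed directly. The following is a minimal sketch, not code from this package: the function names are hypothetical, and a small int array stands in for the scalar bytes. Per the Go specification, a two-variable range over an array value iterates over a copy of that array, whereas an indexed loop reads each element in place.

```go
package main

// rangeOverValue uses a two-variable range over an array value. The range
// expression is evaluated once up front, so the loop iterates over a copy of
// arr and does not observe the write made during iteration.
func rangeOverValue() [3]int {
	arr := [3]int{1, 2, 3}
	var seen [3]int
	for i, v := range arr {
		arr[2] = 99 // modifies arr, not the copy being ranged over
		seen[i] = v
	}
	return seen
}

// indexedLoop reads each element directly from arr, so no copy of the array
// is involved and the write made during iteration is visible.
func indexedLoop() [3]int {
	arr := [3]int{1, 2, 3}
	var seen [3]int
	for i := 0; i < len(arr); i++ {
		arr[2] = 99
		seen[i] = arr[i]
	}
	return seen
}
```

Here rangeOverValue returns [1 2 3] while indexedLoop returns [1 2 99]. The compiler may elide the copy when it can prove that doing so is unobservable, so whether a copy actually shows up in a profile has to be measured, as was done for this commit.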

The following benchmark results show the speedup is in line with the
expected gains per the profiling results:

name                     old time/op   new time/op    delta
------------------------------------------------------------------------------
ScalarBaseMultNonConst   24.1µs ±22%   22.5µs ± 2%   -6.97%  (p=0.000 n=98+96)
davecgh committed Mar 10, 2022
1 parent 4a6438a commit 9fcf7d6
Showing 2 changed files with 13 additions and 16 deletions.
19 changes: 8 additions & 11 deletions dcrec/secp256k1/curve.go
@@ -891,23 +891,20 @@ func ScalarMultNonConst(k *ModNScalar, point, result *JacobianPoint) {
 func ScalarBaseMultNonConst(k *ModNScalar, result *JacobianPoint) {
 	bytePoints := s256BytePoints()
 
-	// Point Q = ∞ (point at infinity).
-	var q JacobianPoint
+	// Start with the point at infinity.
+	result.X.Zero()
+	result.Y.Zero()
+	result.Z.Zero()
 
 	// bytePoints has all 256 byte points for each 8-bit window. The strategy
 	// is to add up the byte points. This is best understood by expressing k in
 	// base-256 which it already sort of is. Each "digit" in the 8-bit window
 	// can be looked up using bytePoints and added together.
-	var pt JacobianPoint
-	for i, byteVal := range k.Bytes() {
-		p := bytePoints[i][byteVal]
-		pt.X.Set(&p[0])
-		pt.Y.Set(&p[1])
-		pt.Z.SetInt(1)
-		AddNonConst(&q, &pt, &q)
+	kb := k.Bytes()
+	for i := 0; i < len(kb); i++ {
+		pt := &bytePoints[i][kb[i]]
+		AddNonConst(result, pt, result)
 	}
-
-	result.Set(&q)
 }
 
 // isOnCurve returns whether or not the affine point (x,y) is on the curve.
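
The strategy described in the comment above (treat k as a base-256 number and add one precomputed table entry per byte) can be sketched without any curve arithmetic. The sketch below is illustrative only and makes several assumptions: integers modulo a small prime under addition stand in for curve points, the scalar is 32 bits rather than 256, and every name (modulus, generator, byteTable, buildByteTable, scalarBaseMult, naiveMult) is hypothetical rather than taken from the package.

```go
package main

// Toy additive group: integers mod modulus, with generator standing in for
// the curve base point. Both values are arbitrary illustration choices.
const (
	modulus   uint64 = 1000003
	generator uint64 = 5
)

// byteTable[i][b] holds (b * 256^(3-i) * generator) mod modulus, which is the
// contribution of byte value b at big-endian byte position i of the scalar.
type byteTable [4][256]uint64

// buildByteTable precomputes every entry once so that later multiplications
// only need table lookups and additions.
func buildByteTable() *byteTable {
	var t byteTable
	for i := 0; i < len(t); i++ {
		// weight = 256^(3-i) mod modulus for this byte position.
		weight := uint64(1)
		for j := 0; j < len(t)-1-i; j++ {
			weight = (weight * 256) % modulus
		}
		base := (weight * generator) % modulus
		for b := 0; b < len(t[i]); b++ {
			t[i][b] = (uint64(b) * base) % modulus
		}
	}
	return &t
}

// scalarBaseMult computes (k * generator) mod modulus by adding one table
// entry per byte of k, accumulating directly into the result.
func scalarBaseMult(t *byteTable, k uint32) uint64 {
	kb := [4]byte{byte(k >> 24), byte(k >> 16), byte(k >> 8), byte(k)}
	var result uint64 // zero is the identity of the toy group
	for i := 0; i < len(kb); i++ {
		result = (result + t[i][kb[i]]) % modulus
	}
	return result
}

// naiveMult computes the same value directly, for comparison.
func naiveMult(k uint32) uint64 {
	return (uint64(k) % modulus) * generator % modulus
}
```

For any 32-bit k, scalarBaseMult(buildByteTable(), k) and naiveMult(k) agree, because k is the sum of its bytes weighted by powers of 256 and the table stores each weighted term already multiplied by the generator.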
10 changes: 5 additions & 5 deletions dcrec/secp256k1/loadprecomputed.go
@@ -17,7 +17,7 @@ import (
 
 // bytePointTable describes a table used to house pre-computed values for
 // accelerating scalar base multiplication.
-type bytePointTable [32][256][2]FieldVal
+type bytePointTable [32][256]JacobianPoint
 
 // compressedBytePointsFn is set to a real function by the code generation to
 // return the compressed pre-computed values for accelerating scalar base
@@ -66,12 +66,12 @@ var s256BytePoints = func() func() *bytePointTable {
 		for byteNum := 0; byteNum < len(bytePoints); byteNum++ {
 			// All points in this window.
 			for i := 0; i < len(bytePoints[byteNum]); i++ {
-				px := &bytePoints[byteNum][i][0]
-				py := &bytePoints[byteNum][i][1]
-				px.SetByteSlice(serialized[offset:])
+				p := &bytePoints[byteNum][i]
+				p.X.SetByteSlice(serialized[offset:])
 				offset += 32
-				py.SetByteSlice(serialized[offset:])
+				p.Y.SetByteSlice(serialized[offset:])
 				offset += 32
+				p.Z.SetInt(1)
 			}
 		}
 		data = &bytePoints
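
The load-time conversion in the hunk above can be illustrated in isolation. This is a rough sketch under stated assumptions, not the package's implementation: coord is a hypothetical stand-in for FieldVal, jacobianPoint and pointTable only mirror the shape of the real types, and the table is shrunk to 2 windows of 4 values. It shows each entry being filled from consecutive 32-byte X and Y values, with Z set to 1 once at load time so that no per-lookup conversion is needed afterwards.

```go
package main

// coord is a hypothetical stand-in for a 256-bit field value, stored as 32
// big-endian bytes.
type coord [32]byte

// jacobianPoint mirrors the shape of a Jacobian point (X, Y, Z).
type jacobianPoint struct {
	X, Y, Z coord
}

const (
	numWindows = 2 // the table in the commit has 32 windows
	numValues  = 4 // the table in the commit has 256 values per window
)

// pointTable stores ready-to-use Jacobian points rather than bare (x, y)
// pairs.
type pointTable [numWindows][numValues]jacobianPoint

// loadTable deserializes consecutive 32-byte X and Y coordinates into the
// table and sets Z = 1 for every entry. It panics if serialized holds fewer
// than numWindows*numValues*64 bytes.
func loadTable(serialized []byte) *pointTable {
	var table pointTable
	offset := 0
	for w := 0; w < len(table); w++ {
		for i := 0; i < len(table[w]); i++ {
			p := &table[w][i] // pointer into the table, so no element copy
			copy(p.X[:], serialized[offset:offset+32])
			offset += 32
			copy(p.Y[:], serialized[offset:offset+32])
			offset += 32
			p.Z[31] = 1 // big-endian encoding of 1
		}
	}
	return &table
}
```

Storing the Z coordinate costs an extra field value per entry, but lookups can then hand out pointers into the table instead of assembling a fresh point on every iteration.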
