Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

secp256k1: Reduce scalar base mult copies. #2898

Merged
merged 1 commit into from
Mar 18, 2022

Conversation

davecgh
Copy link
Member

@davecgh davecgh commented Mar 9, 2022

This is rebased on #2888.

Profiling shows that around 7.5% of the time in scalar base multiplication is attributed to duffcopy. Upon further examination, this is the result of a combination of the range statement making copies of the bytes and the need to construct a Jacobian point from the individual field values stored in the in-memory byte points table.

This optimizes the function to avoid that as follows:

  • Perform the conversion to Jacobian once when the affine byte table is decompressed from the stored values
  • Make use of those Jacobian points directly
  • Use an indexed for loop instead of a range over the bytes
  • Perform the calculation using the result variable directly instead of via a local variable that is copied to the result

The following benchmark results show the speedup is in line with the expected gains per the profiling results:

name                     old time/op   new time/op    delta
------------------------------------------------------------------------------
ScalarBaseMultNonConst   24.1µs ±22%   22.5µs ± 2%   -6.97%  (p=0.000 n=98+96)

@davecgh davecgh added this to the 1.8.0 milestone Mar 9, 2022
@davecgh davecgh force-pushed the secp256k1_optimize_scalarbasemult branch from 2bbfb34 to 9fcf7d6 Compare March 10, 2022 17:05
@rstaudt2
Copy link
Member

The PR description says this is rebased on #2888, but it looks like it is just based off master currently.

@davecgh davecgh force-pushed the secp256k1_optimize_scalarbasemult branch from 9fcf7d6 to 4037827 Compare March 13, 2022 17:27
@davecgh
Copy link
Member Author

davecgh commented Mar 13, 2022

Ah, I guess I rebased it over master in the last round of updates. It's rebased over 2888 now as intended.

@davecgh davecgh force-pushed the secp256k1_optimize_scalarbasemult branch from 4037827 to 9f550ed Compare March 14, 2022 19:41
Profiling shows that around 7.5% of the time in scalar base
multiplication is attributed to duffcopy.  Upon further examination,
this is the result of a combination of the range statement making copies
of the bytes and the need to construct a Jacobian point from the
individual field values stored in the in-memory byte points table.

This optimizes the function to avoid that as follows:

- Perform the conversion to Jacobian once when the affine byte table is
  decompressed from the stored values
- Make use of those Jacobian points directly
- Use an indexed for loop instead of a range over the bytes
- Perform the calculation using the result variable directly instead of
  via a local variable that is copied to the result

The following benchmark results show the speedup is in line with the
expected gains per the profiling results:

name                     old time/op   new time/op    delta
------------------------------------------------------------------------------
ScalarBaseMultNonConst   24.1µs ±22%   22.5µs ± 2%   -6.97%  (p=0.000 n=98+96)
@davecgh davecgh force-pushed the secp256k1_optimize_scalarbasemult branch from 9f550ed to aae0128 Compare March 18, 2022 00:32
@davecgh davecgh merged commit aae0128 into decred:master Mar 18, 2022
@davecgh davecgh deleted the secp256k1_optimize_scalarbasemult branch March 18, 2022 00:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants