Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize non native field mul for pseudo mersenne primes #164

Conversation

nicholas-mainardi
Copy link

This PR addresses issue #160 , introducing an optimization that allows to save from 30% to 40% constraints for the multiplication of non native field elements when the prime field is a pseudo-mersenne, i.e, the prime modulus p = 2^n - c for a small integer c.

This optimization allows to represent the result of a multiplication as a NonNativeFieldGagdet rather than a NonNativeFieldMulResultGadget, enabling a more efficient reduction. The private functions that deals with multiplication of field elements in a pseudo-mersenne field are added, and they are accordingly called by the publicly exposed multiplication functions in case the non native field is pseudo-mersenne. The definition of the prime field trait in algebra is modified to allow specifying the value of the small integer c for a pseudo-mersenne prime field.

This PR also fixes the estimation of the number of constraint for a multiplication performed in the find_parameters function, adding also the estimation in the case of a pseudo-mersenne field.

Copy link
Collaborator

@DanieleDiBenedetto DanieleDiBenedetto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will return on that to check the arithmetic, from a SW perspective maybe it's worth to explicitly separate the functions (and maybe put the ones for pseudo mersenne in another module ?)

@@ -633,13 +641,13 @@ impl<SimulationF: PrimeField, ConstraintF: PrimeField>
// 2 * bits_per_limb + surfeit' <= CAPACITY - 2,
// ``
// with `surfeit' = log(num_limbs * (num_add(L) + 1) * (num_add(R) + 1) + num_limbs)`.
let params = get_params(SimulationF::size_in_bits(), ConstraintF::size_in_bits());
let num_add_bound = BigUint::from(params.num_limbs)
//ToDo: check this constraint, as by comparison with prereduce constraint a factor of 2 seems to be missing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@UlrichHaboeck75 what do you think ?

Copy link

@UlrichHaboeck75 UlrichHaboeck75 Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm. The factor 2 is missing. The correct formula should be

   surfeit' = log(2 * num_limbs * (num_add(L) + 1) * (num_add(R) + 1) + num_limbs).

This factor is introduced by 2^len(p) / p <= 2 in the limb bound of the quotient k.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove this TODO now no ?

if super::is_pseudo_mersenne::<SimulationF>() {
let res = self.mul_without_prereduce_for_pseudomersenne(cs.ns(|| "mul for pseudo-mersenne"), other, false)?;
// convert res to NonNativeFieldMulResultGadget to be compatible with type returned by the function
return Ok(NonNativeFieldMulResultGadget::from_non_native_field_gadget(&mut cs, &res, &params))
Copy link
Collaborator

@DanieleDiBenedetto DanieleDiBenedetto Jan 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the advantages of this optimization based on Mersenne is that it allows us to return a NonNativeFieldGadget as a result of mul instead of a NonNativeFieldMulResultGadget (on which we can only perform additions); here, however, to keep the interfaces consistent, we are still returning a NonNativeFieldMulResultGadget, obliging us to perform a reduce if we want to do another mul while it shouldn't be needed.
In future we will refactor everything so NonNativeFieldMulResultGadget won't exist anymore but, for this PR, would be nice to test if we are wasting constraints and how much.
I suggest to perform a test that does

let non_native_field_mul_result_gadget= a.mul(b)
let non_native_field_gadget = non_native_field_mul_result_gadget.reduce()
non_native_field_gadget .mul(c)

Using the function we expose to the user and another one:

let non_native_field_gadget = a.mul_mersenne(b)
non_native_field_gadget.mul_mersenne(c)

Using instead our internal function for Mersenne mul.
Let's count number of constraints in both cases and compare

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suspected, the strategy we are using now seems to be suboptimal. However, optimizing in a cleaner way requires most likely a huge refactoring (already envisioned by #152 anyway). I would suggest to postpone it when addressing #152.

@@ -824,14 +992,16 @@ impl<SimulationF: PrimeField, ConstraintF: PrimeField>
/// ``
// Costs
// ``
// C = 3 * S + surfeit + num_limbs(p) + 1,
// C = 3 * S + surfeit + 2*num_limbs(p) + 1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it wrong before ? Why ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused now, too. We implement the improvement using mul_by_constant(), which saves num_limbs(p) constraints. But we find an error in the previous computation, which neglects the number of constraints in of the conditional select, again num_limbs(p) many. Shouldn't we now end up with the old formula

     C = 3 * S + surfeit + num_limbs(p) + 1?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you are right, I forgot to subtract num_limbs from the comment when I change the function to use mul_by_constant. Actually I realized that I have not yet updated the comment with the correct formulas I derived while fixing the find_parameters function, so I will update this comment with the correct estimated cost.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

limbs.extend_from_slice(&input_gagdet.limbs[..]);
// set num_add to ceil(input_gadget.num_add/2^bits_per_limb) as a limb of
// NonNativeFieldMulResultGadget has 2*bits_per_limb for each limb
let num_add_over_normal_form = &input_gagdet.num_of_additions_over_normal_form/BigUint::from(2usize).pow(params.bits_per_limb as u32) + BigUint::from(if &input_gagdet.num_of_additions_over_normal_form % BigUint::from(2usize).pow(params.bits_per_limb as u32) != BigUint::from(0usize) { 1usize } else {0usize});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can simplify the ceiling in this way:

ceil(a/b) = (a + b - 1)/b

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be made a bit more precise:
If n is the number of additions over normal form of the initial gadget "x", then we have that
x < (n+1) * (2^bpl - 1)
which is equivalent to
x < { (n+1) / (2^bpl - 1) } * (2^bpl - 1)^2
It follows, called n' the new number of additions over normal form, n' = ceil ((n+1) / (2^bpl - 1) - 1)
In practice there will be no difference, since this will be always 0.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right about the n+1, I forgot it, thanks. But it is not clear why we should consider 2^bpl-1 as an upper bound for the limb rather than 2^bpl.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about the validity of the implemented formula. In terms of the limb bound factors num_adds + 1, shouldn't we have

   (num_adds + 1) = (num_adds(product) + 1) *  2^bits_per_limb,

and hence num_adds(product) = Floor[(num_adds + 1)/2^bits_per_limb] - 1?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right about the n+1, I forgot it, thanks. But it is not clear why we should consider 2^bpl-1 as an upper bound for the limb rather than 2^bpl.

just because 2^bpl - 1 = (1...1)_2, but it does not make a big difference (in this case no difference in practice).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about the validity of the implemented formula. In terms of the limb bound factors num_adds + 1, shouldn't we have

   (num_adds + 1) = (num_adds(product) + 1) *  2^bits_per_limb,

and hence num_adds(product) = Floor[(num_adds + 1)/2^bits_per_limb] - 1?

num_adds(product) = Floor[(num_adds + 1)/2^bits_per_limb] - 1

This could be a negative number (in practice it will be -1), we need to put ceil instead of floor.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still use 2^bpl since, at least for a non-native field gagdet that is the result of a multiplication, n+1 should be divisible by 2^bpl, hence avoiding rounding.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still use 2^bpl since, at least for a non-native field gagdet that is the result of a multiplication, n+1 should be divisible by 2^bpl, hence avoiding rounding.

It depends if you change the conversion from NonNativeFieldMulResultGadget -> NonNativeFieldGadget
according to my comment below. In that case (n+1) will be a multiple o (2^bpl - 1).

Copy link
Author

@nicholas-mainardi nicholas-mainardi Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm not sure, consider that n+1 is computed in mul_without_prereduce_for_pseudomersenne to be multiple of 2^bpl, it is not initialized from a NonNativeFieldMulResultGadget with the function you referred to.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formula fixed in the code employing 2^bpl

let num_adds_prod: u128 = (num_limbs as u128*num_adds_plus_one*num_adds_plus_one)*(pseudo_mersenne_const as u128*2u128.pow(h as u32)+1)*2u128.pow(bits_per_limb as u32) - 1;
ceil_log2(num_adds_prod+3)+1
} else {
// otherwise, we compute the log without explicitly computing num_adds_prod:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a TODO about possibility to find some library supporting types bigger than 2^128 with const operation avoiding this additional complexity, or perhaps another solution at all @DDT92 @UlrichHaboeck75

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Especially because this works only given our hardcoded choose of SURFEIT

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a TODO about possibility to find some library supporting types bigger than 2^128 with const operation avoiding this additional complexity, or perhaps another solution at all @DDT92 @UlrichHaboeck75

See my comment above

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The solution I propose here is the refactoring as described in #152, the different treatment of parameters (and hence the optimum finder).

if super::is_pseudo_mersenne::<SimulationF>() {
let res = self.mul_without_prereduce_for_pseudomersenne(cs.ns(|| "mul for pseudo-mersenne"), other, false)?;
// convert res to NonNativeFieldMulResultGadget to be compatible with type returned by the function
return Ok(NonNativeFieldMulResultGadget::from_non_native_field_gadget(&mut cs, &res, &params))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suspected, the strategy we are using now seems to be suboptimal. However, optimizing in a cleaner way requires most likely a huge refactoring (already envisioned by #152 anyway). I would suggest to postpone it when addressing #152.

Copy link
Collaborator

@DanieleDiBenedetto DanieleDiBenedetto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments, but overall fine for me

@@ -633,13 +641,13 @@ impl<SimulationF: PrimeField, ConstraintF: PrimeField>
// 2 * bits_per_limb + surfeit' <= CAPACITY - 2,
// ``
// with `surfeit' = log(num_limbs * (num_add(L) + 1) * (num_add(R) + 1) + num_limbs)`.
let params = get_params(SimulationF::size_in_bits(), ConstraintF::size_in_bits());
let num_add_bound = BigUint::from(params.num_limbs)
//ToDo: check this constraint, as by comparison with prereduce constraint a factor of 2 seems to be missing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove this TODO now no ?

@nicholas-mainardi nicholas-mainardi marked this pull request as ready for review February 1, 2022 11:31
@albertog78 albertog78 removed their request for review February 3, 2022 15:11
}

// TODO: This is as the default implementation. I have put it here
// as we can implement an improved variant, which does not reduce
// twice.
// twice.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the comment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the todo is still pending for non pseudo-mersenne fields, no?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Then let us clarify here?

Copy link

@UlrichHaboeck75 UlrichHaboeck75 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the records: Let us add a reminder to overthink the treatment of the generic/special prime discrimination to issue #152. I wouldn't be surprised that we can do better once we drop mulresult gadgets.

@DanieleDiBenedetto DanieleDiBenedetto merged commit 775fe6d into nonnative_doc Feb 10, 2022
@DanieleDiBenedetto DanieleDiBenedetto deleted the Optimize-non-native-field-mul-for-pseudo-mersenne-primes branch March 23, 2023 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimization of the non native arithmetics for pseudo-Mersenne primes.
4 participants