Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

double-and-add algorithm #11

Open
TalDerei opened this issue Mar 10, 2023 · 4 comments
Open

double-and-add algorithm #11

TalDerei opened this issue Mar 10, 2023 · 4 comments
Assignees
Labels

Comments

@TalDerei
Copy link

TalDerei commented Mar 10, 2023

The addition operation in projective coordinates implements operator overloading that computes a scalar multiplication using a naive double-and-add algorithm. Would be worth specializing the operation by switching it with Montgomery multiplication (CIOS) for more efficient multiplications.

Here's an efficient implementation: https://github.com/MinaProtocol/gpu-groth16-prover-3x/blob/master/multiexp/arith.cu#L289. It uses cooperative groups and thread ranks in cuda. For your codebase, it would simply require changing out the 'fixnum' calls to equivalent PTX instructions you implement in 'ptx.cuh'.

@omershlo
Copy link
Member

noted! thanks Tal.
We will improve the double-and-add. Is the Mina code implements the state of the art?
btw, feel free to make a PR

cc: @DmytroTym, @mickeyasa

@TalDerei
Copy link
Author

Sure! It's a stripped down version from cuda-fixnum (extended modular arithmetic library) built by Hamish from Polygon Zero. It's an older repository, but it's performant: https://github.com/data61/cuda-fixnum.

@DmytroTym
Copy link
Contributor

I feel like two different things are at play here: modular multiplication, which is what Montgomery multiplication does, and single-scalar multiplication, which is what double-and-add does. Admittedly, we have room to improve in both. For modular multiplication, we want to see if our scheme based on Barrett reduction can be competitive with Montgomery CIOS, and if not, we'll switch to CIOS. We only have basic Barrett implemented for now. As for double-and-add, you are also right that it's suboptimal, we're planning on moving to more efficient algorithms in the near future.

@DmytroTym
Copy link
Contributor

To comment on this: we haven't implemented a more-efficient single-scalar multiplication than double-and-add yet, but we don't really have any use-cases for it so it's still low priority.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants