-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Montgomery reduction inline asm revisited #55
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good to me. It would be nice anyways to add some kind of source link or documentation in regards the algorithm that is being implemented.
As otherwise, with time, it will be challenging to refactor this if we don't know where it came from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! The assembly is a little bit different than https://github.com/mratsim/constantine/blob/151f284/constantine/math/arithmetic/assembly/limbs_asm_redc_mont_x86_adx_bmi2.nim#L187-L230 but the logic seems same to me, not sure how the #cycle compares tho.
Also it'd be nice to rebase to check with the latest CI, it's weird that I couldn't compile locally (need to move the in(reg)
before out(...)
to compile), not sure what's going on here.
src/bn256/assembly.rs
Outdated
// "mov rdx, {inv}", | ||
// "mulx rcx, rdx, r9", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover?
src/bn256/assembly.rs
Outdated
// "mov rdx, {inv}", | ||
// "mulx rcx, rdx, r10", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover?
src/bn256/assembly.rs
Outdated
// "mov rdx, {inv}", | ||
// "mulx rcx, rdx, r11", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover?
Thanks. Let me check against |
Any news @kilic ?? |
Remote CI also complained so it is fixed too.
Now it should be ready to merge |
This is biproduct of #49 review. So similarly montgommery reduction is written with double carry chain.
montgomery_reduce_short
function is also implemented as @jonathanpwang suggested. For 1M sampleto_repr
function performs like below: