Montgomery reduction inline asm revisited #55

Merged: 4 commits, Aug 3, 2023

Conversation

Collaborator

kilic commented Jun 20, 2023

This is a byproduct of the #49 review. Similarly, Montgomery reduction is now written with a double carry chain. The montgomery_reduce_short function is also implemented, as @jonathanpwang suggested. Over 1M samples, the to_repr function performs as follows:

new to_repr ..................................................................29.586ms
old to_repr ..................................................................35.485ms
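
For readers without the diff at hand, here is a minimal portable sketch of word-by-word Montgomery reduction over four 64-bit limbs. It is not the PR's code: the PR uses hand-written x86-64 assembly with two interleaved carry chains, while this sketch uses a single `u128` carry chain, is not constant-time, and its short variant just zero-pads its input instead of skipping work on the zero limbs. `EXAMPLE_MODULUS`, `mac` and `neg_inv` are hypothetical names introduced for illustration, and the modulus is an arbitrary odd number, not a real curve's prime.

```rust
/// Number of 64-bit limbs (assumption: a 256-bit field, R = 2^256).
const N: usize = 4;

/// Arbitrary odd example modulus, little-endian limbs (hypothetical, not a real field).
const EXAMPLE_MODULUS: [u64; N] = [
    0xffff_ffff_0000_0001,
    0x1234_5678_9abc_def0,
    0x0fed_cba9_8765_4321,
    0x3000_0000_0000_0000,
];

/// Returns (low, high) words of a + b * c + carry; this cannot overflow u128.
fn mac(a: u64, b: u64, c: u64, carry: u64) -> (u64, u64) {
    let t = (a as u128) + (b as u128) * (c as u128) + (carry as u128);
    (t as u64, (t >> 64) as u64)
}

/// Computes -m0^{-1} mod 2^64 for odd m0 via Newton iteration
/// (each step doubles the number of correct low bits: 1, 2, 4, ..., 64).
fn neg_inv(m0: u64) -> u64 {
    let mut inv = 1u64;
    for _ in 0..6 {
        inv = inv.wrapping_mul(2u64.wrapping_sub(m0.wrapping_mul(inv)));
    }
    inv.wrapping_neg()
}

/// Reduces an 8-limb value T < m * R to T * R^{-1} mod m.
fn montgomery_reduce(t: [u64; 2 * N], m: &[u64; N], inv: u64) -> [u64; N] {
    let mut t = t;
    let mut top = 0u64; // carry out of the highest limb touched so far
    for i in 0..N {
        // Choose k so that limb i of t + k * m * 2^(64 * i) becomes zero.
        let k = t[i].wrapping_mul(inv);
        let mut carry = 0u64;
        for j in 0..N {
            let (lo, hi) = mac(t[i + j], k, m[j], carry);
            t[i + j] = lo;
            carry = hi;
        }
        let (s, c1) = t[i + N].overflowing_add(carry);
        let (s, c2) = s.overflowing_add(top);
        t[i + N] = s;
        top = (c1 as u64) + (c2 as u64);
    }
    // The upper half now holds a value below 2m; subtract m once if needed.
    let mut r = [0u64; N];
    let mut borrow = 0u64;
    for j in 0..N {
        let (d, b1) = t[N + j].overflowing_sub(m[j]);
        let (d, b2) = d.overflowing_sub(borrow);
        r[j] = d;
        borrow = (b1 as u64) + (b2 as u64);
    }
    if top >= borrow {
        r
    } else {
        [t[N], t[N + 1], t[N + 2], t[N + 3]]
    }
}

/// Short variant for a 4-limb input: zero-pads and reuses the full routine.
fn montgomery_reduce_short(a: [u64; N], m: &[u64; N], inv: u64) -> [u64; N] {
    let mut t = [0u64; 2 * N];
    t[..N].copy_from_slice(&a);
    montgomery_reduce(t, m, inv)
}
```

As a sanity check on the sketch, reducing `a * R` (low four limbs zero, high four limbs `a`) returns `a` unchanged for any `a` below the modulus, and adding the modulus to the low half does not change the result.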

Member

CPerezz left a comment

The changes look good to me. It would still be nice to add a source link or some documentation for the algorithm being implemented.

Otherwise, over time, it will be hard to refactor this without knowing where it came from.

han0110 self-requested a review June 22, 2023 03:39
Contributor

han0110 left a comment

LGTM! The assembly is a little different from https://github.com/mratsim/constantine/blob/151f284/constantine/math/arithmetic/assembly/limbs_asm_redc_mont_x86_adx_bmi2.nim#L187-L230, but the logic looks the same to me. I'm not sure how the cycle counts compare, though.

Also, it would be nice to rebase and check against the latest CI. Oddly, I couldn't compile locally (I had to move the in(reg) before out(...) to get it to compile); not sure what's going on there.
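
One possible explanation, which the thread does not confirm: Rust's `asm!` expects operands with an explicit register (such as `out("r8")`) to come after operands that use a register class (such as `in(reg)`). The sketch below only illustrates that ordering. `add_with_carry_out` is a hypothetical helper, not taken from the PR, and a plain-Rust fallback is included so the example also builds on non-x86-64 targets.

```rust
/// Returns (a + b mod 2^64, carry-out bit).
#[cfg(target_arch = "x86_64")]
fn add_with_carry_out(a: u64, b: u64) -> (u64, u64) {
    use std::arch::asm;
    let sum: u64;
    let carry: u64;
    unsafe {
        asm!(
            "xor r8d, r8d", // r8 = 0 (also clears the flags)
            "add {a}, {b}", // a += b, sets CF on overflow
            "adc r8, 0",    // r8 = CF
            // Register-class operands come first...
            a = inout(reg) a => sum,
            b = in(reg) b,
            // ...and explicit-register operands go at the end of the list.
            out("r8") carry,
            options(pure, nomem, nostack),
        );
    }
    (sum, carry)
}

/// Portable fallback with the same behavior for other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn add_with_carry_out(a: u64, b: u64) -> (u64, u64) {
    let (sum, overflowed) = a.overflowing_add(b);
    (sum, overflowed as u64)
}
```

Swapping the `out("r8")` line above the `in(reg)` line is the kind of ordering that would be expected to fail to compile under that rule.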

Comment on lines 273 to 274
// "mov rdx, {inv}",
// "mulx rcx, rdx, r9",

Leftover?

Comment on lines 298 to 299
// "mov rdx, {inv}",
// "mulx rcx, rdx, r10",

Leftover?

Comment on lines 325 to 326
// "mov rdx, {inv}",
// "mulx rcx, rdx, r11",

Leftover?

Collaborator Author

kilic commented Jun 27, 2023

> LGTM! The assembly is a little different from https://github.com/mratsim/constantine/blob/151f284/constantine/math/arithmetic/assembly/limbs_asm_redc_mont_x86_adx_bmi2.nim#L187-L230, but the logic looks the same to me. I'm not sure how the cycle counts compare, though.
>
> Also, it would be nice to rebase and check against the latest CI. Oddly, I couldn't compile locally (I had to move the in(reg) before out(...) to get it to compile); not sure what's going on there.

Thanks. Let me check against the constantine implementation before the merge.

Member

CPerezz commented Jul 10, 2023

Any news, @kilic?

Collaborator Author

kilic commented Aug 3, 2023

> Also, it would be nice to rebase and check against the latest CI. Oddly, I couldn't compile locally (I had to move the in(reg) before out(...) to get it to compile); not sure what's going on there.

Remote CI also complained, so that is fixed too.

> Any news, @kilic?

It should now be ready to merge.

CPerezz added this pull request to the merge queue Aug 3, 2023
Merged via the queue into privacy-scaling-explorations:main with commit d3c6a74 Aug 3, 2023
3 participants