crypto/rsa: Go 1.21 follow-up work #57752

Open
7 of 12 tasks
FiloSottile opened this issue Jan 12, 2023 · 10 comments
Comments

@gopherbot
Contributor

Change https://go.dev/cl/492935 mentions this issue: crypto/rsa,crypto/internal/bigmod: optimized short exponentiations

@gopherbot
Contributor

Change https://go.dev/cl/471259 mentions this issue: crypto/internal/bigmod: switch to saturated limbs

@gopherbot
Contributor

Change https://go.dev/cl/492955 mentions this issue: crypto/ed25519,crypto/rsa: make Equal methods constant time

gopherbot pushed a commit that referenced this issue May 17, 2023
Fixes #53849
Updates #57752

Change-Id: I055564f31a47c79565b82bf9844fcf626989b295
Reviewed-on: https://go-review.googlesource.com/c/go/+/492955
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue May 24, 2023
Turns out that unsaturated limbs being more performant for Montgomery
multiplication was true in portable C89, but is now a misconception.
With add-with-carry instructions, it's possible to run the carry chain
across the limbs, instead of needing the limb-by-limb product to fit in
two words.
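
As a rough illustration of the carry chain described above, here is a minimal sketch of a multiply-accumulate over saturated 64-bit limbs using `math/bits`. The function name echoes a `math/big` helper, but the body is an assumption written for this example and is not the code from the CL.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addMulVVW computes z += x * y over saturated (full 64-bit) little-endian
// limbs and returns the final carry word. It is an illustrative sketch only.
// It requires len(x) >= len(z).
func addMulVVW(z, x []uint64, y uint64) (carry uint64) {
	for i := range z {
		hi, lo := bits.Mul64(x[i], y) // full 128-bit product of two limbs
		var c uint64
		lo, c = bits.Add64(lo, z[i], 0) // add the existing limb
		hi, _ = bits.Add64(hi, 0, c)    // propagate into the high word
		lo, c = bits.Add64(lo, carry, 0) // add the carry from the previous limb
		hi, _ = bits.Add64(hi, 0, c)
		z[i] = lo
		carry = hi // cannot overflow: x*y + z + carry < 2^128
	}
	return carry
}

func main() {
	max := ^uint64(0)
	z := []uint64{max, max}
	x := []uint64{max, max}
	carry := addMulVVW(z, x, max)
	fmt.Printf("z = [%#x %#x], carry = %#x\n", z[0], z[1], carry)
}
```

On amd64 the compiler can lower `bits.Add64` and `bits.Mul64` to add-with-carry and widening multiply instructions, which is what lets every limb use all 64 bits.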

Switch to saturated limbs, and import the same Montgomery loop as
math/big, along with its assembly for some architectures. Since here we
know the sizes we care about, we can drop most of the assembly
scaffolding. For amd64, ported to avo, too.

We recover all the Go 1.20 performance loss on private key operations on
both Intel Xeon and AMD EPYC, with even a 10% improvement over Go 1.19
(which used variable-time math/big) for some operations.

goos: linux
goarch: amd64
pkg: crypto/rsa
cpu: Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
                       │ go1.19.txt  │       go1.20.txt         │         new.txt          │
                       │   sec/op    │    sec/op      vs base   │    sec/op      vs base   │
DecryptPKCS1v15/2048-4   1.175m ± 0%     1.515m ± 0%    +28.95%     1.132m ± 0%     -3.59%
DecryptPKCS1v15/3072-4   3.428m ± 1%     4.516m ± 0%    +31.75%     3.198m ± 0%     -6.69%
DecryptPKCS1v15/4096-4   7.405m ± 0%    10.092m ± 0%    +36.29%     6.446m ± 0%    -12.95%
EncryptPKCS1v15/2048-4   7.426µ ± 0%   170.829µ ± 0%  +2200.57%   131.874µ ± 0%  +1675.97%
DecryptOAEP/2048-4       1.175m ± 0%     1.524m ± 0%    +29.68%     1.137m ± 0%     -3.26%
EncryptOAEP/2048-4       9.609µ ± 0%   173.008µ ± 0%  +1700.48%   132.344µ ± 0%  +1277.29%
SignPKCS1v15/2048-4      1.181m ± 0%     1.563m ± 0%    +32.34%     1.177m ± 0%     -0.37% 
VerifyPKCS1v15/2048-4    6.452µ ± 0%   170.092µ ± 0%  +2536.06%   131.225µ ± 0%  +1933.70%
SignPSS/2048-4           1.184m ± 0%     1.574m ± 0%    +32.88%     1.175m ± 0%     -0.84%
VerifyPSS/2048-4         9.151µ ± 1%   172.909µ ± 0%  +1789.50%   132.391µ ± 0%  +1346.74%

                       │  go1.19.txt   │      go1.20.txt       │       new.txt         │
                       │     B/op      │     B/op      vs base │     B/op      vs base │
DecryptPKCS1v15/2048-4    24266.5 ± 0%     640.0 ± 0%  -97.36%     640.0 ± 0%  -97.36%
DecryptPKCS1v15/3072-4   45.465Ki ± 0%   3.375Ki ± 0%  -92.58%   4.688Ki ± 0%  -89.69%
DecryptPKCS1v15/4096-4   61.080Ki ± 0%   4.625Ki ± 0%  -92.43%   6.250Ki ± 0%  -89.77%
EncryptPKCS1v15/2048-4    3.138Ki ± 0%   1.146Ki ± 0%  -63.49%   1.082Ki ± 0%  -65.52%
DecryptOAEP/2048-4        24500.0 ± 0%     872.0 ± 0%  -96.44%     872.0 ± 0%  -96.44%
EncryptOAEP/2048-4        3.610Ki ± 0%   1.371Ki ± 0%  -62.02%   1.308Ki ± 0%  -63.78%
SignPKCS1v15/2048-4       26933.0 ± 0%     896.0 ± 0%  -96.67%     896.0 ± 0%  -96.67%
VerifyPKCS1v15/2048-4      3209.0 ± 0%     912.0 ± 0%  -71.58%     848.0 ± 0%  -73.57%
SignPSS/2048-4           26.940Ki ± 0%   1.266Ki ± 0%  -95.30%   1.266Ki ± 0%  -95.30%
VerifyPSS/2048-4          3.337Ki ± 0%   1.094Ki ± 0%  -67.22%   1.031Ki ± 0%  -69.10%

                       │  go1.19.txt  │     go1.20.txt      │      new.txt          │
                       │  allocs/op   │ allocs/op   vs base │ allocs/op   vs base   │
DecryptPKCS1v15/2048-4    97.000 ± 0%   4.000 ± 0%  -95.88%     4.000 ± 0%  -95.88%
DecryptPKCS1v15/3072-4    107.00 ± 0%   10.00 ± 0%  -90.65%     12.00 ± 0%  -88.79%
DecryptPKCS1v15/4096-4    113.00 ± 0%   10.00 ± 0%  -91.15%     12.00 ± 0%  -89.38%
EncryptPKCS1v15/2048-4     7.000 ± 0%   7.000 ± 0%        ~     7.000 ± 0%        ~  
DecryptOAEP/2048-4        103.00 ± 0%   10.00 ± 0%  -90.29%     10.00 ± 0%  -90.29%
EncryptOAEP/2048-4         14.00 ± 0%   13.00 ± 0%   -7.14%     13.00 ± 0%   -7.14%
SignPKCS1v15/2048-4      102.000 ± 0%   5.000 ± 0%  -95.10%     5.000 ± 0%  -95.10%
VerifyPKCS1v15/2048-4      7.000 ± 0%   6.000 ± 0%  -14.29%     6.000 ± 0%  -14.29%
SignPSS/2048-4            108.00 ± 0%   10.00 ± 0%  -90.74%     10.00 ± 0%  -90.74%
VerifyPSS/2048-4           12.00 ± 0%   11.00 ± 0%   -8.33%     11.00 ± 0%   -8.33%

goos: linux
goarch: amd64
pkg: crypto/rsa
cpu: AMD EPYC 7R13 Processor
                       │ go1.19a.txt │       go1.20a.txt        │        newa.txt          │
                       │   sec/op    │    sec/op      vs base   │    sec/op      vs base   │
DecryptPKCS1v15/2048-4   970.0µ ± 0%    1667.6µ ± 0%    +71.92%     951.6µ ± 0%     -1.90%
DecryptPKCS1v15/3072-4   2.949m ± 0%     5.124m ± 0%    +73.75%     2.675m ± 0%     -9.29%
DecryptPKCS1v15/4096-4   6.350m ± 0%    11.660m ± 0%    +83.62%     5.746m ± 0%     -9.51%
EncryptPKCS1v15/2048-4   6.605µ ± 1%   183.807µ ± 0%  +2683.05%   123.720µ ± 0%  +1773.27%
DecryptOAEP/2048-4       973.8µ ± 0%    1670.8µ ± 0%    +71.57%     951.8µ ± 0%     -2.27% 
EncryptOAEP/2048-4       8.444µ ± 1%   185.889µ ± 0%  +2101.56%   124.142µ ± 0%  +1370.27%
SignPKCS1v15/2048-4      976.8µ ± 0%    1725.5µ ± 0%    +76.65%     979.6µ ± 0%     +0.28%
VerifyPKCS1v15/2048-4    5.713µ ± 0%   182.983µ ± 0%  +3103.19%   122.737µ ± 0%  +2048.56%
SignPSS/2048-4           980.3µ ± 0%    1729.5µ ± 0%    +76.42%     985.7µ ± 3%     +0.55%
VerifyPSS/2048-4         8.168µ ± 1%   185.312µ ± 0%  +2168.76%   123.772µ ± 0%  +1415.33%

Fixes #59463
Fixes #59442
Updates #57752

Change-Id: I311a9c1f4f5288e47e53ca14f615a443f3132734
Reviewed-on: https://go-review.googlesource.com/c/go/+/471259
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue May 24, 2023
RSA encryption and verification perform an exponentiation by a value
that is usually just a few bits long. The current strategy with table
precomputation is not efficient.

Add an ExpShort bigmod method, and use it in RSA public key operations.

After this, almost all CPU time in encryption/verification is spent
preparing the constants for the modulus, because PublicKey doesn't have
a Precompute function.

This speeds up signing a bit too, because it performs a verification to
protect against faults.

name                    old time/op  new time/op  delta
DecryptPKCS1v15/2048-4  1.13ms ± 0%  1.13ms ± 0%   -0.43%  (p=0.000 n=8+9)
DecryptPKCS1v15/3072-4  3.20ms ± 0%  3.15ms ± 0%   -1.59%  (p=0.000 n=10+8)
DecryptPKCS1v15/4096-4  6.45ms ± 0%  6.42ms ± 0%   -0.49%  (p=0.000 n=10+10)
EncryptPKCS1v15/2048-4   132µs ± 0%   108µs ± 0%  -17.99%  (p=0.000 n=10+10)
DecryptOAEP/2048-4      1.13ms ± 0%  1.14ms ± 0%   +0.91%  (p=0.000 n=10+10)
EncryptOAEP/2048-4       132µs ± 0%   108µs ± 0%  -18.09%  (p=0.000 n=10+10)
SignPKCS1v15/2048-4     1.18ms ± 0%  1.14ms ± 1%   -3.30%  (p=0.000 n=10+10)
VerifyPKCS1v15/2048-4    131µs ± 0%   107µs ± 0%  -18.30%  (p=0.000 n=9+10)
SignPSS/2048-4          1.18ms ± 0%  1.15ms ± 1%   -1.87%  (p=0.000 n=10+10)
VerifyPSS/2048-4         132µs ± 0%   108µs ± 0%  -18.30%  (p=0.000 n=10+9)

Updates #57752

Change-Id: Ic89273a58002b32b1c5c3185a35262694ceef409
Reviewed-on: https://go-review.googlesource.com/c/go/+/492935
Run-TryBot: Filippo Valsorda <filippo@golang.org>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
@gopherbot
Contributor

Change https://go.dev/cl/504879 mentions this issue: doc/go1.21: correct GOOS to GOARCH

gopherbot pushed a commit that referenced this issue Jun 21, 2023
For #57752
Fixes #60924

Change-Id: Ie1e16c041885abb51dd6c2f0b7dfa03091cfb338
Reviewed-on: https://go-review.googlesource.com/c/go/+/504879
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Eli Bendersky <eliben@google.com>
TryBot-Bypass: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
nkhogen added a commit to yugabyte/yugabyte-db that referenced this issue Jul 4, 2023
…over network with high latency

Summary:
This is an ongoing fix. It reduces memory consumption by removing the write to a tmp buffer when it is no longer needed, plus some cleanups around user.

This is mostly a Go crypto issue that is being addressed in the 1.21 release: golang/go#57752

Test Plan: Create universe

Reviewers: amalyshev, cwang

Reviewed By: amalyshev

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D26586
nkhogen added a commit to yugabyte/yugabyte-db that referenced this issue Jul 6, 2023
…dpoint is slow over network with high latency

Summary:
Original diff - https://phorge.dev.yugabyte.com/D26586 (70b7ac9)

This is an ongoing fix. It reduces memory consumption by removing the write to a tmp buffer when it is no longer needed, plus some cleanups around user.

This is mostly a Go crypto issue that is being addressed in the 1.21 release: golang/go#57752

Test Plan: Create universe

Reviewers: amalyshev, cwang

Reviewed By: amalyshev

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D26644
bradfitz pushed a commit to tailscale/go that referenced this issue Jul 15, 2023
For golang#57752
Fixes golang#60924

Change-Id: Ie1e16c041885abb51dd6c2f0b7dfa03091cfb338
Reviewed-on: https://go-review.googlesource.com/c/go/+/504879
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Eli Bendersky <eliben@google.com>
TryBot-Bypass: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
@gopherbot gopherbot modified the milestones: Go1.21, Go1.22 Aug 8, 2023
qiulaidongfeng added a commit to qiulaidongfeng/go that referenced this issue Oct 3, 2023
…s a pointer

Fixes golang#49136
For golang#57752

Change-Id: I6199ee5e1aa3b1ea7a152f5a5321d2c474a7781c
@gopherbot
Contributor

Change https://go.dev/cl/532335 mentions this issue: crypto/rsa: PublicKey.Equal() documentation specifies that it requires a pointer
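
For context on the pointer requirement, here is a small sketch of how `PublicKey.Equal` behaves when given a pointer versus a value. The toy modulus and the `demoEqual` helper are assumptions made for this example and are far too small for real use.

```go
package main

import (
	"crypto/rsa"
	"fmt"
	"math/big"
)

// demoEqual compares two identical toy public keys, once passing a
// *rsa.PublicKey and once passing an rsa.PublicKey value.
func demoEqual() (withPointer, withValue bool) {
	a := &rsa.PublicKey{N: big.NewInt(3233), E: 17}
	b := &rsa.PublicKey{N: big.NewInt(3233), E: 17}
	// Equal type-asserts its argument to *rsa.PublicKey, so a value
	// argument never matches, even when the fields are identical.
	return a.Equal(b), a.Equal(*b)
}

func main() {
	p, v := demoEqual()
	fmt.Println("pointer:", p, "value:", v)
}
```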

@qiulaidongfeng
Member

@FiloSottile Can you review CL 532335?

qiulaidongfeng added a commit to qiulaidongfeng/go that referenced this issue Dec 9, 2023
…s a pointer

Fixes golang#49136
For golang#57752

Change-Id: I6199ee5e1aa3b1ea7a152f5a5321d2c474a7781c
qiulaidongfeng added a commit to qiulaidongfeng/go that referenced this issue Jan 9, 2024
…s a pointer

Fixes golang#49136
For golang#57752

Change-Id: I6199ee5e1aa3b1ea7a152f5a5321d2c474a7781c
@gopherbot gopherbot modified the milestones: Go1.22, Go1.23 Feb 6, 2024
@gopherbot gopherbot removed this from the Go1.23 milestone Aug 13, 2024
@gopherbot gopherbot added this to the Go1.24 milestone Aug 13, 2024
@hkishn
Contributor

hkishn commented Sep 18, 2024

Is this issue fixed? I can see a similar performance drop in Go 1.21 on x86_64. On profiling, I noticed that rsa.(*PrivateKey).Precompute is taking more time.

@FiloSottile
Contributor Author

@hkishn Go 1.21 is unsupported. The performance regression was fixed in Go 1.22.
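
Independent of the Go version, one way to keep `Precompute` out of the hot path is to call it once and reuse the same `*rsa.PrivateKey` for every operation. The sketch below assumes a hypothetical `signer` wrapper; the type and its method names are invented for this example.

```go
package main

import (
	"crypto"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"fmt"
)

// signer is a hypothetical wrapper that holds one long-lived key.
type signer struct{ key *rsa.PrivateKey }

// newSigner generates a key and runs Precompute once up front, so later
// private key operations reuse the cached values.
func newSigner(bits int) (*signer, error) {
	key, err := rsa.GenerateKey(rand.Reader, bits)
	if err != nil {
		return nil, err
	}
	key.Precompute() // safe to call more than once
	return &signer{key: key}, nil
}

// sign produces a PKCS #1 v1.5 signature over SHA-256(msg).
func (s *signer) sign(msg []byte) ([]byte, error) {
	digest := sha256.Sum256(msg)
	return rsa.SignPKCS1v15(rand.Reader, s.key, crypto.SHA256, digest[:])
}

// verify checks a signature produced by sign.
func (s *signer) verify(msg, sig []byte) error {
	digest := sha256.Sum256(msg)
	return rsa.VerifyPKCS1v15(&s.key.PublicKey, crypto.SHA256, digest[:], sig)
}

func main() {
	s, err := newSigner(2048)
	if err != nil {
		panic(err)
	}
	sig, err := s.sign([]byte("hello"))
	if err != nil {
		panic(err)
	}
	fmt.Println("verified:", s.verify([]byte("hello"), sig) == nil)
}
```

If profiling still shows `Precompute` on every request, it may be worth checking whether the key is being re-parsed or rebuilt per call rather than shared.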

@hkishn
Contributor

hkishn commented Sep 18, 2024

@FiloSottile
I referred to https://tip.golang.org/doc/go1.21. It says the regression is fixed in Go 1.21.
image

@hkishn
Contributor

hkishn commented Sep 19, 2024

@FiloSottile is the doc incorrect?
