
🚨 Performance regression in #279 #281

Open
github-actions bot opened this issue Jan 24, 2024 · 5 comments

Comments

@github-actions

Regression >= 30.0% found during merge of: #279
Commit: 5da330e
Triggered by: https://github.com/lurk-lab/arecibo/actions/runs/7646169891

@huitseeker

Direct link to comment: 5da330e#commitcomment-137748136

Looking at the prior commit, db4375f, as well as a handful of commits before that, I think the numbers for the "before" baseline were spurious.

@samuelburnham this couldn't have been a caching issue (unpacking a prior bench result that isn't quite from the same machine), could it? We don't have caching set up.

@huitseeker

huitseeker commented Feb 9, 2024

OK, so I've run this benchmarking procedure on three interesting commits, all of which resolve grumpkin to:
https://github.com/lurk-lab/grumpkin-msm?branch=dev#29af3cd2
Namely: 5da330e, which resolves to the commit of a#271; db4375f, which resolves to its parent; and 6fde742, which is where the above regression was reproduced.

CompressedSNARK-NIVC-1

|  | ref=5da330e-cuda | ref=5da330e | ref=6fde742-cuda | ref=6fde742 | ref=db4375f-cuda | ref=db4375f |
|---|---|---|---|---|---|---|
| Prove-NumCons-6540 | 387.09 ms (✅ 1.00x) | 385.93 ms (✅ 1.00x faster) | 377.40 ms (✅ 1.03x faster) | 386.35 ms (✅ 1.00x faster) | 387.14 ms (✅ 1.00x slower) | 385.18 ms (✅ 1.00x faster) |
| Verify-NumCons-6540 | 28.78 ms (✅ 1.00x) | 28.81 ms (✅ 1.00x slower) | 29.01 ms (✅ 1.01x slower) | 28.97 ms (✅ 1.01x slower) | 28.89 ms (✅ 1.00x slower) | 28.91 ms (✅ 1.00x slower) |

CompressedSNARK-NIVC-2

|  | ref=5da330e-cuda | ref=5da330e | ref=6fde742-cuda | ref=6fde742 | ref=db4375f-cuda | ref=db4375f |
|---|---|---|---|---|---|---|
| Prove-NumCons-6540 | 402.17 ms (✅ 1.00x) | 397.98 ms (✅ 1.01x faster) | 397.78 ms (✅ 1.01x faster) | 404.92 ms (✅ 1.01x slower) | 400.66 ms (✅ 1.00x faster) | 405.44 ms (✅ 1.01x slower) |
| Verify-NumCons-6540 | 29.36 ms (✅ 1.00x) | 29.94 ms (✅ 1.02x slower) | 29.96 ms (✅ 1.02x slower) | 29.66 ms (✅ 1.01x slower) | 29.60 ms (✅ 1.01x slower) | 29.46 ms (✅ 1.00x slower) |

CompressedSNARK-NIVC-Commitments-2

|  | ref=5da330e-cuda | ref=5da330e | ref=6fde742-cuda | ref=6fde742 | ref=db4375f-cuda | ref=db4375f |
|---|---|---|---|---|---|---|
| Prove-NumCons-6540 | 9.49 s (✅ 1.00x) | 6.64 s (✅ 1.43x faster) | 9.38 s (✅ 1.01x faster) | 6.66 s (✅ 1.43x faster) | 6.63 s (✅ 1.43x faster) | 6.65 s (✅ 1.43x faster) |
| Verify-NumCons-6540 | 58.11 ms (✅ 1.00x) | 246.31 ms (❌ 4.24x slower) | 57.48 ms (✅ 1.01x faster) | 248.26 ms (❌ 4.27x slower) | 246.51 ms (❌ 4.24x slower) | 246.57 ms (❌ 4.24x slower) |
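
For reference, the Prove/Verify rows in these tables come from criterion benchmark groups. Below is a minimal, self-contained sketch of that shape, with a toy workload standing in for the real CompressedSNARK prove/verify calls; the function bodies and names are placeholders, not arecibo's actual bench harness.

```rust
// Minimal sketch of a criterion bench split into Prove / Verify timings.
// `prove` and `verify` are toy stand-ins, NOT the real arecibo API.
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn prove(witness: &[u64]) -> u64 {
    witness.iter().fold(0u64, |acc, w| acc.wrapping_add(*w))
}

fn verify(proof: u64, witness: &[u64]) -> bool {
    proof == prove(witness)
}

fn bench_compressed_snark(c: &mut Criterion) {
    // 6540 mirrors the NumCons parameter in the tables above; here it only sizes the toy witness.
    let witness: Vec<u64> = (0..6540).collect();
    let mut group = c.benchmark_group("CompressedSNARK-NIVC-1");

    group.bench_function("Prove-NumCons-6540", |b| {
        b.iter(|| prove(black_box(&witness)))
    });

    let proof = prove(&witness);
    group.bench_function("Verify-NumCons-6540", |b| {
        b.iter(|| verify(black_box(proof), black_box(&witness)))
    });

    group.finish();
}

criterion_group!(benches, bench_compressed_snark);
criterion_main!(benches);
```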

@huitseeker

I tested on the penultimate commit of a#290, which contains no use of the MSM in the IPA (see the PR description for why).

This does tell us (as expected, see a#290) that using the GPU-MSM code path in the IPA (remember, this never resolves to an actual GPU MSM, just SN's CPU one) is faster in both cases (6 s vs. 9 s without cuda, 9 s vs. 12 s with), but that the proving/verifying discrepancy is due to something else.

Benchmark Results

CompressedSNARK-NIVC-1

|  | ref=c43fd4c | ref=c43fd4c-cuda |
|---|---|---|
| Prove-NumCons-6540 | 559.73 ms (✅ 1.00x) | 552.84 ms (✅ 1.01x faster) |
| Verify-NumCons-6540 | 28.94 ms (✅ 1.00x) | 29.26 ms (✅ 1.01x slower) |

CompressedSNARK-NIVC-2

|  | ref=c43fd4c | ref=c43fd4c-cuda |
|---|---|---|
| Prove-NumCons-6540 | 578.43 ms (✅ 1.00x) | 571.01 ms (✅ 1.01x faster) |
| Verify-NumCons-6540 | 29.65 ms (✅ 1.00x) | 29.71 ms (✅ 1.00x slower) |

CompressedSNARK-NIVC-Commitments-2

|  | ref=c43fd4c | ref=c43fd4c-cuda |
|---|---|---|
| Prove-NumCons-6540 | 9.34 s (✅ 1.00x) | 12.20 s (❌ 1.31x slower) |
| Verify-NumCons-6540 | 247.61 ms (✅ 1.00x) | 57.40 ms (🚀 4.31x faster) |
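
To make the terminology concrete, here is a hypothetical sketch (toy types and illustrative names, not arecibo's actual commitment engine) of the kind of feature-gated dispatch meant by "the GPU-MSM code path" above; the point is that even the `cuda` branch currently ends up executing a CPU MSM inside grumpkin-msm.

```rust
// Toy "group element" and "scalar" so the sketch compiles without any curve crates.
#[derive(Clone, Copy, Default)]
pub struct G(pub u64);
#[derive(Clone, Copy)]
pub struct S(pub u64);

// Naive CPU MSM: sum of scalar multiplications (wrapping arithmetic on the toy types).
fn msm_cpu(bases: &[G], scalars: &[S]) -> G {
    G(bases
        .iter()
        .zip(scalars)
        .fold(0u64, |acc, (b, s)| acc.wrapping_add(b.0.wrapping_mul(s.0))))
}

// "GPU-MSM code path": with the `cuda` feature on, the real code would route this
// through grumpkin-msm (hypothetical call site), which on the machine discussed here
// still resolves to a CPU implementation; we reuse `msm_cpu` to keep the sketch
// self-contained.
#[cfg(feature = "cuda")]
pub fn msm(bases: &[G], scalars: &[S]) -> G {
    msm_cpu(bases, scalars)
}

// Plain CPU path used when the `cuda` feature is off.
#[cfg(not(feature = "cuda"))]
pub fn msm(bases: &[G], scalars: &[S]) -> G {
    msm_cpu(bases, scalars)
}
```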

@huitseeker

TL;DR: this is not a regression, but rather an inherent property of the batched approach (introduced in #131) that it performs better without GPU acceleration. This is not due to the final (IPA, in our benches) commitment.

@huitseeker

Tested yesterday: removing the parallelism in commitments on the CUDA path; no change.
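
For the record, a hypothetical sketch of what that experiment looks like: switching the per-instance commitment loop from rayon's parallel iterator to a plain sequential one. The `commit` function below is a toy stand-in, not the real commitment engine.

```rust
// Sketch of "removing parallelism in commitments": compute the per-instance
// commitments sequentially instead of with rayon. `commit` is a toy stand-in
// for an MSM-based commitment.
use rayon::prelude::*;

fn commit(witness: &[u64]) -> u64 {
    witness.iter().fold(0u64, |acc, w| acc.wrapping_add(*w))
}

// Parallel version: one rayon task per instance.
fn commit_all_parallel(witnesses: &[Vec<u64>]) -> Vec<u64> {
    witnesses.par_iter().map(|w| commit(w)).collect()
}

// Sequential version: the "parallelism removed" variant tested above.
fn commit_all_sequential(witnesses: &[Vec<u64>]) -> Vec<u64> {
    witnesses.iter().map(|w| commit(w)).collect()
}
```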
