Optimize tree-building for neptune/opencl (gpu2 flag). #1422

porcuquine · 2021-03-02T22:41:47Z

This PR supersedes #1420 — since we have decided to keep neptune/opencl behind the gpu2 feature flag for now.

@vmx Can you rebase your CI work on this branch and continue here? I made one modification to the line that actually runs the lifecycle tests. The previous version didn't set the flags, but the new one should correctly specify gpu2.

This commit makes the test actually run.

porcuquine · 2021-03-02T23:46:59Z

Okay, I moved @vmx's CI work over to this branch and adapted it to run lifecycle tests for both gpu and gpu2 features. These are now passing, and I've also run small passing lifecycle tests locally for both.

I've also verified that both the gpu and gpu2 code paths still perform as expected based on current master and the equivalent (but differently factored) code in #1420.

I think this is ready to go once the rest of CI passes.

qy3u · 2021-03-04T03:38:09Z

storage-proofs-porep/src/stacked/vanilla/proof.rs

+                                    let layer_data: &mut Vec<_> = &mut layer_data;
+
+                                    // gather all layer data in parallel.
+                                    s.spawn(move |_| {


I think you may want to spawn inside the iteration. The current implementation doesn't actually seem to run in parallel.

You're right that this spawn seems unnecessary — not sure if it was always like that or got introduced in a refactor at some point. The change you suggest doesn't seem to help though (in my benchmark at least): the bottleneck is in the conversion that follows. Since the goal now is to release with minimal changes, and since all options (status quo, actually spawning in loop, just removing the whole scope/spawn) seem to perform equivalently, I think we should just leave as-is for now — to minimize change and risk, given that this has already been tested/benchmarked/etc quite a bit. It can be removed completely in some future cleanup that aims to improve further.
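To make the two shapes under discussion concrete, here is a minimal sketch. It is not the code from `proof.rs`: it assumes `std::thread::scope` in place of the scoped spawn used in the diff, and `gather` is a hypothetical stand-in for collecting one layer's data. It only shows where the spawn sits relative to the loop and says nothing about performance, which per the benchmark above is dominated by the conversion that follows.

```rust
use std::thread;

// Hypothetical stand-in for gathering the data of a single layer.
fn gather(layer: usize, len: usize) -> Vec<u64> {
    (0..len).map(|i| (layer * len + i) as u64).collect()
}

// Status quo shape: one spawn wraps the whole loop, so the layers are
// still gathered one after another on a single worker thread.
fn gather_single_spawn(layers: usize, len: usize) -> Vec<Vec<u64>> {
    let mut layer_data: Vec<Vec<u64>> = vec![Vec::new(); layers];
    thread::scope(|s| {
        let layer_data: &mut Vec<_> = &mut layer_data;
        s.spawn(move || {
            for (layer, data) in layer_data.iter_mut().enumerate() {
                *data = gather(layer, len);
            }
        });
    });
    layer_data
}

// Suggested shape: one spawn per iteration, so each layer can be
// gathered on its own thread. The scope joins all threads on exit.
fn gather_spawn_per_layer(layers: usize, len: usize) -> Vec<Vec<u64>> {
    let mut layer_data: Vec<Vec<u64>> = vec![Vec::new(); layers];
    thread::scope(|s| {
        for (layer, data) in layer_data.iter_mut().enumerate() {
            s.spawn(move || {
                *data = gather(layer, len);
            });
        }
    });
    layer_data
}
```

Both functions return the same data; the third option mentioned above (removing the scope and spawn entirely) would simply be the plain loop from the first function.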

qy3u · 2021-03-04T03:41:33Z

storage-proofs-porep/src/stacked/vanilla/proof.rs

+                                    let layer_data: &mut Vec<_> = &mut layer_data;
+
+                                    // gather all layer data in parallel.
+                                    s.spawn(move |_| {


@porcuquine let's remove the unneeded spawn, at least from the new code path.

I have this in testing now with a new issue prepped for the various improvements noted here. If my testing goes ok, I'll merge this and open that issue.

cryptonemo · 2021-03-05T12:57:01Z

Would be interesting to see data on larger/actual tests, but here's a GPU comparison of a small test (32KiB seal lifecycle using first gpu and then gpu2).

Optimize tree-building for neptune/opencl (gpu2 flag).

1cd4c84

porcuquine requested review from cryptonemo and dignifiedquire as code owners March 2, 2021 22:41

Fix CI

9dcd4aa

This commit makes the test actually run.

porcuquine mentioned this pull request Mar 3, 2021

Optimize opencl and make it default gpu feature. #1420

Closed

dignifiedquire approved these changes Mar 3, 2021


qy3u reviewed Mar 4, 2021


cryptonemo merged commit f937802 into master Mar 5, 2021

cryptonemo deleted the feat/optimize-tree-building-prime branch March 5, 2021 13:29

cryptonemo mentioned this pull request Mar 5, 2021

Improve RAM performance and organization of GPU2 work #1426

Closed
