Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance by reverting Apple silicon workaround (root problem is fixed upstream) #1456

Merged
merged 6 commits into from
Feb 22, 2024

Conversation

jackryanservia
Copy link
Contributor

@jackryanservia jackryanservia commented Feb 22, 2024

Closes: #1337 ;)
Bindings PR: o1-labs/o1js-bindings#251

  • Spawns one worker for each physical core by default (for some reason, numCpus - 1 was faster a month ago, but when I tested today, numCores was slightly faster).
  • Exposes setNumberOfWorkers() to allow developers to manually set the number of workers.
  • Removes detect-gpu dependency

I tested and everything worked well, prover time decrease of ~35% in browser, and ~46% in node. Compile time improvement is even more significant ~60%! (Tested on Apple M1 Pro 10-Core).

I wasn't exactly sure of the best way to implement the setNumberOfWorkers() function, and where everything should live/what it should be named. So I would appreciate any feedback on that kinda thing :)

@jackryanservia jackryanservia requested review from mitschabaude and removed request for mitschabaude February 22, 2024 04:40
@jackryanservia jackryanservia marked this pull request as ready for review February 22, 2024 06:31
Copy link
Collaborator

@mitschabaude mitschabaude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great to have this in!

src/lib/proof-system/workers.ts Show resolved Hide resolved
@jackryanservia jackryanservia merged commit 24537f7 into main Feb 22, 2024
13 checks passed
@jackryanservia jackryanservia deleted the perf/apple-silicon branch February 22, 2024 23:07
@dfstio
Copy link

dfstio commented Mar 6, 2024

My tests show that on M2 Max which has 12 cores, the optimal number of workers is 8, it is 35% faster than with 12 workers.
System info shows:
Total Number of Cores: 12 (8 performance and 4 efficiency)

Can you please set the default number of workers for M2 Max to 8? It seems like 12 now.

@mitschabaude
Copy link
Collaborator

Can you please set the default number of workers for M2 Max to 8? It seems like 12 now.

I don't think this is easily possible because the API that nodejs gives us is availableParallelism() which returns a number of cores. It doesn't distinguish between performance and efficiency cores.

I suggest to just set the number of workers yourself for fine-tuning on particular architectures, seems a bit much for us to test and optimize every possible hardware

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove Apple silicon performance workaround as this has been fixed upstream
3 participants