Earlier this year, we found an issue (#491) where o1js performance was much slower on Apple silicon machines than would be expected and fixed it with a workaround in #683.
The Chromium team identified the issue in 1228686. It was the result of V8 not using LSE instructions even when the underlying hardware supported them and has since been fixed.
Given the finding below, it makes sense to remove the workaround we applied in #683, as doing so would dramatically improve performance on Apple silicon machines.
Running src/examples/crypto/ecdsa/run.ts with the current configuration takes 108 seconds to compile and 41 seconds to prove. With getEfficientNumWorkers disabled, it takes 16 seconds to compile and 28 seconds to prove! 😮
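To put those numbers in perspective, here is a quick sketch that turns the reported timings into speedup factors. The `Timings` type and the `speedup` helper are names made up for this illustration; only the four timing values come from the measurements above.

```typescript
// Timings in seconds, copied from the measurements reported above.
// `Timings` and `speedup` are hypothetical names used only in this sketch.
interface Timings {
  compile: number;
  prove: number;
}

const withWorkaround: Timings = { compile: 108, prove: 41 };
const workaroundDisabled: Timings = { compile: 16, prove: 28 };

// Speedup factor: how many times faster the "after" run is than the "before" run.
function speedup(before: number, after: number): number {
  return before / after;
}

const compileSpeedup = speedup(withWorkaround.compile, workaroundDisabled.compile);
const proveSpeedup = speedup(withWorkaround.prove, workaroundDisabled.prove);

console.log(`compile: ${compileSpeedup.toFixed(2)}x faster`); // compile: 6.75x faster
console.log(`prove: ${proveSpeedup.toFixed(2)}x faster`); // prove: 1.46x faster
```

So compilation benefits far more than proving, at least for this one example on this one machine.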
Profiling with getEfficientNumWorkers disabled reveals that almost no ticks are spent in __rdl_dealloc and __rdl_alloc, which dominated the profile before the workaround was introduced:
Pre-workaround profile:
ticks total nonlib name
...
79876 41.8% 41.8% JS: *__rdl_dealloc
...
65241 34.1% 34.1% JS: *__rdl_alloc
...
Node.js 20 profile with getEfficientNumWorkers disabled:
ticks total nonlib name
...
1 0.0% 0.0% JS: *__rdl_dealloc
...
1 0.0% 0.0% JS: *__rdl_alloc
...
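As a rough way to compare the two profiles, the sketch below sums the share of total ticks attributed to the two allocator functions. It assumes each row has the `ticks total nonlib name` layout shown above; `parseRow` and `allocatorShare` are hypothetical helpers written for this illustration, and the elided `...` rows are simply left out, so the result covers only the rows quoted here.

```typescript
// One row of the tick profile, e.g. "79876 41.8% 41.8% JS: *__rdl_dealloc".
// `ProfileRow`, `parseRow` and `allocatorShare` are hypothetical names for this sketch.
interface ProfileRow {
  ticks: number;
  totalPercent: number;
  name: string;
}

// Parse a "ticks total nonlib name" row; returns null for lines that do not match.
function parseRow(line: string): ProfileRow | null {
  const match = line.trim().match(/^(\d+)\s+([\d.]+)%\s+([\d.]+)%\s+(.+)$/);
  if (match === null) return null;
  return { ticks: Number(match[1]), totalPercent: Number(match[2]), name: match[4] };
}

// Sum the "total" percentage of the rows whose name ends in __rdl_alloc or __rdl_dealloc.
function allocatorShare(lines: string[]): number {
  return lines
    .map((line) => parseRow(line))
    .filter((row): row is ProfileRow => row !== null && /__rdl_(de)?alloc$/.test(row.name))
    .reduce((sum, row) => sum + row.totalPercent, 0);
}

// Rows copied from the two profiles above.
const preWorkaround = [
  "79876 41.8% 41.8% JS: *__rdl_dealloc",
  "65241 34.1% 34.1% JS: *__rdl_alloc",
];
const workaroundDisabled = [
  "1 0.0% 0.0% JS: *__rdl_dealloc",
  "1 0.0% 0.0% JS: *__rdl_alloc",
];

console.log(allocatorShare(preWorkaround).toFixed(1)); // 75.9
console.log(allocatorShare(workaroundDisabled).toFixed(1)); // 0.0
```

In other words, roughly three quarters of all ticks went to allocation and deallocation in the pre-workaround profile, versus effectively none on Node.js 20 with the workaround disabled.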
This means removing getEfficientNumWorkers will substantially improve performance on Apple silicon, and we should do that! 😸
nicc changed the title from "Investigate if Apple silicon performance issue is fixed upstream" to "Remove Apple silicon performance workaround as this has been fixed upstream" on Jan 4, 2024