-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Test failures on M1 with BLAS from conda-forge #10759
Comments
Argh, what a bummer!! But also extremely good to know Maybe we should try to set up a self-hosted GHA runner on a spare ARM64 machine. To my knowledge, this is actually free (??) We just need a computer we can spare. |
Yes I think within the last month they added support for self-hosted runners. I don't have a spare M1 machine though. Maybe @agramfort has one sitting around :) |
@larsoner I could offer to set up my regular M1 as a self-hosted runner and only allow jobs to be run at night, when I don't actively use the machine. But I'm currently busy with other things. But just saying that I'm generally willing to do this. |
Still plenty of linalg-related failures even with |
Will check! The last alternative would be netlib |
I did: ❯ mamba create -n mne-dev -c conda-forge/label/mne_dev -c conda-forge mne pytest pytest-cov
❯ mamba run -n mne-dev pip install ~/Development/mne-python
❯ mamba run -n mne-dev pytest ~/Development/mne-python/mne/utils/tests/test_linalg.py The test result is really bad:
So then with ❯ mamba install -n mne-dev "libblas=*=*accelerate"
❯ mamba run -n mne-dev pytest ~/Development/mne-python/mne/utils/tests/test_linalg.py Same results:
Then with ❯ mamba install -n mne-dev "libblas=*=*netlib"
❯ mamba run -n mne-dev pytest ~/Development/mne-python/mne/utils/tests/test_linalg.py Same:
😭 |
Everything passes when using Intel binaries, running through Rosetta ❯ CONDA_SUBDIR=osx-64 mamba create -n mne-dev-intel -c conda-forge/label/mne_dev -c conda-forge mne pytest pytest-cov
❯ mamba run -n mne-dev-intel pip install ~/Development/mne-python
❯ mamba run -n mne-dev-intel pytest ~/Development/mne-python/mne/utils/tests/test_linalg.py
|
@larsoner |
Or it could be f2py or some other generic BLAS/LAPACK bug. So I think there are actually two problems: the SVD one specific to OpenBLAS, and these linear algebra failures across all implementations. To see if it's a cross compilation bug, I could try building with OpenBLAS from brew locally. I already have it from messing with OpenMEEG anyway |
|
... but taking your point about it happening with all implementations, I agree it's probably a cross-compilation problem. I'll open an issue |
If we're really crazy we could compile OpenBLAS natively on our M1 machines and publish the packages on anaconda.org under an |
With mne installed in new virtual-env via |
This implies you installed all dependencies via We already verified that this works. This issue here is about |
FYI this has been fixed by OpenMathLib/OpenBLAS#3675 and there is only one remaining issue in OpenBLAS's 0.3.21 milestone, so it might be worth waiting for this to be fixed. |
@hoechenberger I'm not sure we should wait for this one, it's not clear when they'll actually release. Thoughts on how to move forward? |
We could build and package OpenBLAS ourselves and distribute it via an WDYT? |
Sure, feel free to give it a shot. A commit including OpenMathLib/OpenBLAS#3675 should work! |
@larsoner Do you know if this patch can be applied on the latest stable release? Because then we could try to convince the folks over at https://github.com/conda-forge/openblas-feedstock to include the patch and bump the build number; it would be a fix for M1 and not affect other platforms – we could argue that, effectively, their previous M1 builds are broken, and that this is the correct fix. This approach
WDYT? |
@larsoner I will try to make a PR. |
PR opened at conda-forge/openblas-feedstock#141 |
Yeah that seems better than making our own channel! |
@larsoner Could you take a look at my PR please? Two tests on Travis are timing out and I am as clueless as ever when it comes to compiling stuff 🙃 |
Maybe try closing and reopening the PR to restart CIs |
I did it the fancy way and asked the bot to do it for me 😎 |
I'll close this since it should land on CDNs soon. Let's get the installers going and cut 1.1! |
Please ensure to enforce use of opnenblas build >=1 by explicitly putting it in construct.yaml |
You want to make a PR for this? If not I can do it later today |
... also we need #10945, doc build is broken currently |
I won't have time until tonight (it's currently afternoon here) |
@larsoner you could create a new dev build on mne-feedstock, make a PR to the dev branch there |
In working on #10756 on my M1 I had some test failures related to beamforming. Before we release 1.1 we should fix these. I hopefully will be able to look+fix today.
I'm a bit worried it might be an upstream OpenBLAS bug, which means ideally we'd wait for them to fix something, then NumPy/SciPy to release, then we use the latest version only on M1 :( But if that turns out to be the case, it's possible we can find some workaround, like using a different/slower lapack driver on M1 than the default or something. We'll see!
This is the sad downside of all of these repos not having proper arm64 testing on CIs :(
The text was updated successfully, but these errors were encountered: