Ducc0 does not compile on MacOS using gcc #39
Comments
I must admit that I don't understand what is going wrong here. |
Hmm ... I don't see any explicit flag for C++17 in the compiler call in the log. Perhaps this is the problem; |
It seems fixed in master. The v34 seems to have the issue, it is missing |
However master fft breaks one of our accuracy tests by a small amount: |
How do we address this? Should I release version 0.35 soon? |
OK, this is strange. I need to take a closer look at the details... |
v35 still has the issue of the missing include. I can point finufft to master but we should check why the accuracy is slightly off with no fast-math |
Version 0.35 is not out yet ... when you say "v35 still has the issue of the missing include", what did you test against? It is possible that FFT accuracy is a tiny little bit worse without fast math, since then FMA instructions are disabled, which incurs two rounding errors instead of one for a multiply-and-add. Still, the difference is so small that it really shouldn't make the difference between pass and fail in a NUFFT test... |
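The "two rounding errors instead of one" point can be made concrete with a small C++ sketch. It is not taken from ducc0 or FINUFFT: both function names are hypothetical, and `volatile` is used only to stop the compiler from contracting the separate multiply and add into an FMA on its own.

```cpp
#include <cmath>

// Multiply-add with two roundings: the product a*b is rounded to double
// first, then the sum is rounded again. 'volatile' prevents the compiler
// from fusing the two operations into a single FMA instruction.
double mul_add_two_roundings(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused multiply-add: a*b + c is evaluated exactly and rounded once.
double mul_add_fused(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With a = 1 + 2^-30, b = 1 - 2^-30 and c = -1, the exact result is -2^-60. The fused version returns exactly that, while the two-rounding version returns 0 because the product rounds to 1 before the addition. Differences of this kind sit at the level of one unit in the last place per operation.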
The commit that updated the changelog here: 920155c
We manually add flags for FMA to ducc and finufft to avoid that issue. But it seems that in single precision for type 3 (that does multiple FFTs) the test does not pass. We do a NUFFT with requested precision It is a tiny difference but worth investigating. |
Ah, OK, thanks! I update the ChangeLog for the upcoming version from time to time, to keep track of the new features. As long as the version number in
I'm looking at roughly line 188 in |
@ahbarnett, @lu1and10 what do you think? |
I'm not sure about what the test check tolerance should be, @ahbarnett may have more insight. @ahbarnett The resulting failed error 2.75e-4 is from the line https://github.com/flatironinstitute/finufft/blob/893bf6f2acf8afd4a716d13f534f517c547ccb02/test/finufft3d_test.cpp#L188 |
@mreineck it would be good to release 0.35 with the header fix at some point. The latest commit on the default branch works, but ducc0_0_34_0 still has the issue on macOS with gcc: it cannot find the header that declares min_element.
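The thread does not show which ducc0 source file was affected, so the following is only a generic sketch of this class of problem, with a hypothetical helper name. `std::min_element` is declared in `<algorithm>`; code that calls it without including that header compiles only if some other standard header happens to pull it in, which varies between standard library implementations and versions.

```cpp
#include <algorithm>  // declares std::min_element; do not rely on transitive includes
#include <cassert>
#include <vector>

// Hypothetical helper: returns the smallest element of a non-empty vector.
double smallest(const std::vector<double>& values) {
  assert(!values.empty());
  return *std::min_element(values.begin(), values.end());
}
```

Including every header whose names are used directly keeps the build independent of which toolchain and standard library are in use.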
Several issues here: firstly, I thought ducc/CMakeLists.txt and cmake/setupDUCC.cmake enforced this: Now to math tests in FINUFFT: indeed I have been relying on pseudorandom numbers (with fixed seed), and testing both i) the error at one output point (relative to the maximum across all outputs, ie the infinity-norm), and ii) the relative l2 error for the vector of outputs. The point of having both i) and ii) available is that:
Both tests have their place. For small NM as in our CI, eg Now, errors in FINUFFT should be dominated by kernel choice, not by the FFT. For now I am ok with the idea that the rand generator is different in OSX from linux, so FINUFFT single-prec threshold could be raised from 2e-4 to 3e-4. I just noticed the seed is set to the omp_thread_num, so if the test is run on a different number of threads, the results will be different. |
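The two measures named in the comment above, i) the error at one output point relative to the infinity norm and ii) the relative l2 error, can be sketched as follows. This is an illustration and not the code in FINUFFT's test drivers: the function names are hypothetical, and normalising i) by the reference outputs rather than the computed ones is an assumption made here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cvec = std::vector<std::complex<double>>;

// Largest magnitude over all entries (infinity norm).
double inf_norm(const cvec& v) {
  double m = 0.0;
  for (const auto& z : v) m = std::max(m, std::abs(z));
  return m;
}

// i) Error at a single output index j, relative to the infinity norm of the
//    reference outputs (assumed normalisation).
double one_point_rel_error(const cvec& computed, const cvec& reference,
                           std::size_t j) {
  assert(computed.size() == reference.size() && j < reference.size());
  return std::abs(computed[j] - reference[j]) / inf_norm(reference);
}

// ii) Relative l2 error over the whole output vector.
double rel_l2_error(const cvec& computed, const cvec& reference) {
  assert(computed.size() == reference.size());
  double err2 = 0.0, ref2 = 0.0;
  for (std::size_t k = 0; k < reference.size(); ++k) {
    err2 += std::norm(computed[k] - reference[k]);  // squared magnitude
    ref2 += std::norm(reference[k]);
  }
  return std::sqrt(err2 / ref2);
}
```

Measure i) depends on which single index is sampled, so it fluctuates more from run to run than ii), which averages over all outputs.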
When @DiamonDinoia and I looked at the failing CI runs, the failure turned out not to be deterministic. I recall that sometimes, if we rerun once, the error is less than I'm not sure how to drill down to which sub-flags inside msvc fp:fast make the error smaller. @DiamonDinoia, do you know how to debug in msvc which specific optimization causes this? Normally I would expect fp:fast to increase the error, but here fp:fast reduced the error with ducc? |
How to debug? Not sure, I cannot trigger the issue on my Windows laptop. Given that it is not deterministic, it could be caused by the RNG; msvc is likely to use a different implementation from the others. |
I guess we see the error more often because of the multiple msvc runs in the CI matrix. If it is deterministic with one thread and the same RNG seed, that will be easier to debug. |
Part of the FINUFFT de-macroize process includes switching to C++ RNGs. Some of them, like MT32_64, have the same implementation across platforms, which will allow consistent tests on all platforms. So we can delay this, as it seems to be a testing issue more than a FINUFFT/DUCC issue. |
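A minimal sketch of platform-independent test data, assuming that the engine meant above is a standard Mersenne Twister such as `std::mt19937_64` (the thread does not confirm this). The helper name is hypothetical. The C++ standard fixes the raw output sequence of the engine, but not the algorithm used by distributions such as `std::uniform_real_distribution`, so the mapping to floating point is done by hand here.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: n pseudorandom doubles in [0, 1) from an explicit seed.
// Only the engine's raw 64-bit output is used, since that sequence is
// specified by the C++ standard and is the same on every platform.
std::vector<double> uniform_samples(std::uint64_t seed, std::size_t n) {
  std::mt19937_64 engine(seed);
  std::vector<double> out(n);
  for (auto& x : out) {
    // Keep the top 53 bits and scale by 2^-53; both steps are exact in double.
    x = std::ldexp(static_cast<double>(engine() >> 11), -53);
  }
  return out;
}
```

Passing the seed explicitly, rather than deriving it from a thread index, keeps the generated inputs independent of how many threads the test runs with.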
Sorry for the delay, I took a few days off ...
It can happen if the number of outputs is 1, and I think that is the case here. (It could also happen for more than one output, but I agree that this becomes extremely improbable with increasing number of outputs.) [Edit: The reason why I think this is Marco's statement further above that "when doing a big FFT and checking one resulting point against the analytical formula", but I might be misinterpreting this.] |
Regarding reproducibility of ducc.fft results with multithreading: the design is such that the results should be fully deterministic as long as the requested number of threads does not change, but results from runs with different numbers of threads may be very slightly different. |
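Why a fixed thread count gives repeatable results while a different thread count may not can be illustrated with a plain reduction. This is a generic sketch and says nothing about how ducc.fft actually schedules its work: the function name is hypothetical, and the "threads" are emulated serially as contiguous chunks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums v by splitting it into nchunks contiguous pieces, reducing each piece
// separately, then adding the partial sums in order. This mimics, serially,
// how a reduction might be divided among nchunks threads.
double chunked_sum(const std::vector<double>& v, std::size_t nchunks) {
  assert(nchunks > 0);
  const std::size_t n = v.size();
  double total = 0.0;
  for (std::size_t c = 0; c < nchunks; ++c) {
    const std::size_t lo = c * n / nchunks;
    const std::size_t hi = (c + 1) * n / nchunks;
    double partial = 0.0;
    for (std::size_t i = lo; i < hi; ++i) partial += v[i];
    total += partial;
  }
  return total;
}
```

For the input {1e17, 1, -1e17, 1}, one chunk gives 1 and two chunks give 0, because floating-point addition is not associative and the chunk boundaries change the order of operations. Repeating the call with the same chunk count always reproduces the same value.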
Hi Martin, before releasing ducc I would like to improve the cmake file. Can we also add a makefile so that it can be included in finufft? Also, may I ask why you reverted to |
The issue was that We can add a Makefile as well, but I'm not sure how much value that will add. In the end, compiler flags will always depend on the "calling" application. |
Indeed llvm does not support it yet. That can be removed. All it does is speed up complex division/multiplication by skipping NaN checking. One can overload In cmake, library flags are usually not set by the user; each library is compiled with its own. |
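The flag name and the overload mentioned in the comment above are not preserved in the thread, so the following is only a rough sketch of the general idea, with a hypothetical function name. A complex product without NaN checking is just the textbook four-multiplication formula, whereas the `operator*` of `std::complex` may, depending on compiler and flags, add a recovery path for NaN or infinite intermediate results.

```cpp
#include <complex>

// Textbook complex product with no special handling of NaN or infinite
// intermediate results. Hypothetical helper, not part of ducc0.
template <typename T>
std::complex<T> mul_no_nan_check(const std::complex<T>& a,
                                 const std::complex<T>& b) {
  return {a.real() * b.real() - a.imag() * b.imag(),
          a.real() * b.imag() + a.imag() * b.real()};
}
```

For finite inputs that do not overflow, this matches the usual definition of the product; it differs only in corner cases involving NaN or infinity.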
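I don't think I will go to such lengths just to avoid the -ffast-math flag. As long as it is not used in the linking step, this is not problematic. And yes, extra checks within the complex multiplication routine would be quite bad for some parts of ducc0. |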
Ok. We have added ducc0 to our makefile so have ducc fft fully switchable
now. We have also tweaked the test/finufft?d_test executables to exclude
the one-output relative error test from the exit code (when full direct
output error is available), which will reduce random variation. We are
getting 2.3 out and then will be able to think more about templating and
kernel function tweaks, happy to have you join meetings if you like. Best,
Alex
On Wed, Aug 7, 2024 at 4:31 AM mreineck wrote:
I don't think I will go to such lengths just to avoid the -ffast-math
flag. As long as it is not used in the linking step, this is not
problematic. And yes, extra checks within the complex multiplication routine
would be quite bad for some parts of ducc0.
|
I think using The problem with -ffast-math is that a Python library compiled with it (Python loads shared libraries when importing C/C++ extensions) broke my code by changing the floating-point environment of my CPU, which broke the following code. |
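The snippet that broke is not preserved in the thread. As an assumption about the kind of effect meant: on some toolchains, linking with -ffast-math adds startup code that switches the process to flush-to-zero handling of subnormal numbers, which then affects all code in that process. A small run-time probe for this (hypothetical function name) could look like:

```cpp
#include <limits>

// Returns true if subnormal (denormal) doubles survive arithmetic in the
// current floating-point environment, false if they are flushed to zero.
// 'volatile' forces the division and the comparison to happen at run time.
bool subnormals_preserved() {
  volatile double smallest_normal = std::numeric_limits<double>::min();
  volatile double half = smallest_normal / 2.0;  // subnormal unless flushed
  return half > 0.0;
}
```

Calling such a probe before and after loading a third-party shared library is one way to see whether that library changed the floating-point environment.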
I fully agree that Is it possible to discern between these two situations with |
|
in cmake there are:
|
Then I think having |
Closing this for now, as I think all points have been discussed. |
Dear @mreineck,
I do not know if you plan to support ducc0 on MacOS. It works with llvm on that toolchain but it breaks with gcc. It seems a small issue. Could you have a look? We would like to support ducc in finufft for MacOS GCC users.
Please see:
https://github.com/flatironinstitute/finufft/actions/runs/10167458503/job/28119953764