Single precision average ME is not the same for CUDA and C++ in single-precision (ggttgg and eemumu) #212
(This is related to #5 by the way.) A quick update on this after a few months. The issue is still there: in single precision there are a few NaNs, for example madgraph4gpu/epochX/cudacpp/tput/logs_eemumu_manu/log_eemumu_manu_f_inl0_hrd0.txt Line 112 in a698c62
I had even done some minimal debugging at some point (mainly to understand how to detect NaN at all when fast math is enabled!). See
There is some interesting work to be done here, which however is largely debugging. For instance:
This is not an academic exercise. The final goal of this study is to understand whether the matrix element calculations can be moved from double to single precision. This would mean a factor 2 speedup both in vectorized C++ (twice as many elements in SIMD vectors) and in CUDA (typically, twice as many FLOPs on Nvidia data center cards)
(This is also related to #117 where fast math first appeared..)
…sults are the same in double, but nans differ in float
…ower, results are the same in double, but nans differ in float" This reverts commit 45b7b33.
I have just made a small test in a PR that I am about to merge: I have disabled fast math in eemumu and run double and float; the results are the same in double, but the NaNs differ in float
…y done for C++) - now 'make FPTYPE=f check' succeeds! - see madgraph5#5, madgraph5#212
…e and float (see madgraph5#5 and madgraph5#212)
As discussed in PR #211, the single-precision average ME is not the same for CUDA and C++ in ggttgg
See for instance valassi@a75ee3b#diff-45e40fdc2f6b7c71419c9f5e7e36267d7951e21c32488d6ecf35de3ec28ced57
In double precision, the results are similar to those, but not the same, and they agree with each other to more digits
valassi@33e7c04#diff-45e40fdc2f6b7c71419c9f5e7e36267d7951e21c32488d6ecf35de3ec28ced57
Note that for eemumu, in single precision the same average ME is printed out (if I remember correctly?)
NO, I remembered wrongly. On eemumu, with MANY more events, I get a different number of NaNs!
And as a consequence also a different average ME
7173757#diff-6716e7ab4317b4e76c92074d38021be37ad0eda68f248fb16f11e679f26114a6
So there is clearly some numerical precision issue to investigate also for eemumu