Replace helicity loops with individual threads #3

Open
roiser opened this issue Aug 12, 2020 · 1 comment
Labels
idea Possible new development (may need further discussion)

Comments

roiser (Member) commented Aug 12, 2020

  • Looked into by Mat; his last comment was that this showed a 3% improvement (see the sketch below)

  • A code example is in Mat's GitHub
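As a minimal sketch of the idea (illustrative names only, not the actual madgraph4gpu API): instead of each GPU thread looping over all helicity combinations for its event, the grid is enlarged by a factor nhel so that each thread computes a single (event, helicity) pair, and the per-event helicity sum is formed with an atomic add.

// Hypothetical sketch: one thread per (event, helicity) pair instead of a
// per-thread helicity loop. All names are illustrative.
#include <cuda_runtime.h>

__device__ double calcMEforHelicity( const double* momenta, int ievt, int ihel )
{
  return momenta[ievt] * ( 1 + ihel ); // stub for the real per-helicity ME
}

__global__ void sigmaKinHelThreads( const double* momenta, double* meSum,
                                    int nevt, int nhel )
{
  const int tid = blockIdx.x * blockDim.x + threadIdx.x;
  if( tid >= nevt * nhel ) return;
  const int ievt = tid / nhel; // event index
  const int ihel = tid % nhel; // helicity combination index
  const double me = calcMEforHelicity( momenta, ievt, ihel );
  atomicAdd( &meSum[ievt], me ); // helicity sum per event (double atomicAdd needs sm_60+)
}

int main()
{
  const int nevt = 32, nhel = 16;
  double *d_momenta, *d_meSum;
  cudaMalloc( &d_momenta, nevt * sizeof( double ) );
  cudaMalloc( &d_meSum, nevt * sizeof( double ) );
  cudaMemset( d_momenta, 0, nevt * sizeof( double ) ); // dummy input
  cudaMemset( d_meSum, 0, nevt * sizeof( double ) );
  sigmaKinHelThreads<<<( nevt * nhel + 255 ) / 256, 256>>>( d_momenta, d_meSum, nevt, nhel );
  cudaDeviceSynchronize();
  return 0;
}

The atomicAdd serializes the per-event accumulation; presumably the reported ~3% gain means the extra parallelism outweighs that contention for this process.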

roiser added the idea Possible new development (may need further discussion) label Aug 12, 2020
valassi (Member) commented Dec 9, 2020

Note: in the vectorization work #71 and #72, the helicity and event loops get reversed, so this would look very different from how it did at the time of the hackathon. Keeping this open; it may still be useful to investigate for very complex processes. (See the loop-order sketch below.)
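For context, a schematic of the two loop orders (calcME and the bounds are illustrative stand-ins, not the actual madgraph4gpu code):

#include <vector>

// Stub for the per-event, per-helicity matrix element computation.
double calcME( int ievt, int ihel ) { return 1.0 / ( 1 + ievt + ihel ); }

void helicitySum( std::vector<double>& meSum, int nevt, int nhel, bool reversed )
{
  if( !reversed ) // original layout: event loop outside, helicity loop inside
  {
    for( int ievt = 0; ievt < nevt; ievt++ )
      for( int ihel = 0; ihel < nhel; ihel++ )
        meSum[ievt] += calcME( ievt, ihel );
  }
  else // after #71/#72: helicity outside, events (SIMD vectors of events) inside
  {
    for( int ihel = 0; ihel < nhel; ihel++ )
      for( int ievt = 0; ievt < nevt; ievt++ )
        meSum[ievt] += calcME( ievt, ihel );
  }
}

With the reversed order, "one thread per helicity" would parallelize the outer loop, which interacts quite differently with the SIMD event vectors than it did with the original layout.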

valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 23, 2021
…builds.

The build fails on clang10 at compile time:

clang++: /build/gcc/build/contrib/clang-10.0.0/src/clang/10.0.0/tools/clang/lib/CodeGen/CGExpr.cpp:596: clang::CodeGen::RValue clang::CodeGen::CodeGenFunction::EmitReferenceBindingToExpr(const clang::Expr*): Assertion `LV.isSimple()' failed.
Stack dump:
0.      Program arguments: /cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang++ -O3 -std=c++17 -I. -I../../src -I../../../../../tools -DUSE_NVTX -Wall -Wshadow -Wextra -fopenmp -ffast-math -march=skylake-avx512 -mprefer-vector-width=256 -I/usr/local/cuda-11.0/include/ -c CPPProcess.cc -o CPPProcess.o
1.      <eof> parser at end of file
2.      Per-file LLVM IR generation
3.      ../../src/mgOnGpuVectors.h:59:16: Generating code for declaration 'mgOnGpu::cxtype_v::operator[]'
 #0 0x0000000001af5f9a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang+++0x1af5f9a)
 #1 0x0000000001af3d54 llvm::sys::RunSignalHandlers() (/cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang+++0x1af3d54)
 #2 0x0000000001af3fa9 llvm::sys::CleanupOnSignal(unsigned long) (/cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang+++0x1af3fa9)
 #3 0x0000000001a6ed08 CrashRecoverySignalHandler(int) (/cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang+++0x1a6ed08)
 #4 0x00007fd31c178630 __restore_rt (/lib64/libpthread.so.0+0xf630)
 #5 0x00007fd31ac8c3d7 raise (/lib64/libc.so.6+0x363d7)
 #6 0x00007fd31ac8dac8 abort (/lib64/libc.so.6+0x37ac8)
 #7 0x00007fd31ac851a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #8 0x00007fd31ac85252 (/lib64/libc.so.6+0x2f252)
 #9 0x000000000203a042 clang::CodeGen::CodeGenFunction::EmitReferenceBindingToExpr(clang::Expr const*) (/cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang+++0x203a042)
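The assertion fires in clang's reference-binding code while emitting mgOnGpu::cxtype_v::operator[]. A reduced sketch of the pattern presumably involved (an assumption based on the stack dump, not a verified reproducer): operator[] hands out references bound to individual lanes of a GCC vector-extension type, and it is this reference binding that gcc accepts but clang 10/11 crashes on.

// Reduced sketch only; the real type lives in mgOnGpuVectors.h.
typedef double fptype_v __attribute__(( vector_size( 4 * sizeof( double ) ) ));

struct cxtype_ref
{
  double &m_real, &m_imag; // references into single vector lanes
};

struct cxtype_v
{
  fptype_v m_real, m_imag;
  cxtype_ref operator[]( int i ) { return cxtype_ref{ m_real[i], m_imag[i] }; }
};

int main()
{
  cxtype_v v = {};
  v[1].m_real = 3; // writes lane 1 of v.m_real through the reference
  return (int)v.m_real[1];
}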
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 23, 2021
-------------------------------------------------------------------------
Process                     = EPOCH1_EEMUMU_CPP
FP precision                = DOUBLE (NaN/abnormal=0, zero=0 )
Internal loops fptype_sv    = VECTOR[1] ('none': scalar, no SIMD)
MatrixElements compiler     = clang 11.0.0
EvtsPerSec[MatrixElems] (3) = ( 1.263547e+06                 )  sec^-1
MeanMatrixElemValue         = ( 1.372113e-02 +- 3.270608e-06 )  GeV^0
TOTAL       :     7.168746 sec
real    0m7.176s
=Symbols in CPPProcess.o= (~sse4: 1241) (avx2:    0) (512y:    0) (512z:    0)
-------------------------------------------------------------------------
Process                     = EPOCH2_EEMUMU_CPP
FP precision                = DOUBLE (NaN/abnormal=0, zero=0 )
MatrixElements compiler     = clang 11.0.0
EvtsPerSec[MatrixElems] (3) = ( 1.218104e+06                 )  sec^-1
MeanMatrixElemValue         = ( 1.372113e-02 +- 3.270608e-06 )  GeV^0
TOTAL       :     7.455322 sec
real    0m7.463s
=Symbols in CPPProcess.o= (~sse4: 1165) (avx2:    0) (512y:    0) (512z:    0)
-------------------------------------------------------------------------

The build with vectors also still fails on clang11, in the same place:

clang++: /build/dkonst/CONTRIB/build/contrib/clang-11.0.0/src/clang/11.0.0/clang/lib/CodeGen/CGExpr.cpp:613: clang::CodeGen::RValue clang::CodeGen::CodeGenFunction::EmitReferenceBindingToExpr(const clang::Expr*): Assertion `LV.isSimple()' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /cvmfs/sft.cern.ch/lcg/releases/clang/11.0.0-77a9f/x86_64-centos7/bin/clang++ -O3 -std=c++17 -I. -I../../src -I../../../../../tools -Wall -Wshadow -Wextra -DMGONGPU_COMMONRAND_ONHOST -ffast-math -march=skylake-avx512 -mprefer-vector-width=256 -c CPPProcess.cc -o CPPProcess.o
1.      <eof> parser at end of file
2.      Per-file LLVM IR generation
3.      ../../src/mgOnGpuVectors.h:59:16: Generating code for declaration 'mgOnGpu::cxtype_v::operator[]'
 #0 0x0000000001ce208a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/cvmfs/sft.cern.ch/lcg/releases/clang/11.0.0-77a9f/x86_64-centos7/bin/clang+++0x1ce208a)
 #1 0x0000000001cdfe94 llvm::sys::RunSignalHandlers() (/cvmfs/sft.cern.ch/lcg/releases/clang/11.0.0-77a9f/x86_64-centos7/bin/clang+++0x1cdfe94)
 #2 0x0000000001c52d98 CrashRecoverySignalHandler(int) (/cvmfs/sft.cern.ch/lcg/releases/clang/11.0.0-77a9f/x86_64-centos7/bin/clang+++0x1c52d98)
 #3 0x00007f1836000630 __restore_rt (/lib64/libpthread.so.0+0xf630)
 #4 0x00007f18350f13d7 raise (/lib64/libc.so.6+0x363d7)
 #5 0x00007f18350f2ac8 abort (/lib64/libc.so.6+0x37ac8)
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 23, 2022
…ns is different for fcheck

> ./fcheck.exe  2048 64 10
 GPUBLOCKS=          2048
 GPUTHREADS=           64
 NITERATIONS=          10
WARNING! Instantiate host Bridge (nevt=131072)
INFO: The application is built for skylake-avx512 (AVX512VL) and the host supports it
WARNING! Instantiate host Sampler (nevt=131072)
Iteration #1
Iteration #2
Iteration #3
Iteration #4
Iteration #5
Iteration #6
Iteration #7
Iteration #8
Iteration #9
WARNING! flagging abnormal ME for ievt=111162
Iteration #10
 Average Matrix Element:   1.3716954486179133E-002
 Abnormal MEs:           1

> ./check.exe -p  2048 64 10 | grep FLOAT
FP precision                = FLOAT (NaN/abnormal=2, zero=0)

I imagine that this is because momenta in Fortran get translated from float to double and back to float, while in C++ they stay in float? (See the precision sketch below.)
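If that is the cause, the effect is not the conversion itself (a float -> double -> float round trip of a single value is lossless) but the arithmetic done while in double. A minimal illustration (hypothetical inputs, nothing from the actual code):

#include <cstdio>

int main()
{
  // Same float inputs, two paths: float arithmetic throughout (C++-like)
  // vs promotion to double with truncation at the end (Fortran-bridge-like).
  float sumF = 0.0f;
  double sumD = 0.0;
  for( int i = 1; i <= 100000; i++ )
  {
    const float x = 1.0f / (float)i; // identical float input on both paths
    sumF += x;         // accumulate in float
    sumD += (double)x; // accumulate in double
  }
  printf( "float path: %.9g\n", sumF );
  printf( "double path, truncated back: %.9g\n", (float)sumD );
  return 0;
}

The two printed values differ in the last digits, which can be enough to flip a borderline ME into the NaN/abnormal category on one path but not the other.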
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 20, 2022
…failing

patching file Source/dsample.f
Hunk #3 FAILED at 181.
Hunk #4 succeeded at 197 (offset 2 lines).
Hunk #5 FAILED at 211.
Hunk #6 succeeded at 893 (offset 3 lines).
2 out of 6 hunks FAILED -- saving rejects to file Source/dsample.f.rej
patching file SubProcesses/addmothers.f
patching file SubProcesses/cuts.f
patching file SubProcesses/makefile
Hunk #3 FAILED at 61.
Hunk #4 succeeded at 94 (offset 6 lines).
Hunk #5 succeeded at 122 (offset 6 lines).
1 out of 5 hunks FAILED -- saving rejects to file SubProcesses/makefile.rej
patching file SubProcesses/reweight.f
Hunk #1 FAILED at 1782.
Hunk #2 succeeded at 1827 (offset 27 lines).
Hunk #3 succeeded at 1841 (offset 27 lines).
Hunk #4 succeeded at 1963 (offset 27 lines).
1 out of 4 hunks FAILED -- saving rejects to file SubProcesses/reweight.f.rej
patching file auto_dsig.f
Hunk #6 FAILED at 301.
Hunk #10 succeeded at 773 with fuzz 2 (offset 4 lines).
Hunk #11 succeeded at 912 (offset 16 lines).
Hunk #12 succeeded at 958 (offset 16 lines).
Hunk #13 succeeded at 971 (offset 16 lines).
Hunk #14 succeeded at 987 (offset 16 lines).
Hunk #15 succeeded at 1006 (offset 16 lines).
Hunk #16 succeeded at 1019 (offset 16 lines).
1 out of 16 hunks FAILED -- saving rejects to file auto_dsig.f.rej
patching file driver.f
patching file matrix1.f
patching file auto_dsig1.f
Hunk #2 succeeded at 220 (offset 7 lines).
Hunk #3 succeeded at 290 (offset 7 lines).
Hunk #4 succeeded at 453 (offset 8 lines).
Hunk #5 succeeded at 464 (offset 8 lines).
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 23, 2022
./cmadevent_cudacpp < /tmp/avalassi/pippo

with

cat /tmp/avalassi/pippo
4 ! Number of events in a single C++ iteration (nb_page_loop)
4 1 1 ! Number of events and max and min iterations
0.000001 ! Accuracy (ignored because max iterations = min iterations)
0 ! Grid Adjustment 0=none, 2=adjust (NB if = 0, ftn26 will still be used if present)
1 ! Suppress Amplitude 1=yes (i.e. use MadEvent single-diagram enhancement)
0 ! Helicity Sum/event 0=exact
1 ! Channel number (1-N) for single-diagram enhancement multi-channel (NB NOT IGNORED even if suppress amplitude is 0!)

gives

...
 ihe   2  matrix1_ihel     1933.18258519 matrix1_sumhel    1933.18258519
 ihe   3  matrix1_ihel        0.99461559 matrix1_sumhel    1934.17720077
 ihe   5  matrix1_ihel        7.80045209 matrix1_sumhel    1941.97765287
 ihe   6  matrix1_ihel        0.00006202 matrix1_sumhel    1941.97771489
 ihe   7  matrix1_ihel        0.00006202 matrix1_sumhel    1941.97777691
 ihe   8  matrix1_ihel        0.00000007 matrix1_sumhel    1941.97777697
 ihe   9  matrix1_ihel        0.00000007 matrix1_sumhel    1941.97777704
 ihe  10  matrix1_ihel        0.00006202 matrix1_sumhel    1941.97783906
 ihe  11  matrix1_ihel        0.00006202 matrix1_sumhel    1941.97790108
 ihe  12  matrix1_ihel        7.80045209 matrix1_sumhel    1949.77835317
 ihe  14  matrix1_ihel        0.99461559 matrix1_sumhel    1950.77296876
 ihe  15  matrix1_ihel     1933.18258519 matrix1_sumhel    3883.95555395
 ch    1     amp2ch       0.75549842  amp2sumch       0.75549842    getchcut      1.00000000
 ch    2     amp2ch     765.21526350  amp2sumch     765.97076192    getchcut      1.00000000
 ch    3     amp2ch       0.39661006  amp2sumch     766.36737199    getchcut      1.00000000
  ForCh1      0.01495653  = ans    3883.95555395   /iden     256* amp2ch      0.75549842/amp2sum    766.36737199
  ForCh1      0.01495653 =ForCh0(ans/iden)     15.17170138* amp2ch      0.75549842/amp2sum    766.36737199
 ihe   2  matrix1_ihel      145.09951116 matrix1_sumhel     145.09951116
 ihe   3  matrix1_ihel        6.61872017 matrix1_sumhel     151.71823133
 ihe   5  matrix1_ihel        0.86306130 matrix1_sumhel     152.58129263
 ihe   6  matrix1_ihel        0.00070216 matrix1_sumhel     152.58199479
 ihe   7  matrix1_ihel        0.00070216 matrix1_sumhel     152.58269696
 ihe   8  matrix1_ihel        0.00000170 matrix1_sumhel     152.58269865
 ihe   9  matrix1_ihel        0.00000170 matrix1_sumhel     152.58270035
 ihe  10  matrix1_ihel        0.00070216 matrix1_sumhel     152.58340251
 ihe  11  matrix1_ihel        0.00070216 matrix1_sumhel     152.58410468
 ihe  12  matrix1_ihel        0.86306130 matrix1_sumhel     153.44716598
 ihe  14  matrix1_ihel        6.61872017 matrix1_sumhel     160.06588615
 ihe  15  matrix1_ihel      145.09951116 matrix1_sumhel     305.16539731
 ch    1     amp2ch       5.02520604  amp2sumch       5.02520604    getchcut      1.00000000
 ch    2     amp2ch      81.13881210  amp2sumch      86.16401814    getchcut      1.00000000
 ch    3     amp2ch       3.72070900  amp2sumch      89.88472714    getchcut      1.00000000
  ForCh1      0.06664434  = ans     305.16539731   /iden     256* amp2ch      5.02520604/amp2sum     89.88472714
  ForCh1      0.06664434 =ForCh0(ans/iden)      1.19205233* amp2ch      5.02520604/amp2sum     89.88472714
 ihe   2  matrix1_ihel       29.83550845 matrix1_sumhel      29.83550845
 ihe   3  matrix1_ihel        1.18628332 matrix1_sumhel      31.02179178
 ihe   5  matrix1_ihel       11.22829566 matrix1_sumhel      42.25008743
 ihe   6  matrix1_ihel        4.55355964 matrix1_sumhel      46.80364707
 ihe   7  matrix1_ihel        4.55355964 matrix1_sumhel      51.35720670
 ihe   8  matrix1_ihel        6.00718471 matrix1_sumhel      57.36439141
 ihe   9  matrix1_ihel        6.00718471 matrix1_sumhel      63.37157613
 ihe  10  matrix1_ihel        4.55355964 matrix1_sumhel      67.92513576
 ihe  11  matrix1_ihel        4.55355964 matrix1_sumhel      72.47869540
 ihe  12  matrix1_ihel       11.22829566 matrix1_sumhel      83.70699105
 ihe  14  matrix1_ihel        1.18628332 matrix1_sumhel      84.89327437
 ihe  15  matrix1_ihel       29.83550845 matrix1_sumhel     114.72878283
 ch    1     amp2ch       0.10680425  amp2sumch       0.10680425    getchcut      1.00000000
 ch    2     amp2ch      15.89822055  amp2sumch      16.00502480    getchcut      1.00000000
 ch    3     amp2ch       9.01826666  amp2sumch      25.02329146    getchcut      1.00000000
  ForCh1      0.00191283  = ans     114.72878283   /iden     256* amp2ch      0.10680425/amp2sum     25.02329146
  ForCh1      0.00191283 =ForCh0(ans/iden)      0.44815931* amp2ch      0.10680425/amp2sum     25.02329146
 ihe   2  matrix1_ihel      511.63367052 matrix1_sumhel     511.63367052
 ihe   3  matrix1_ihel        2.81068992 matrix1_sumhel     514.44436044
 ihe   5  matrix1_ihel       12.19377592 matrix1_sumhel     526.63813636
 ihe   6  matrix1_ihel        0.00563681 matrix1_sumhel     526.64377318
 ihe   7  matrix1_ihel        0.00563681 matrix1_sumhel     526.64940999
 ihe   8  matrix1_ihel        0.00003946 matrix1_sumhel     526.64944945
 ihe   9  matrix1_ihel        0.00003946 matrix1_sumhel     526.64948891
 ihe  10  matrix1_ihel        0.00563681 matrix1_sumhel     526.65512572
 ihe  11  matrix1_ihel        0.00563681 matrix1_sumhel     526.66076254
 ihe  12  matrix1_ihel       12.19377592 matrix1_sumhel     538.85453846
 ihe  14  matrix1_ihel        2.81068992 matrix1_sumhel     541.66522838
 ihe  15  matrix1_ihel      511.63367052 matrix1_sumhel    1053.29889890
 ch    1     amp2ch       2.21715024  amp2sumch       2.21715024    getchcut      1.00000000
 ch    2     amp2ch     229.54191357  amp2sumch     231.75906382    getchcut      1.00000000
 ch    3     amp2ch       1.30668433  amp2sumch     233.06574815    getchcut      1.00000000
  ForCh1      0.03914068  = ans    1053.29889890   /iden     256* amp2ch      2.21715024/amp2sum    233.06574815
  ForCh1      0.03914068 =ForCh0(ans/iden)      4.11444882* amp2ch      2.21715024/amp2sum    233.06574815
Event #0 MEch1=-nan = MEch0 15.1706 * num 0 / den 0
Event #1 MEch1=-nan = MEch0 1.192 * num 0 / den 0
Event #2 MEch1=-nan = MEch0 0.448159 * num 0 / den 0
Event #3 MEch1=-nan = MEch0 4.1142 * num 0 / den 0
Event #0 MEch1=0.0150196 = MEch0 15.1706 * num 0.758692 / den 766.323
Event #1 MEch1=0.0671816 = MEch0 1.192 * num 5.07394 / den 90.0269
Event #2 MEch1=0.0732451 = MEch0 0.448159 * num 7.92101 / den 48.4657
Event #3 MEch1=0.0402084 = MEch0 4.1142 * num 2.27946 / den 233.239
  Event    1  ForCh1      0.01495653  CppCh1      0.01501956  CppCh0     15.17063351
  Event    2  ForCh1      0.06664434  CppCh1      0.06718160  CppCh0      1.19200151
  Event    3  ForCh1      0.00191283  CppCh1      0.07324507  CppCh0      0.44815874
  Event    4  ForCh1      0.03914068  CppCh1      0.04020841  CppCh0      4.11420001
...

A few observations:
- why does the C++ code loop twice, with the first pass printing nans? (see the sketch below)
- the full MEs without multichannel, dubbed ForCh0 and CppCh0, are in good agreement (this was known)
- luckily the amps and amp sums have the same units...
- for these 4 events, amp2sum i.e. the denominator is 766, 90, 48, 233 in C++ vs 766, 90, 25, 233 in Fortran: why 48 vs 25?!
- for these 4 events, amp2 for ch1 i.e. the numerator is 0.8, 5.1, 7.9, 2.3 in C++ vs 0.8, 5.0, 0.1, 2.2 in Fortran: why 7.9 vs 0.1?
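For reference, the relation being checked above, in sketch form (illustrative names; multichannelME is not the actual function):

#include <cstdio>

// Single-diagram-enhancement reweighting as printed in the logs:
// MEch1 = MEch0 * amp2[ch] / sum_i amp2[i].
double multichannelME( double meCh0, const double* amp2, int ndiag, int ch )
{
  const double num = amp2[ch - 1]; // |amp|^2 of the enhanced channel (1-based)
  double den = 0;
  for( int i = 0; i < ndiag; i++ ) den += amp2[i]; // sum over all diagrams
  // On a pass where the amp2 buffers have not been filled yet, num = den = 0
  // and 0/0 gives -nan, as in the first "Event #0 MEch1=-nan" lines above.
  return meCh0 * num / den;
}

int main()
{
  // Hypothetical amp2 values chosen to echo the first event's printout
  // (num 0.758692, den 766.323).
  const double amp2[3] = { 0.758692, 765.564308, 0.0 };
  printf( "MEch1 = %g\n", multichannelME( 15.1706, amp2, 3, 1 ) ); // ~0.0150196
  return 0;
}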
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 24, 2022
Some clear differences start to emerge

  ForCh1      0.06664434  = ans     305.16539731   /iden     256* amp2ch      5.02520604/amp2sum     89.88472714
  ForCh1      0.06664434 =ForCh0(ans/iden)      1.19205233* amp2ch      5.02520604/amp2sum     89.88472714
matrix1  amp2ch1       0.02670106 amp2ch2       4.50704307 amp2ch3       2.04053476 amp2tot       6.57427890
matrix1  amp2ch1       0.02670106 amp2ch2       0.05168202 amp2ch3       0.23119429 amp2tot       6.88385627
matrix1  amp2ch1       0.00000000 amp2ch2       1.44512113 amp2ch3       0.95367341 amp2tot       9.28265080
matrix1  amp2ch1       0.00000000 amp2ch2       0.58605913 amp2ch3       0.38675582 amp2tot      10.25546576
matrix1  amp2ch1       0.00000000 amp2ch2       0.58605913 amp2ch3       0.38675582 amp2tot      11.22828072
matrix1  amp2ch1       0.00000000 amp2ch2       0.77314579 amp2ch3       0.51021922 amp2tot      12.51164573
matrix1  amp2ch1       0.00000000 amp2ch2       0.77314579 amp2ch3       0.51021922 amp2tot      13.79501075
matrix1  amp2ch1       0.00000000 amp2ch2       0.58605913 amp2ch3       0.38675582 amp2tot      14.76782570
matrix1  amp2ch1       0.00000000 amp2ch2       0.58605913 amp2ch3       0.38675582 amp2tot      15.74064066
matrix1  amp2ch1       0.00000000 amp2ch2       1.44512113 amp2ch3       0.95367341 amp2tot      18.13943519
matrix1  amp2ch1       0.02670106 amp2ch2       0.05168202 amp2ch3       0.23119429 amp2tot      18.44901256
matrix1  amp2ch1       0.02670106 amp2ch2       4.50704307 amp2ch3       2.04053476 amp2tot      25.02329146
 ch    1     amp2ch       0.10680425  amp2sumch       0.10680425    getchcut      1.00000000
 ch    2     amp2ch      15.89822055  amp2sumch      16.00502480    getchcut      1.00000000
 ch    3     amp2ch       9.01826666  amp2sumch      25.02329146    getchcut      1.00000000
  ForCh1      0.00191283  = ans     114.72878283   /iden     256* amp2ch      0.10680425/amp2sum     25.02329146
  ForCh1      0.00191283 =ForCh0(ans/iden)      0.44815931* amp2ch      0.10680425/amp2sum     25.02329146
 ch    1     amp2ch       2.21715024  amp2sumch       2.21715024    getchcut      1.00000000
 ch    2     amp2ch     229.54191357  amp2sumch     231.75906382    getchcut      1.00000000
 ch    3     amp2ch       1.30668433  amp2sumch     233.06574815    getchcut      1.00000000
  ForCh1      0.03914068  = ans    1053.29889890   /iden     256* amp2ch      2.21715024/amp2sum    233.06574815
  ForCh1      0.03914068 =ForCh0(ans/iden)      4.11444882* amp2ch      2.21715024/amp2sum    233.06574815
Event #0 MEch1=-nan = MEch0 15.1706 * num 0 / den 0
Event #1 MEch1=-nan = MEch0 1.192 * num 0 / den 0
Event #2 MEch1=-nan = MEch0 0.448159 * num 0 / den 0
Event #3 MEch1=-nan = MEch0 4.1142 * num 0 / den 0
ievt0=2, diag1, amp2=0.0267011, sumamp2(denom)=0.0267011
ievt0=2, diag2, amp2=4.50701, sumamp2(denom)=4.53371
ievt0=2, diag3, amp2=2.04053, sumamp2(denom)=6.57424
ievt0=2, diag1, amp2=1.95355, sumamp2(denom)=8.52779
ievt0=2, diag2, amp2=1.95354, sumamp2(denom)=10.4813
ievt0=2, diag3, amp2=1.95354, sumamp2(denom)=12.4349
ievt0=2, diag1, amp2=1.95355, sumamp2(denom)=14.3884
ievt0=2, diag2, amp2=1.95354, sumamp2(denom)=16.342
ievt0=2, diag3, amp2=1.95354, sumamp2(denom)=18.2955
ievt0=2, diag1, amp2=0.0267011, sumamp2(denom)=18.3222
ievt0=2, diag2, amp2=0.0516816, sumamp2(denom)=18.3739
ievt0=2, diag3, amp2=0.231193, sumamp2(denom)=18.6051
ievt0=2, diag1, amp2=0, sumamp2(denom)=18.6051
ievt0=2, diag2, amp2=0.586055, sumamp2(denom)=19.1911
ievt0=2, diag3, amp2=0.386754, sumamp2(denom)=19.5779

These two sequences are the same for the first helicity, but completely different for the second one?

Fortran
matrix1  amp2ch1       0.02670106 amp2ch2       4.50704307 amp2ch3       2.04053476 amp2tot       6.57427890
matrix1  amp2ch1       0.02670106 amp2ch2       0.05168202 amp2ch3       0.23119429 amp2tot       6.88385627

Cpp
ievt0=2, diag1, amp2=0.0267011, sumamp2(denom)=0.0267011
ievt0=2, diag2, amp2=4.50701, sumamp2(denom)=4.53371
ievt0=2, diag3, amp2=2.04053, sumamp2(denom)=6.57424
ievt0=2, diag1, amp2=1.95355, sumamp2(denom)=8.52779
ievt0=2, diag2, amp2=1.95354, sumamp2(denom)=10.4813
ievt0=2, diag3, amp2=1.95354, sumamp2(denom)=12.4349
jtchilders pushed a commit to jtchilders/madgraph4gpu that referenced this issue Nov 15, 2022
…o_main_tmp

Br golden epoch x4 to main tmp
valassi pushed a commit to valassi/madgraph4gpu that referenced this issue Jul 13, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 17, 2024
…#845 in log_gqttq_mad_f_inl0_hrd0.txt, the rest as expected

STARTED  AT Thu May 16 01:24:16 AM CEST 2024
(SM tests)
ENDED(1) AT Thu May 16 05:58:45 AM CEST 2024 [Status=0]
(BSM tests)
ENDED(1) AT Thu May 16 06:07:42 AM CEST 2024 [Status=0]

24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt
18 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_d_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_m_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_d_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_f_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_m_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_d_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_f_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_m_inl0_hrd0.txt

The new issue #845 is the following:
+Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
+
+Backtrace for this error:
+#0  0x7f2a1a623860 in ???
+#1  0x7f2a1a622a05 in ???
+#2  0x7f2a1a254def in ???
+#3  0x7f2a1ae20acc in ???
+#4  0x7f2a1acc4575 in ???
+#5  0x7f2a1ae1d4c9 in ???
+#6  0x7f2a1ae2570d in ???
+#7  0x7f2a1ae2afa1 in ???
+#8  0x43008b in ???
+#9  0x431c10 in ???
+#10  0x432d47 in ???
+#11  0x433b1e in ???
+#12  0x44a921 in ???
+#13  0x42ebbf in ???
+#14  0x40371e in ???
+#15  0x7f2a1a23feaf in ???
+#16  0x7f2a1a23ff5f in ???
+#17  0x403844 in ???
+#18  0xffffffffffffffff in ???
+./madX.sh: line 379: 3004240 Floating point exception(core dumped) $timecmd $cmd < ${tmpin} > ${tmp}
+ERROR! ' ./build.512z_f_inl0_hrd0/madevent_cpp < /tmp/avalassi/input_gqttq_x10_cudacpp > /tmp/avalassi/output_gqttq_x10_cudacpp' failed
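Note that the SIGFPE here is a trapped floating-point exception: when a build enables FP traps (gfortran's -ffpe-trap on the Fortran side, or glibc's feenableexcept in C/C++), an invalid operation such as 0/0 kills the process with a core dump instead of silently propagating a nan. A minimal sketch of that mechanism (glibc-specific; an illustration, not the project's actual build flags):

#include <fenv.h>  // feenableexcept is a glibc extension
#include <cstdio>

int main()
{
  feenableexcept( FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW ); // trap instead of producing nan/inf
  volatile double zero = 0;
  const double bad = zero / zero; // FE_INVALID -> SIGFPE "erroneous arithmetic operation"
  printf( "%f\n", bad );          // never reached once trapping is enabled
  return 0;
}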