Replace helicity loops with individual threads #3

Open
roiser opened this issue Aug 12, 2020 · 1 comment
Labels
idea Possible new development (may need further discussion)

Comments

roiser (Member) commented Aug 12, 2020

  • Looked into by Mat; his last comment was that this showed a 3% improvement (see the sketch below)

  • A code example is in Mat's GitHub
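As a minimal sketch of the idea (illustrative names only, not the actual madgraph4gpu API): instead of each GPU thread looping over all helicity combinations for its event, the grid is enlarged by a factor nhel so that each thread computes a single (event, helicity) pair, and the per-event helicity sum is formed with an atomic add.

// Hypothetical sketch: one thread per (event, helicity) pair instead of a
// per-thread helicity loop. All names are illustrative.
#include <cuda_runtime.h>

__device__ double calcMEforHelicity( const double* momenta, int ievt, int ihel )
{
  return momenta[ievt] * ( 1 + ihel ); // stub for the real per-helicity ME
}

__global__ void sigmaKinHelThreads( const double* momenta, double* meSum,
                                    int nevt, int nhel )
{
  const int tid = blockIdx.x * blockDim.x + threadIdx.x;
  if( tid >= nevt * nhel ) return;
  const int ievt = tid / nhel; // event index
  const int ihel = tid % nhel; // helicity combination index
  const double me = calcMEforHelicity( momenta, ievt, ihel );
  atomicAdd( &meSum[ievt], me ); // helicity sum per event (double atomicAdd needs sm_60+)
}

int main()
{
  const int nevt = 32, nhel = 16;
  double *d_momenta, *d_meSum;
  cudaMalloc( &d_momenta, nevt * sizeof( double ) );
  cudaMalloc( &d_meSum, nevt * sizeof( double ) );
  cudaMemset( d_momenta, 0, nevt * sizeof( double ) ); // dummy input
  cudaMemset( d_meSum, 0, nevt * sizeof( double ) );
  sigmaKinHelThreads<<<( nevt * nhel + 255 ) / 256, 256>>>( d_momenta, d_meSum, nevt, nhel );
  cudaDeviceSynchronize();
  return 0;
}

The atomicAdd serializes the per-event accumulation; presumably the reported ~3% gain means the extra parallelism outweighs that contention for this process.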

roiser added the idea Possible new development (may need further discussion) label Aug 12, 2020
valassi (Member) commented Dec 9, 2020

Note: in the vectorization work #71 and #72, the helicity and event loops get reversed, so this would look very different from how it did at the time of the hackathon. Keeping this open; it may still be useful to investigate for very complex processes. (See the loop-order sketch below.)
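For context, a schematic of the two loop orders (calcME and the bounds are illustrative stand-ins, not the actual madgraph4gpu code):

#include <vector>

// Stub for the per-event, per-helicity matrix element computation.
double calcME( int ievt, int ihel ) { return 1.0 / ( 1 + ievt + ihel ); }

void helicitySum( std::vector<double>& meSum, int nevt, int nhel, bool reversed )
{
  if( !reversed ) // original layout: event loop outside, helicity loop inside
  {
    for( int ievt = 0; ievt < nevt; ievt++ )
      for( int ihel = 0; ihel < nhel; ihel++ )
        meSum[ievt] += calcME( ievt, ihel );
  }
  else // after #71/#72: helicity outside, events (SIMD vectors of events) inside
  {
    for( int ihel = 0; ihel < nhel; ihel++ )
      for( int ievt = 0; ievt < nevt; ievt++ )
        meSum[ievt] += calcME( ievt, ihel );
  }
}

With the reversed order, "one thread per helicity" would parallelize the outer loop, which interacts quite differently with the SIMD event vectors than it did with the original layout.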

valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 23, 2021
…builds.

The build fails on clang10 at compile time:

clang++: /build/gcc/build/contrib/clang-10.0.0/src/clang/10.0.0/tools/clang/lib/CodeGen/CGExpr.cpp:596: clang::CodeGen::RValue clang::CodeGen::CodeGenFunction::EmitReferenceBindingToExpr(const clang::Expr*): Assertion `LV.isSimple()' failed.
Stack dump:
0.      Program arguments: /cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang++ -O3 -std=c++17 -I. -I../../src -I../../../../../tools -DUSE_NVTX -Wall -Wshadow -Wextra -fopenmp -ffast-math -march=skylake-avx512 -mprefer-vector-width=256 -I/usr/local/cuda-11.0/include/ -c CPPProcess.cc -o CPPProcess.o
1.      <eof> parser at end of file
2.      Per-file LLVM IR generation
3.      ../../src/mgOnGpuVectors.h:59:16: Generating code for declaration 'mgOnGpu::cxtype_v::operator[]'
 #0 0x0000000001af5f9a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang+++0x1af5f9a)
 #1 0x0000000001af3d54 llvm::sys::RunSignalHandlers() (/cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang+++0x1af3d54)
 #2 0x0000000001af3fa9 llvm::sys::CleanupOnSignal(unsigned long) (/cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang+++0x1af3fa9)
 #3 0x0000000001a6ed08 CrashRecoverySignalHandler(int) (/cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang+++0x1a6ed08)
 #4 0x00007fd31c178630 __restore_rt (/lib64/libpthread.so.0+0xf630)
 #5 0x00007fd31ac8c3d7 raise (/lib64/libc.so.6+0x363d7)
 #6 0x00007fd31ac8dac8 abort (/lib64/libc.so.6+0x37ac8)
 #7 0x00007fd31ac851a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #8 0x00007fd31ac85252 (/lib64/libc.so.6+0x2f252)
 #9 0x000000000203a042 clang::CodeGen::CodeGenFunction::EmitReferenceBindingToExpr(clang::Expr const*) (/cvmfs/sft.cern.ch/lcg/releases/clang/10.0.0-62e61/x86_64-centos7/bin/clang+++0x203a042)
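The assertion fires in clang's reference-binding code while emitting mgOnGpu::cxtype_v::operator[]. A reduced sketch of the pattern presumably involved (an assumption based on the stack dump, not a verified reproducer): operator[] hands out references bound to individual lanes of a GCC vector-extension type, and it is this reference binding that gcc accepts but clang 10/11 crashes on.

// Reduced sketch only; the real type lives in mgOnGpuVectors.h.
typedef double fptype_v __attribute__(( vector_size( 4 * sizeof( double ) ) ));

struct cxtype_ref
{
  double &m_real, &m_imag; // references into single vector lanes
};

struct cxtype_v
{
  fptype_v m_real, m_imag;
  cxtype_ref operator[]( int i ) { return cxtype_ref{ m_real[i], m_imag[i] }; }
};

int main()
{
  cxtype_v v = {};
  v[1].m_real = 3; // writes lane 1 of v.m_real through the reference
  return (int)v.m_real[1];
}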
valassi added a commit to valassi/madgraph4gpu that referenced this issue Apr 23, 2021
-------------------------------------------------------------------------
Process                     = EPOCH1_EEMUMU_CPP
FP precision                = DOUBLE (NaN/abnormal=0, zero=0 )
Internal loops fptype_sv    = VECTOR[1] ('none': scalar, no SIMD)
MatrixElements compiler     = clang 11.0.0
EvtsPerSec[MatrixElems] (3) = ( 1.263547e+06                 )  sec^-1
MeanMatrixElemValue         = ( 1.372113e-02 +- 3.270608e-06 )  GeV^0
TOTAL       :     7.168746 sec
real    0m7.176s
=Symbols in CPPProcess.o= (~sse4: 1241) (avx2:    0) (512y:    0) (512z:    0)
-------------------------------------------------------------------------
Process                     = EPOCH2_EEMUMU_CPP
FP precision                = DOUBLE (NaN/abnormal=0, zero=0 )
MatrixElements compiler     = clang 11.0.0
EvtsPerSec[MatrixElems] (3) = ( 1.218104e+06                 )  sec^-1
MeanMatrixElemValue         = ( 1.372113e-02 +- 3.270608e-06 )  GeV^0
TOTAL       :     7.455322 sec
real    0m7.463s
=Symbols in CPPProcess.o= (~sse4: 1165) (avx2:    0) (512y:    0) (512z:    0)
-------------------------------------------------------------------------

The build with vectors also still fails on clang11, in the same place:

clang++: /build/dkonst/CONTRIB/build/contrib/clang-11.0.0/src/clang/11.0.0/clang/lib/CodeGen/CGExpr.cpp:613: clang::CodeGen::RValue clang::CodeGen::CodeGenFunction::EmitReferenceBindingToExpr(const clang::Expr*): Assertion `LV.isSimple()' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /cvmfs/sft.cern.ch/lcg/releases/clang/11.0.0-77a9f/x86_64-centos7/bin/clang++ -O3 -std=c++17 -I. -I../../src -I../../../../../tools -Wall -Wshadow -Wextra -DMGONGPU_COMMONRAND_ONHOST -ffast-math -march=skylake-avx512 -mprefer-vector-width=256 -c CPPProcess.cc -o CPPProcess.o
1.      <eof> parser at end of file
2.      Per-file LLVM IR generation
3.      ../../src/mgOnGpuVectors.h:59:16: Generating code for declaration 'mgOnGpu::cxtype_v::operator[]'
 #0 0x0000000001ce208a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/cvmfs/sft.cern.ch/lcg/releases/clang/11.0.0-77a9f/x86_64-centos7/bin/clang+++0x1ce208a)
 #1 0x0000000001cdfe94 llvm::sys::RunSignalHandlers() (/cvmfs/sft.cern.ch/lcg/releases/clang/11.0.0-77a9f/x86_64-centos7/bin/clang+++0x1cdfe94)
 #2 0x0000000001c52d98 CrashRecoverySignalHandler(int) (/cvmfs/sft.cern.ch/lcg/releases/clang/11.0.0-77a9f/x86_64-centos7/bin/clang+++0x1c52d98)
 #3 0x00007f1836000630 __restore_rt (/lib64/libpthread.so.0+0xf630)
 #4 0x00007f18350f13d7 raise (/lib64/libc.so.6+0x363d7)
 #5 0x00007f18350f2ac8 abort (/lib64/libc.so.6+0x37ac8)
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 23, 2022
…ns is different for fcheck

> ./fcheck.exe  2048 64 10
 GPUBLOCKS=          2048
 GPUTHREADS=           64
 NITERATIONS=          10
WARNING! Instantiate host Bridge (nevt=131072)
INFO: The application is built for skylake-avx512 (AVX512VL) and the host supports it
WARNING! Instantiate host Sampler (nevt=131072)
Iteration #1
Iteration #2
Iteration #3
Iteration #4
Iteration #5
Iteration #6
Iteration #7
Iteration #8
Iteration #9
WARNING! flagging abnormal ME for ievt=111162
Iteration #10
 Average Matrix Element:   1.3716954486179133E-002
 Abnormal MEs:           1

> ./check.exe -p  2048 64 10 | grep FLOAT
FP precision                = FLOAT (NaN/abnormal=2, zero=0)

I imagine that this is because momenta in Fortran get translated from float to double and back to float, while in C++ they stay in float? (See the precision sketch below.)
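If that is the cause, the effect is not the conversion itself (a float -> double -> float round trip of a single value is lossless) but the arithmetic done while in double. A minimal illustration (hypothetical inputs, nothing from the actual code):

#include <cstdio>

int main()
{
  // Same float inputs, two paths: float arithmetic throughout (C++-like)
  // vs promotion to double with truncation at the end (Fortran-bridge-like).
  float sumF = 0.0f;
  double sumD = 0.0;
  for( int i = 1; i <= 100000; i++ )
  {
    const float x = 1.0f / (float)i; // identical float input on both paths
    sumF += x;         // accumulate in float
    sumD += (double)x; // accumulate in double
  }
  printf( "float path: %.9g\n", sumF );
  printf( "double path, truncated back: %.9g\n", (float)sumD );
  return 0;
}

The two printed values differ in the last digits, which can be enough to flip a borderline ME into the NaN/abnormal category on one path but not the other.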
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 20, 2022
…failing

patching file Source/dsample.f
Hunk #3 FAILED at 181.
Hunk #4 succeeded at 197 (offset 2 lines).
Hunk #5 FAILED at 211.
Hunk #6 succeeded at 893 (offset 3 lines).
2 out of 6 hunks FAILED -- saving rejects to file Source/dsample.f.rej
patching file SubProcesses/addmothers.f
patching file SubProcesses/cuts.f
patching file SubProcesses/makefile
Hunk #3 FAILED at 61.
Hunk #4 succeeded at 94 (offset 6 lines).
Hunk #5 succeeded at 122 (offset 6 lines).
1 out of 5 hunks FAILED -- saving rejects to file SubProcesses/makefile.rej
patching file SubProcesses/reweight.f
Hunk #1 FAILED at 1782.
Hunk #2 succeeded at 1827 (offset 27 lines).
Hunk #3 succeeded at 1841 (offset 27 lines).
Hunk #4 succeeded at 1963 (offset 27 lines).
1 out of 4 hunks FAILED -- saving rejects to file SubProcesses/reweight.f.rej
patching file auto_dsig.f
Hunk #6 FAILED at 301.
Hunk #10 succeeded at 773 with fuzz 2 (offset 4 lines).
Hunk #11 succeeded at 912 (offset 16 lines).
Hunk #12 succeeded at 958 (offset 16 lines).
Hunk #13 succeeded at 971 (offset 16 lines).
Hunk #14 succeeded at 987 (offset 16 lines).
Hunk #15 succeeded at 1006 (offset 16 lines).
Hunk #16 succeeded at 1019 (offset 16 lines).
1 out of 16 hunks FAILED -- saving rejects to file auto_dsig.f.rej
patching file driver.f
patching file matrix1.f
patching file auto_dsig1.f
Hunk #2 succeeded at 220 (offset 7 lines).
Hunk #3 succeeded at 290 (offset 7 lines).
Hunk #4 succeeded at 453 (offset 8 lines).
Hunk #5 succeeded at 464 (offset 8 lines).
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 23, 2022
./cmadevent_cudacpp < /tmp/avalassi/pippo

with

cat /tmp/avalassi/pippo
4 ! Number of events in a single C++ iteration (nb_page_loop)
4 1 1 ! Number of events and max and min iterations
0.000001 ! Accuracy (ignored because max iterations = min iterations)
0 ! Grid Adjustment 0=none, 2=adjust (NB if = 0, ftn26 will still be used if present)
1 ! Suppress Amplitude 1=yes (i.e. use MadEvent single-diagram enhancement)
0 ! Helicity Sum/event 0=exact
1 ! Channel number (1-N) for single-diagram enhancement multi-channel (NB NOT IGNORED even if suppress amplitude is 0!)

gives

...
 ihe   2  matrix1_ihel     1933.18258519 matrix1_sumhel    1933.18258519
 ihe   3  matrix1_ihel        0.99461559 matrix1_sumhel    1934.17720077
 ihe   5  matrix1_ihel        7.80045209 matrix1_sumhel    1941.97765287
 ihe   6  matrix1_ihel        0.00006202 matrix1_sumhel    1941.97771489
 ihe   7  matrix1_ihel        0.00006202 matrix1_sumhel    1941.97777691
 ihe   8  matrix1_ihel        0.00000007 matrix1_sumhel    1941.97777697
 ihe   9  matrix1_ihel        0.00000007 matrix1_sumhel    1941.97777704
 ihe  10  matrix1_ihel        0.00006202 matrix1_sumhel    1941.97783906
 ihe  11  matrix1_ihel        0.00006202 matrix1_sumhel    1941.97790108
 ihe  12  matrix1_ihel        7.80045209 matrix1_sumhel    1949.77835317
 ihe  14  matrix1_ihel        0.99461559 matrix1_sumhel    1950.77296876
 ihe  15  matrix1_ihel     1933.18258519 matrix1_sumhel    3883.95555395
 ch    1     amp2ch       0.75549842  amp2sumch       0.75549842    getchcut      1.00000000
 ch    2     amp2ch     765.21526350  amp2sumch     765.97076192    getchcut      1.00000000
 ch    3     amp2ch       0.39661006  amp2sumch     766.36737199    getchcut      1.00000000
  ForCh1      0.01495653  = ans    3883.95555395   /iden     256* amp2ch      0.75549842/amp2sum    766.36737199
  ForCh1      0.01495653 =ForCh0(ans/iden)     15.17170138* amp2ch      0.75549842/amp2sum    766.36737199
 ihe   2  matrix1_ihel      145.09951116 matrix1_sumhel     145.09951116
 ihe   3  matrix1_ihel        6.61872017 matrix1_sumhel     151.71823133
 ihe   5  matrix1_ihel        0.86306130 matrix1_sumhel     152.58129263
 ihe   6  matrix1_ihel        0.00070216 matrix1_sumhel     152.58199479
 ihe   7  matrix1_ihel        0.00070216 matrix1_sumhel     152.58269696
 ihe   8  matrix1_ihel        0.00000170 matrix1_sumhel     152.58269865
 ihe   9  matrix1_ihel        0.00000170 matrix1_sumhel     152.58270035
 ihe  10  matrix1_ihel        0.00070216 matrix1_sumhel     152.58340251
 ihe  11  matrix1_ihel        0.00070216 matrix1_sumhel     152.58410468
 ihe  12  matrix1_ihel        0.86306130 matrix1_sumhel     153.44716598
 ihe  14  matrix1_ihel        6.61872017 matrix1_sumhel     160.06588615
 ihe  15  matrix1_ihel      145.09951116 matrix1_sumhel     305.16539731
 ch    1     amp2ch       5.02520604  amp2sumch       5.02520604    getchcut      1.00000000
 ch    2     amp2ch      81.13881210  amp2sumch      86.16401814    getchcut      1.00000000
 ch    3     amp2ch       3.72070900  amp2sumch      89.88472714    getchcut      1.00000000
  ForCh1      0.06664434  = ans     305.16539731   /iden     256* amp2ch      5.02520604/amp2sum     89.88472714
  ForCh1      0.06664434 =ForCh0(ans/iden)      1.19205233* amp2ch      5.02520604/amp2sum     89.88472714
 ihe   2  matrix1_ihel       29.83550845 matrix1_sumhel      29.83550845
 ihe   3  matrix1_ihel        1.18628332 matrix1_sumhel      31.02179178
 ihe   5  matrix1_ihel       11.22829566 matrix1_sumhel      42.25008743
 ihe   6  matrix1_ihel        4.55355964 matrix1_sumhel      46.80364707
 ihe   7  matrix1_ihel        4.55355964 matrix1_sumhel      51.35720670
 ihe   8  matrix1_ihel        6.00718471 matrix1_sumhel      57.36439141
 ihe   9  matrix1_ihel        6.00718471 matrix1_sumhel      63.37157613
 ihe  10  matrix1_ihel        4.55355964 matrix1_sumhel      67.92513576
 ihe  11  matrix1_ihel        4.55355964 matrix1_sumhel      72.47869540
 ihe  12  matrix1_ihel       11.22829566 matrix1_sumhel      83.70699105
 ihe  14  matrix1_ihel        1.18628332 matrix1_sumhel      84.89327437
 ihe  15  matrix1_ihel       29.83550845 matrix1_sumhel     114.72878283
 ch    1     amp2ch       0.10680425  amp2sumch       0.10680425    getchcut      1.00000000
 ch    2     amp2ch      15.89822055  amp2sumch      16.00502480    getchcut      1.00000000
 ch    3     amp2ch       9.01826666  amp2sumch      25.02329146    getchcut      1.00000000
  ForCh1      0.00191283  = ans     114.72878283   /iden     256* amp2ch      0.10680425/amp2sum     25.02329146
  ForCh1      0.00191283 =ForCh0(ans/iden)      0.44815931* amp2ch      0.10680425/amp2sum     25.02329146
 ihe   2  matrix1_ihel      511.63367052 matrix1_sumhel     511.63367052
 ihe   3  matrix1_ihel        2.81068992 matrix1_sumhel     514.44436044
 ihe   5  matrix1_ihel       12.19377592 matrix1_sumhel     526.63813636
 ihe   6  matrix1_ihel        0.00563681 matrix1_sumhel     526.64377318
 ihe   7  matrix1_ihel        0.00563681 matrix1_sumhel     526.64940999
 ihe   8  matrix1_ihel        0.00003946 matrix1_sumhel     526.64944945
 ihe   9  matrix1_ihel        0.00003946 matrix1_sumhel     526.64948891
 ihe  10  matrix1_ihel        0.00563681 matrix1_sumhel     526.65512572
 ihe  11  matrix1_ihel        0.00563681 matrix1_sumhel     526.66076254
 ihe  12  matrix1_ihel       12.19377592 matrix1_sumhel     538.85453846
 ihe  14  matrix1_ihel        2.81068992 matrix1_sumhel     541.66522838
 ihe  15  matrix1_ihel      511.63367052 matrix1_sumhel    1053.29889890
 ch    1     amp2ch       2.21715024  amp2sumch       2.21715024    getchcut      1.00000000
 ch    2     amp2ch     229.54191357  amp2sumch     231.75906382    getchcut      1.00000000
 ch    3     amp2ch       1.30668433  amp2sumch     233.06574815    getchcut      1.00000000
  ForCh1      0.03914068  = ans    1053.29889890   /iden     256* amp2ch      2.21715024/amp2sum    233.06574815
  ForCh1      0.03914068 =ForCh0(ans/iden)      4.11444882* amp2ch      2.21715024/amp2sum    233.06574815
Event #0 MEch1=-nan = MEch0 15.1706 * num 0 / den 0
Event #1 MEch1=-nan = MEch0 1.192 * num 0 / den 0
Event #2 MEch1=-nan = MEch0 0.448159 * num 0 / den 0
Event #3 MEch1=-nan = MEch0 4.1142 * num 0 / den 0
Event #0 MEch1=0.0150196 = MEch0 15.1706 * num 0.758692 / den 766.323
Event #1 MEch1=0.0671816 = MEch0 1.192 * num 5.07394 / den 90.0269
Event #2 MEch1=0.0732451 = MEch0 0.448159 * num 7.92101 / den 48.4657
Event #3 MEch1=0.0402084 = MEch0 4.1142 * num 2.27946 / den 233.239
  Event    1  ForCh1      0.01495653  CppCh1      0.01501956  CppCh0     15.17063351
  Event    2  ForCh1      0.06664434  CppCh1      0.06718160  CppCh0      1.19200151
  Event    3  ForCh1      0.00191283  CppCh1      0.07324507  CppCh0      0.44815874
  Event    4  ForCh1      0.03914068  CppCh1      0.04020841  CppCh0      4.11420001
...

A few observations:
- why does the C++ code loop twice, with the first pass printing nans? (see the sketch below)
- the full MEs without multichannel, dubbed ForCh0 and CppCh0, are in good agreement (this was known)
- luckily the amps and amp sums have the same units...
- for these 4 events, amp2sum i.e. the denominator is 766, 90, 48, 233 in C++ vs 766, 90, 25, 233 in Fortran: why 48 vs 25?!
- for these 4 events, amp2 for ch1 i.e. the numerator is 0.8, 5.1, 7.9, 2.3 in C++ vs 0.8, 5.0, 0.1, 2.2 in Fortran: why 7.9 vs 0.1?
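For reference, the relation being checked above, in sketch form (illustrative names; multichannelME is not the actual function):

#include <cstdio>

// Single-diagram-enhancement reweighting as printed in the logs:
// MEch1 = MEch0 * amp2[ch] / sum_i amp2[i].
double multichannelME( double meCh0, const double* amp2, int ndiag, int ch )
{
  const double num = amp2[ch - 1]; // |amp|^2 of the enhanced channel (1-based)
  double den = 0;
  for( int i = 0; i < ndiag; i++ ) den += amp2[i]; // sum over all diagrams
  // On a pass where the amp2 buffers have not been filled yet, num = den = 0
  // and 0/0 gives -nan, as in the first "Event #0 MEch1=-nan" lines above.
  return meCh0 * num / den;
}

int main()
{
  // Hypothetical amp2 values chosen to echo the first event's printout
  // (num 0.758692, den 766.323).
  const double amp2[3] = { 0.758692, 765.564308, 0.0 };
  printf( "MEch1 = %g\n", multichannelME( 15.1706, amp2, 3, 1 ) ); // ~0.0150196
  return 0;
}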
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 24, 2022
Some clear differences start to emerge

  ForCh1      0.06664434  = ans     305.16539731   /iden     256* amp2ch      5.02520604/amp2sum     89.88472714
  ForCh1      0.06664434 =ForCh0(ans/iden)      1.19205233* amp2ch      5.02520604/amp2sum     89.88472714
matrix1  amp2ch1       0.02670106 amp2ch2       4.50704307 amp2ch3       2.04053476 amp2tot       6.57427890
matrix1  amp2ch1       0.02670106 amp2ch2       0.05168202 amp2ch3       0.23119429 amp2tot       6.88385627
matrix1  amp2ch1       0.00000000 amp2ch2       1.44512113 amp2ch3       0.95367341 amp2tot       9.28265080
matrix1  amp2ch1       0.00000000 amp2ch2       0.58605913 amp2ch3       0.38675582 amp2tot      10.25546576
matrix1  amp2ch1       0.00000000 amp2ch2       0.58605913 amp2ch3       0.38675582 amp2tot      11.22828072
matrix1  amp2ch1       0.00000000 amp2ch2       0.77314579 amp2ch3       0.51021922 amp2tot      12.51164573
matrix1  amp2ch1       0.00000000 amp2ch2       0.77314579 amp2ch3       0.51021922 amp2tot      13.79501075
matrix1  amp2ch1       0.00000000 amp2ch2       0.58605913 amp2ch3       0.38675582 amp2tot      14.76782570
matrix1  amp2ch1       0.00000000 amp2ch2       0.58605913 amp2ch3       0.38675582 amp2tot      15.74064066
matrix1  amp2ch1       0.00000000 amp2ch2       1.44512113 amp2ch3       0.95367341 amp2tot      18.13943519
matrix1  amp2ch1       0.02670106 amp2ch2       0.05168202 amp2ch3       0.23119429 amp2tot      18.44901256
matrix1  amp2ch1       0.02670106 amp2ch2       4.50704307 amp2ch3       2.04053476 amp2tot      25.02329146
 ch    1     amp2ch       0.10680425  amp2sumch       0.10680425    getchcut      1.00000000
 ch    2     amp2ch      15.89822055  amp2sumch      16.00502480    getchcut      1.00000000
 ch    3     amp2ch       9.01826666  amp2sumch      25.02329146    getchcut      1.00000000
  ForCh1      0.00191283  = ans     114.72878283   /iden     256* amp2ch      0.10680425/amp2sum     25.02329146
  ForCh1      0.00191283 =ForCh0(ans/iden)      0.44815931* amp2ch      0.10680425/amp2sum     25.02329146
 ch    1     amp2ch       2.21715024  amp2sumch       2.21715024    getchcut      1.00000000
 ch    2     amp2ch     229.54191357  amp2sumch     231.75906382    getchcut      1.00000000
 ch    3     amp2ch       1.30668433  amp2sumch     233.06574815    getchcut      1.00000000
  ForCh1      0.03914068  = ans    1053.29889890   /iden     256* amp2ch      2.21715024/amp2sum    233.06574815
  ForCh1      0.03914068 =ForCh0(ans/iden)      4.11444882* amp2ch      2.21715024/amp2sum    233.06574815
Event #0 MEch1=-nan = MEch0 15.1706 * num 0 / den 0
Event #1 MEch1=-nan = MEch0 1.192 * num 0 / den 0
Event #2 MEch1=-nan = MEch0 0.448159 * num 0 / den 0
Event #3 MEch1=-nan = MEch0 4.1142 * num 0 / den 0
ievt0=2, diag1, amp2=0.0267011, sumamp2(denom)=0.0267011
ievt0=2, diag2, amp2=4.50701, sumamp2(denom)=4.53371
ievt0=2, diag3, amp2=2.04053, sumamp2(denom)=6.57424
ievt0=2, diag1, amp2=1.95355, sumamp2(denom)=8.52779
ievt0=2, diag2, amp2=1.95354, sumamp2(denom)=10.4813
ievt0=2, diag3, amp2=1.95354, sumamp2(denom)=12.4349
ievt0=2, diag1, amp2=1.95355, sumamp2(denom)=14.3884
ievt0=2, diag2, amp2=1.95354, sumamp2(denom)=16.342
ievt0=2, diag3, amp2=1.95354, sumamp2(denom)=18.2955
ievt0=2, diag1, amp2=0.0267011, sumamp2(denom)=18.3222
ievt0=2, diag2, amp2=0.0516816, sumamp2(denom)=18.3739
ievt0=2, diag3, amp2=0.231193, sumamp2(denom)=18.6051
ievt0=2, diag1, amp2=0, sumamp2(denom)=18.6051
ievt0=2, diag2, amp2=0.586055, sumamp2(denom)=19.1911
ievt0=2, diag3, amp2=0.386754, sumamp2(denom)=19.5779

These two sequences are the same for the first helicity, but completely different for the second one?

Fortran
matrix1  amp2ch1       0.02670106 amp2ch2       4.50704307 amp2ch3       2.04053476 amp2tot       6.57427890
matrix1  amp2ch1       0.02670106 amp2ch2       0.05168202 amp2ch3       0.23119429 amp2tot       6.88385627

Cpp
ievt0=2, diag1, amp2=0.0267011, sumamp2(denom)=0.0267011
ievt0=2, diag2, amp2=4.50701, sumamp2(denom)=4.53371
ievt0=2, diag3, amp2=2.04053, sumamp2(denom)=6.57424
ievt0=2, diag1, amp2=1.95355, sumamp2(denom)=8.52779
ievt0=2, diag2, amp2=1.95354, sumamp2(denom)=10.4813
ievt0=2, diag3, amp2=1.95354, sumamp2(denom)=12.4349
jtchilders pushed a commit to jtchilders/madgraph4gpu that referenced this issue Nov 15, 2022
…o_main_tmp

Br golden epoch x4 to main tmp
valassi pushed a commit to valassi/madgraph4gpu that referenced this issue Jul 13, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this issue May 17, 2024
…#845 in log_gqttq_mad_f_inl0_hrd0.txt, the rest as expected

STARTED  AT Thu May 16 01:24:16 AM CEST 2024
(SM tests)
ENDED(1) AT Thu May 16 05:58:45 AM CEST 2024 [Status=0]
(BSM tests)
ENDED(1) AT Thu May 16 06:07:42 AM CEST 2024 [Status=0]

24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt
18 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_d_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_m_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_d_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_f_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_m_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_d_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_f_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_m_inl0_hrd0.txt

The new issue #845 is the following:
+Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
+
+Backtrace for this error:
+#0  0x7f2a1a623860 in ???
+#1  0x7f2a1a622a05 in ???
+#2  0x7f2a1a254def in ???
+#3  0x7f2a1ae20acc in ???
+#4  0x7f2a1acc4575 in ???
+#5  0x7f2a1ae1d4c9 in ???
+#6  0x7f2a1ae2570d in ???
+#7  0x7f2a1ae2afa1 in ???
+#8  0x43008b in ???
+#9  0x431c10 in ???
+#10  0x432d47 in ???
+#11  0x433b1e in ???
+#12  0x44a921 in ???
+#13  0x42ebbf in ???
+#14  0x40371e in ???
+#15  0x7f2a1a23feaf in ???
+#16  0x7f2a1a23ff5f in ???
+#17  0x403844 in ???
+#18  0xffffffffffffffff in ???
+./madX.sh: line 379: 3004240 Floating point exception(core dumped) $timecmd $cmd < ${tmpin} > ${tmp}
+ERROR! ' ./build.512z_f_inl0_hrd0/madevent_cpp < /tmp/avalassi/input_gqttq_x10_cudacpp > /tmp/avalassi/output_gqttq_x10_cudacpp' failed
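Note that the SIGFPE here is a trapped floating-point exception: when a build enables FP traps (gfortran's -ffpe-trap on the Fortran side, or glibc's feenableexcept in C/C++), an invalid operation such as 0/0 kills the process with a core dump instead of silently propagating a nan. A minimal sketch of that mechanism (glibc-specific; an illustration, not the project's actual build flags):

#include <fenv.h>  // feenableexcept is a glibc extension
#include <cstdio>

int main()
{
  feenableexcept( FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW ); // trap instead of producing nan/inf
  volatile double zero = 0;
  const double bad = zero / zero; // FE_INVALID -> SIGFPE "erroneous arithmetic operation"
  printf( "%f\n", bad );          // never reached once trapping is enabled
  return 0;
}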