API(4) Move HelAmps to the latest MemoryAccess classes #322
…lAmps - none check/gcheck builds (Still to do: testxxx port and SIMD vector types)
ccache /cvmfs/sft.cern.ch/lcg/releases/gcc/10.2.0-c44b3/x86_64-centos7/bin/g++ -O3 -std=c++17 -I. -I../../src -I../../../../../tools -I../../../../../tools -I../../../../../test/googletest/googletest/include -DUSE_NVTX -Wall -Wshadow -Wextra -ffast-math -DMGONGPU_FPTYPE_DOUBLE -I/usr/local/cuda-11.1/include/ -c testxxx.cc -o testxxx.o
In file included from testxxx.cc:5:
../../src/HelAmps_sm.h:89:8: warning: inline function ‘void MG5_sm::imzxxx(const fptype_sv*, int, int, cxtype_sv*, int) [with M_ACCESS = KernelAccessMomenta<false>; fptype_sv = double; cxtype_sv = std::complex<double>]’ used but never defined
   89 | void imzxxx( const fptype_sv* momenta,
/cvmfs/sft.cern.ch/lcg/releases/binutils/2.34-990b2/x86_64-centos7/bin/ld: ./testxxx.o: in function `SIGMA_SM_EPEM_MUPMUM_CPU_testxxx_Test::TestBody()':
testxxx.cc:(.text+0xdd76): undefined reference to `void MG5_sm::imzxxx<KernelAccessMomenta<false> >(double const*, int, int, std::complex<double>*, int)'
…t test fails at runtime
testxxx.cc:187: Failure
The difference between cxreal( wf[iw6] ) and expReal is 1000,
which exceeds std::abs( expReal * toleranceXXXs ), where
cxreal( wf[iw6] ) evaluates to -500,
expReal evaluates to 500, and
std::abs( expReal * toleranceXXXs ) evaluates to 4.9999999999999999e-13.
itest=12: imzxxx#1 against ixxxxx
…issing (was always testing event 0)
…akes fptype* as input, not fptype_sv*
… const - check.exe ok, runTest segfaults
…ort unaligned/arbitrary arrays)
…gfault), check is slow 4.40E6 512y?
(NB runTest still segfaults if the checks are skipped)
…gM*neppM, not ievt! - runTest now ok
… ok, cuda perf ok, SIMD 10% slower... (NB cuda performance is definitely not affected, one test even was 2% faster than the previous reference)
…2022 - eemumu performance fluctuations (Note in particular a 2% fluctuation in the cuda results)
…cover apirambo performance (Note there are still reproducible 2% fluctuations in both cuda and c++, but today I get them in apirambo too)
…events have same initial momenta (Hence the MEs are all 0 and the tests fail)
Revert "[apihel] experiment with noinline keyword - gives build warnings, performance slightly worse than expected"
This reverts commit 63ca571.
Revert "[apihel] attempt another fix, define INLINE as inline in all cases - builds, but affects performance"
This reverts commit 4f9089e.
Revert "[apihel] first fix to move XXX function implementation to cc file - testxxx build still fails"
This reverts commit f594600.
Revert "[apihel] try again to move XXX function implementation to cc file - testxxx build fails"
This reverts commit a570168.
…eemumu auto - complete CPPProcess.cc
…eemumu auto - complete CPPProcess.h
…re identical (NB moved XXX function implementation earlier on in HelAmps.h manual too)
… bit faster as observed for double
…hst random) - all ok
…lightly faster, 512y/z slightly slower
…er, 512y/z MUCH slower??
… all looks the same as before?)
…ote ME values have changed?!
…th hrd1 - all ok, ~same perf
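The build failure quoted in the commit list above ("inline function ... used but never defined", followed by an undefined reference to `imzxxx<KernelAccessMomenta<false> >`), together with the reverted attempts to move the XXX implementations to a .cc file, is consistent with the usual C++ rule that a function template's definition must be visible wherever it is instantiated. The following is a minimal sketch of that rule only, not the project's code: `ToyAccessMomenta`, `toyxxx` and `runToy` are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <complex>

// Hypothetical stand-ins for the project's floating point and complex types
using fptype = double;
using cxtype = std::complex<fptype>;

// Hypothetical access policy: reads component ip4 of one particle's momentum
struct ToyAccessMomenta
{
  static fptype ip4Const( const fptype* momenta, int ip4 )
  {
    assert( ip4 >= 0 && ip4 < 4 );
    return momenta[ip4];
  }
};

namespace toy
{
  // Declaration, as it might appear near the top of a header
  template<class M_ACCESS>
  inline void toyxxx( const fptype* momenta, cxtype* wf );

  // Definition kept in the same header: every translation unit that
  // instantiates toyxxx<M_ACCESS> can see the body. If this body lived
  // only in a separate .cc file (without an explicit instantiation there),
  // callers would compile but fail to link.
  template<class M_ACCESS>
  inline void toyxxx( const fptype* momenta, cxtype* wf )
  {
    const fptype e = M_ACCESS::ip4Const( momenta, 0 );
    const fptype pz = M_ACCESS::ip4Const( momenta, 3 );
    wf[0] = cxtype( e, pz ); // toy "wavefunction", no physics meaning
  }
}

// Caller in the same translation unit: instantiates the template
inline cxtype runToy()
{
  const fptype p[4] = { 500., 0., 0., -500. };
  cxtype wf[1];
  toy::toyxxx<ToyAccessMomenta>( p, wf );
  return wf[0];
}
```

Keeping the template bodies in the header matches the note in the commit list that the XXX function implementations were moved earlier on in HelAmps.h, rather than into a .cc file.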
This is now complete.
This is where I stripped off what was initially "the second part of apirambo" PR #321. Essentially, in apirambo I have the new MemoryAccessMomenta for rambo, but still the old MemoryAccessMomenta for the MEs in HelAmps. This new PR is about porting the new MemoryAccessMomenta to HelAmps, too. Note that new BufferMEs and MemoryAccessMEs will be created in the upcoming apimes PR.
With respect to the previous WIP
With respect to the other points that I mentioned to be done
In summary, pending:
This completes the "API4" step described in #323
This was presented at the meeting today
I have decided to strip off what was initially "the second part of apirambo" PR #321.
Essentially, in apirambo I have the new MemoryAccess for rambo, but still the old MemoryAccess for MEs. This new PR is about porting the new MemoryAccess to MEs, and specifically to HelAmps, too.
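As a rough illustration of what "porting MemoryAccess to HelAmps" can mean, the sketch below templates a kernel-like function on an access class that hides the buffer layout, in the spirit of the `M_ACCESS` template parameter visible in the build log. Everything here is an assumption made for illustration: the class and function names are hypothetical, and the AOSOA layout `[ipagM][ipar][ip4][ieppM]` with `ievt = ipagM * neppM + ieppM` is assumed, not taken from the PR.

```cpp
#include <cassert>
#include <vector>

using fptype = double;

// Hypothetical layout constants (assumed, not the project's values)
constexpr int np4 = 4;   // components per momentum (E, px, py, pz)
constexpr int npar = 4;  // particles per event
constexpr int neppM = 4; // events per "page" in the assumed AOSOA layout

// Hypothetical access class: decodes (ievt, ip4, ipar) into a buffer index
class ToyMemoryAccessMomenta
{
public:
  static const fptype& ieventIp4Ipar( const fptype* buffer, int ievt, int ip4, int ipar )
  {
    assert( ievt >= 0 && ip4 >= 0 && ip4 < np4 && ipar >= 0 && ipar < npar );
    const int ipagM = ievt / neppM; // page containing this event
    const int ieppM = ievt % neppM; // position of the event within the page
    return buffer[ipagM * npar * np4 * neppM + ipar * np4 * neppM + ip4 * neppM + ieppM];
  }
};

// Kernel-like function templated on the access class: it never touches
// the raw layout, so the layout can change without editing this code
template<class M_ACCESS>
fptype toyEnergySum( const fptype* momenta, int ievt )
{
  fptype sum = 0;
  for( int ipar = 0; ipar < npar; ipar++ )
    sum += M_ACCESS::ieventIp4Ipar( momenta, ievt, 0, ipar ); // ip4=0 is E
  return sum;
}

// Toy buffer for nevt events where each element stores its own flat index
inline std::vector<fptype> makeToyBuffer( int nevt )
{
  assert( nevt > 0 && nevt % neppM == 0 ); // whole pages only
  std::vector<fptype> buffer( nevt * npar * np4 );
  for( std::size_t i = 0; i < buffer.size(); i++ ) buffer[i] = static_cast<fptype>( i );
  return buffer;
}
```

With this design, indexing mistakes of the kind hinted at in the commit list (always reading event 0, or mixing up the page index with the event index) are confined to one place, the access class, instead of being repeated inside every XXX function.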
This is WIP
What remains to be done