Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for iconfig-channel mappings in coloramps.h (obsolete, will be closed) #873

Closed
wants to merge 169 commits into from

Conversation

valassi
Copy link
Member

@valassi valassi commented Jun 28, 2024

This is a clean WIP MR for iconfig-color mapping issues in coloramps.h. It starts from the current upstream/master, which already includes all relevant changes for rotxxx crashes.

It includes all of Olivier's commits in PR #852, as well as the relevant commits from my PR #853. It also supersedes the tests in WIP PR #870.

It is meant to address #856 (lhe color mismatches in ggttgg) and related issues, like possibly #845.

It is NOT meant to address #826 (no cross section in susy) and has most likely nothing to do with 826 (or susy), even if it includes fragments from 'fix_826" and 'gpucpp_826' and 'susy' branches.

…d for push/manual, disabled for PRs)

Note: the FPE crashes in madgraph5#783 are not shown here because they need FPTYPE=f builds.
I will add those in a more complex workflow with one codegen job and several build/test jobs.
…t into two separate jobs, and add a codegen cache (which is really a compulsory build artifact)
…eat the build/test jobs twice (for FPTYPE=d,f)

This must be cleaned up
- the cache cleanup job must be split up (codegen cache cleanup once, build cache cleanup once per build type)
- the Process+fptype tag must become a more general build tag for caches (eventually add inl, hrdcod)
…also affected by madgraph5#696

[avalassi@itscrd80 gcc11.2/cvmfs] /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/nobm_gg_tt.sa/SubProcesses/P1_Sigma_loop_sm_no_b_mass_gg_ttx> make HRDCOD=1
OMPFLAGS=-fopenmp
AVX=512y
FPTYPE=d
HELINL=0
HRDCOD=1
RNDGEN=hasCurand
Building in BUILDDIR=. for tag=512y_d_inl0_hrd1_hasCurand (USEBUILDDIR is not set)
make -C ../../src  -f cudacpp_src.mk
make[1]: Entering directory '/data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/nobm_gg_tt.sa/src'
AVX=512y
ccache /cvmfs/sft.cern.ch/lcg/releases/gcc/11.2.0-ad950/x86_64-centos8/bin/g++  -O3  -std=c++17 -I.  -fPIC -Wall -Wshadow -Wextra -ffast-math  -fopenmp -march=skylake-avx512 -mprefer-vector-width=256  -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -DMGONGPU_HARDCODE_PARAM -c Parameters_loop_sm_no_b_mass.cc -o Parameters_loop_sm_no_b_mass.o
In file included from Parameters_loop_sm_no_b_mass.cc:15:
Parameters_loop_sm_no_b_mass.h: In function ‘const Parameters_loop_sm_no_b_mass_dependentCouplings::DependentCouplings_sv Parameters_loop_sm_no_b_mass_dependentCouplings::computeDependentCouplings_fromG(const fptype_sv&)’:
Parameters_loop_sm_no_b_mass.h:291:46: error: ‘COND’ was not declared in this scope
  291 |       const fptype_sv mdl_GWcft_UV_t_1EPS_ = COND( mdl_MT, 0., -( ( mdl_G__exp__2 ) / ( 2. * 48. * ( ( M_PI ) * ( M_PI ) ) ) ) * 4. * mdl_TF );
      |                                              ^~~~
Parameters_loop_sm_no_b_mass.h:300:138: error: ‘reglog’ was not declared in this scope
  300 |       const fptype_sv mdl_G_UVt_FIN_ = COND( mdl_MT, 0., -( ( mdl_G__exp__2 ) / ( 2. * 48. * ( ( M_PI ) * ( M_PI ) ) ) ) * 4. * mdl_TF * reglog( mdl_MT__exp__2 / mdl_MU_R__exp__2 ) );
      |                                                                                                                                          ^~~~~~
make[1]: *** [cudacpp_src.mk:241: Parameters_loop_sm_no_b_mass.o] Error 1
make[1]: Leaving directory '/data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/nobm_gg_tt.sa/src'
make: *** [makefile:520: ../../lib/libmg5amc_common.so] Error 2
…xt (for debugging madgraph5#701)

  cp dump_SIGMA_SM_NO_B_MASS_GD_TTXWMU_CPU_MadgraphTest.CompareMomentaAndME_0.txt ../../../CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/test/ref/dump_CPUTest.Sigma_sm_no_b_mass_gd_ttxwmu.txt

This is necessary because runTest was failing otherwise
 pushd nobm_pp_ttW.mad/SubProcesses/P1_gd_ttxwmu
 make cleanall; HRDCOD=1 make -j
 ./runTest.exe
Before this succeeds however, it is necessary to rebuild
…p_ttW: results have changed and seem more correct...

INFO: The application is built for skylake-avx512 (AVX512VL) and the host supports it
[  FAILED  ] SIGMA_SM_NO_B_MASS_GD_TTXWMU_CPU/MadgraphTest.CompareMomentaAndME/0, where GetParam() = 0x7ac410 (10 ms)
[----------] 1 test from SIGMA_SM_NO_B_MASS_GD_TTXWMU_CPU/MadgraphTest (10 ms total)

[----------] 1 test from SIGMA_SM_NO_B_MASS_GD_TTXWMU_GPU/MadgraphTest
[ RUN      ] SIGMA_SM_NO_B_MASS_GD_TTXWMU_GPU/MadgraphTest.CompareMomentaAndME/0
INFO: Opening reference file ../../test/ref/dump_CPUTest.Sigma_sm_no_b_mass_gd_ttxwmu.txt
MadgraphTest.h:299: Failure
The difference between testDriver->getMatrixElement( ievt ) and referenceData[iiter].MEs[ievt] is 1.4553189634594381e-10, which exceeds toleranceMEs * referenceData[iiter].MEs[ievt], where
testDriver->getMatrixElement( ievt ) evaluates to 1.4553189634594381e-10,
referenceData[iiter].MEs[ievt] evaluates to 0, and
toleranceMEs * referenceData[iiter].MEs[ievt] evaluates to 0.
Google Test trace:
MadgraphTest.h:278: In comparing event 0 from iteration 0
   0  7.500000000000000e+02  0.000000000000000e+00  0.000000000000000e+00  7.500000000000000e+02
ref0  7.500000000000000e+02  0.000000000000000e+00  0.000000000000000e+00  7.500000000000000e+02

   1  7.500000000000000e+02  0.000000000000000e+00  0.000000000000000e+00 -7.500000000000000e+02
ref1  7.500000000000000e+02  0.000000000000000e+00  0.000000000000000e+00 -7.500000000000000e+02

   2  2.045233209356228e+02  6.877986897204741e+01 -1.905381248013139e+02  2.818406336784427e+01
ref2  2.045233209356227e+02  6.877986897204741e+01 -1.905381248013139e+02  2.818406336784428e+01

   3  5.474933604313479e+02 -4.596225360107567e+02  3.030720946352406e+01  2.959350894402092e+02
ref3  5.474933604313477e+02 -4.596225360107564e+02  3.030720946352398e+01  2.959350894402091e+02

   4  5.014688717565998e+02  4.188441856206845e+02  2.572754903817052e+02 -9.924666020293013e+01
ref4  5.014688717565996e+02  4.188441856206844e+02  2.572754903817050e+02 -9.924666020293004e+01

   5  2.465144468764298e+02 -2.800151858197540e+01 -9.704457504391526e+01 -2.248724926051235e+02
ref5  2.465144468764297e+02 -2.800151858197538e+01 -9.704457504391526e+01 -2.248724926051234e+02

  ME  1.455318963459438e-10
r.ME  0.000000000000000e+00

[  FAILED  ] SIGMA_SM_NO_B_MASS_GD_TTXWMU_GPU/MadgraphTest.CompareMomentaAndME/0, where GetParam() = 0x7c5f20 (37 ms)
[----------] 1 test from SIGMA_SM_NO_B_MASS_GD_TTXWMU_GPU/MadgraphTest (37 ms total)
CUDACPP_RUNTEST_DUMPEVENTS=1 ./runTest.exe ; mv dump_CPUTest* ../../../CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/test/ref/
…h, add some debug printouts about comparison of generated code
….yml, disable on:push triggers to avoid launching two jobs instead of one
…h, fix a bash bug and disable comparisons to the existing repo
…bleFPE (which already exists in testsuite_oneprocess)
…false (e.g. do not stop double jobs if float has failed)
Fix conflicts:
	.github/workflows/testsuite_allprocesses.yml
	.github/workflows/testsuite_oneprocess.yml
	epochX/cudacpp/CODEGEN/generateAndCompare.sh
…ot channelId (and note that iconfig=1 is ok)
… test a different iconfig

In particular: the following triggers a SIGFPE reported in madgraph5#855 (crash in rotxxx that can be fixed adding volatile?)
  ./tmad/madX.sh -ggttgg -iconfig 104 -makeclean

This also triggers a similar SIGFPE (initially reported in madgraph5#826)
  ./tmad/madX.sh -susyggt1t1 -iconfig 2 -makeclean
…SIGFPE madgraph5#855, and add volatile in aloha_functions.f to try to fix it

The SIGFPE crash madgraph5#855 does seem to disappear in
  ./tmad/madX.sh -ggttgg -iconfig 104 -makeclean
However, there is now a DIFFERENT issue, an lhe file mismatch between fortran and cpp (madgraph5#856)
This is probably due to the iconfig/channel mapping issue reported by Olivier in madgraph5#852
…ebug SIGFPE madgraph5#855, and add volatile in aloha_functions.f to try to fix it

The SIGFPE crash madgraph5#855 does seem to disappear in
  ./tmad/madX.sh -susyggt1t1 -iconfig 2 -makeclean
Then no cross section is printed also for this iconfig (same as madgraph5#826 for iconfig 1), but this is a DIFFERENT issue
…: note that SIGFPE madgraph5#855 is still fixed because volatile has been added
…dgraph5#855 in rotxxx

The issue was observed and fixed in gg_ttgg (iconfig 104) and susy_gg_t1t1 (iconfig 2), the backport as usual is from gg_tt

Note that aloha_functions.f is now added to the list of files to include when preparing patch.common

./CODEGEN/generateAndCompare.sh gg_tt --mad --nopatch
git diff --no-ext-diff -R gg_tt.mad/Source/makefile gg_tt.mad/Source/dsample.f gg_tt.mad/Source/DHELAS/aloha_functions.f gg_tt.mad/Source/genps.inc gg_tt.mad/SubProcesses/makefile > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/bin/internal/banner.py gg_tt.mad/bin/internal/gen_ximprove.py gg_tt.mad/bin/internal/madevent_interface.py >> CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
git checkout gg_tt.mad
@valassi valassi requested a review from oliviermattelaer June 28, 2024 16:59
@valassi
Copy link
Member Author

valassi commented Jun 28, 2024

Hi @oliviermattelaer I think that this is ready for review. Can you please have a look? Thanks!

(I still need to run some manual tests before merging, but the PR is ready for review as nothing will really change functionally).

@valassi valassi changed the title WIP: Fixes for iconfig-channel mappings in coloramps.h Fixes for iconfig-channel mappings in coloramps.h Jun 28, 2024
@roiser
Copy link
Member

roiser commented Jun 28, 2024

* no cross section in susy_gg_t1t1 (the infamous [No cross section in SUSY gg_t1t1 log file #826](https://github.com/madgraph5/madgraph4gpu/issues/826), stilll there)

pls note I'm looking into this and I think its understood, will reply later in #826

@valassi
Copy link
Member Author

valassi commented Jun 28, 2024

pls note I'm looking into this and I think its understood, will reply later in #826

Thanks @roiser very good. If the issue (no cross section in susy_gg_t1t1) is actually caused by couplings, maybe discuss the technicalities in #862 (or as you want, I will add a note there if this is the case).

valassi added 4 commits June 29, 2024 11:09
…aph5#856 in tmad)

STARTED  AT Fri Jun 28 06:58:24 PM CEST 2024
./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean
ENDED(1) AT Fri Jun 28 08:31:16 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean
ENDED(2) AT Fri Jun 28 08:46:49 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean
ENDED(3) AT Fri Jun 28 08:54:47 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst
ENDED(4) AT Fri Jun 28 08:57:29 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst
ENDED(5) AT Fri Jun 28 09:00:08 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -common
ENDED(6) AT Fri Jun 28 09:02:53 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -mix -hrd -makej -susyggtt -susyggt1t1 -smeftggtttt -heftggbb -makeclean
ENDED(7) AT Fri Jun 28 09:14:39 PM CEST 2024 [Status=0]
…n heft madgraph5#833, susy madgraph5#826 and also gqttq madgraph5#845 - but ggttgg madgraph5#856 is fixed)

Note two points:
- gqttq madgraph5#845 is normally intermittent, so it is interesting that it showed up here (even without OMP)
- the tmad CI also shows pptt012j madgraph5#872, but I am not running pptt012j tests in the tmad suite yet

STARTED  AT Fri Jun 28 09:14:39 PM CEST 2024
(SM tests)
ENDED(1) AT Sat Jun 29 01:37:39 AM CEST 2024 [Status=0]
(BSM tests)
ENDED(1) AT Sat Jun 29 01:47:20 AM CEST 2024 [Status=0]

24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt
16 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_d_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_m_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_d_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_f_inl0_hrd0.txt
0 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_m_inl0_hrd0.txt
@valassi
Copy link
Member Author

valassi commented Jun 29, 2024

I have completed also all manual tests - this is ready to be merged

@oliviermattelaer oliviermattelaer changed the base branch from master to fix_826 June 29, 2024 12:40
@oliviermattelaer
Copy link
Member

Hi Andrea,

Changing the base since I need to see what you did change compare to my fix_826 branch.
Otherwise this is impossible to review since it includes too many things

@valassi
Copy link
Member Author

valassi commented Jun 29, 2024

Changing the base since I need to see what you did change compare to my fix_826 branch.

Hi Olivier, this is causing conflicts. And fix_826 itself has conflicts to the latest master now I guess.

I spent quite some effort fixing conflicts to make sure this can be merged into master. I will not do this again, sorry.

But if it helps you to review, ok. However when/if we decide to merge, this must merge into master, not into fix_826

Thanks Andrea

@valassi
Copy link
Member Author

valassi commented Jun 29, 2024

PS Please FIRST review mg5amcnlo/mg5amcnlo#115. This #873 depends on that.

@valassi
Copy link
Member Author

valassi commented Jun 29, 2024

Anyway, apart from logs, test scripts etc, the only relevant changes are the MG5AMC upgrade and this

git diff --no-ext-diff origin/color upstream/fix_826 CODEGEN/ > diff
...
diff --git a/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1 b/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
index b90ef84b4..fc56de664 100644
--- a/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
+++ b/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
@@ -157,7 +157,7 @@ index 4fbb8e6ba..f9e2335de 100644
        END
  
 diff --git b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f
-index 1124a9164..27a6e4674 100644
+index 71fbf2b25..0f1d199fc 100644
 --- b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f
 +++ a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f
 @@ -74,13 +74,77 @@ c      common/to_colstats/ncols,ncolflow,ncolalt,ic
@@ -239,7 +239,7 @@ index 1124a9164..27a6e4674 100644
  c
  c     Read process number
  c
-@@ -208,8 +272,33 @@ c      call sample_result(xsec,xerr)
+@@ -207,8 +271,33 @@ c      call sample_result(xsec,xerr)
  c      write(*,*) 'Final xsec: ',xsec
  
        rewind(lun)
@@ -274,7 +274,7 @@ index 1124a9164..27a6e4674 100644
        end
  
  c     $B$ get_user_params $B$ ! tag for MadWeight
-@@ -387,7 +476,7 @@ c
+@@ -386,7 +475,7 @@ c
        fopened=.false.
        tempname=filename 	 
        fine=index(tempname,' ') 	 
diff --git a/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/coloramps.h b/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/coloramps.h
index c8657394c..3d03a6063 100644
--- a/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/coloramps.h
+++ b/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/coloramps.h
@@ -1,38 +1,41 @@
 // Copyright (C) 2020-2024 CERN and UCLouvain.
 // Licensed under the GNU Lesser General Public License (version 3 or later).
 // Created by: A. Valassi (Dec 2022) for the MG5aMC CUDACPP plugin.
-// Further modified by: O. Mattelaer, A. Valassi (2022-2024) for the MG5aMC CUDACPP plugin.
+// Further modified by: A. Valassi (2022-2024) for the MG5aMC CUDACPP plugin.
 
 #ifndef COLORAMPS_H
 #define COLORAMPS_H 1
 
-namespace mgOnGpu /* clang-format off */
+#include <map>
+
+namespace mgOnGpu
 {
   // Summary of numbering and indexing conventions for the relevant concepts (see issue #826 and PR #852)
   // - Diagram number (no variable) in [1, N_diagrams]: all values are allowed (N_diagrams distinct values)
   //   => this number is displayed for information before each block of code in CPPProcess.cc
-  // - Channel number ("channelId" in C, CHANNEL_ID in F) in [1, N_diagrams]: not all values are allowed (N_config <= N_diagrams distinct values)
-  //   => this number (with F indexing as in ps/pdf output) is passed around as an API argument between cudacpp functions
-  //   Note: the old API passes around a single CHANNEL_ID (and uses CHANNEL_ID=0 to indicate no-multichannel mode, but this is not used in coloramps.h),
-  //   while the new API passes around an array of CHANNEL_ID's (and uses a NULL array pointer to indicate no-multichannel mode)
-  // - Channel number in C indexing: "channelIdC" = channelID - 1
-  //   => this number (with C indexing) is used as the index of the channelIdC_to_iconfig array below
-  // - Config number ("iconfig" in C, ICONFIG in F) in [1, N_config]: all values are allowed (N_config <= N_diagrams distinct values)
-  // - Config number in C indexing: "iconfigC" = iconfig - 1
-  //   => this number (with C indexing) is used as the index of the icolamp array below
-
-  // Map channelIdC (in C indexing, i.e. channelId-1) to iconfig (in F indexing)
-  // Note: iconfig=0 indicates invalid values, i.e. channels/diagrams with no single-diagram enhancement in the MadEvent sampling algorithm (presence of 4-point interaction?)
-  // This array has N_diagrams elements, but only N_config <= N_diagrams valid (non-zero) values
-  __device__ constexpr int channelIdC_to_iconfig[%(nb_diag)i] = { // note: a trailing comma in the initializer list is allowed
-%(channelc2iconfig_lines)s
+  // - Channel number (CHANNEL_ID) in [0, N_diagrams]: not all values are allowed (N_config <= N_diagrams distinct values)
+  //   => this number (with indexing like ps/pdf output) is passed around as an API argument between cudacpp functions
+  //   0 is allowed to fallback to no multi-channel mode.
+  // - Channel number in C indexing: "IconfiC", this is the equivalent of the Fortran iconfig
+  //   iconfigC = iconfig -1
+  //   provides a continuous index [0, N_config-1] for array
+  //  iconfigC = ChannelId_to_iconfigC[channelId]
+  //NOTE: All those ordering are event by event specific (with the intent to have those fix within a vector size/wrap   
+  
+  // Map channelId to iconfigC
+  // This array has N_diagrams+1 elements, but only N_config <= N_diagrams valid values
+  // unvalid values are set to -1
+  // The 0 entry is a fall back to still write events even if no multi-channel is setup (wrong color selected in that mode) 
+    __device__ constexpr int channelId_to_iconfigC[%(nb_diag_plus_one)i] = {
+     0, // channelId=0: This value means not multi-channel, color will be wrong anyway -> pick the first
+%(diag_to_channel)s
   };
 
   // Map iconfigC (in C indexing, i.e. iconfig-1) to the set of allowed colors
-  // This array has N_config <= N_diagrams elements
-  __device__ constexpr bool icolamp[%(nb_channel)s][%(nb_color)s] = { // note: a trailing comma in the initializer list is allowed
-%(icolamp_lines)s
-  }; /* clang-format on */
+  // This array has N_config <= N_diagrams elements    
+    __device__ constexpr bool icolamp[%(nb_channel)s][%(nb_color)s] = {
+%(is_LC)s
+  };
 
 }
 #endif // COLORAMPS_H
diff --git a/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc b/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc
index c3c7fd466..2ec052e3a 100644
--- a/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc
+++ b/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc
@@ -4,7 +4,7 @@
 ! Copyright (C) 2020-2024 CERN and UCLouvain.
 ! Licensed under the GNU Lesser General Public License (version 3 or later).
 ! Modified by: A. Valassi (Sep 2021) for the MG5aMC CUDACPP plugin.
-! Further modified by: O. Mattelaer, J. Teig, A. Valassi (2021-2024) for the MG5aMC CUDACPP plugin.
+! Further modified by: J. Teig, A. Valassi (2021-2024) for the MG5aMC CUDACPP plugin.
 !==========================================================================
 
 #include "GpuAbstraction.h"
@@ -72,9 +72,7 @@
     // Event-by-event random choice of color #402
     if( channelId != 0 ) // no event-by-event choice of color if channelId == 0 (fix FPE #783)
     {
-      const unsigned int channelIdC = channelId - 1;                           // channelIdC_to_iconfig in coloramps.h uses the C array indexing starting at 0
-      const unsigned int iconfig = mgOnGpu::channelIdC_to_iconfig[channelIdC]; // map N_diagrams to N_config <= N_diagrams configs (fix LHE color mismatch #856: see also #826, #852, #853)
-      const unsigned int iconfigC = iconfig - 1;                               // icolamp in coloramps.h uses the C array indexing starting at 0
+      const unsigned int iconfigC = mgOnGpu::channelId_to_iconfigC[channelId]; // coloramps.h uses a channel ordering not the diagram id
       fptype targetamp[ncolor] = { 0 };
       for( int icolC = 0; icolC < ncolor; icolC++ )
       {
@@ -117,7 +115,7 @@
     // - firstprivate: give each thread its own copy, and initialise with value from outside
 #define _OMPLIST0 allcouplings, allMEs, allmomenta, allrndcol, allrndhel, allselcol, allselhel, cGoodHel, cNGoodHel, npagV2
 #ifdef MGONGPU_SUPPORTS_MULTICHANNEL
-#define _OMPLIST1 , allDenominators, allNumerators, channelId, mgOnGpu::icolamp, mgOnGpu::channelIdC_to_iconfig
+#define _OMPLIST1 , allDenominators, allNumerators, channelId, mgOnGpu::icolamp, mgOnGpu::channelId_to_iconfigC
 #else
 #define _OMPLIST1
 #endif
@@ -189,9 +187,7 @@
       // Event-by-event random choice of color #402
       if( channelId != 0 ) // no event-by-event choice of color if channelId == 0 (fix FPE #783)
       {
-        const unsigned int channelIdC = channelId - 1;                           // channelIdC_to_iconfig in coloramps.h uses the C array indexing starting at 0
-        const unsigned int iconfig = mgOnGpu::channelIdC_to_iconfig[channelIdC]; // map N_diagrams to N_config <= N_diagrams configs (fix LHE color mismatch #856: see also #826, #852, #853)
-        const unsigned int iconfigC = iconfig - 1;                               // icolamp in coloramps.h uses the C array indexing starting at 0
+        const unsigned int iconfigC = mgOnGpu::channelId_to_iconfigC[channelId]; // coloramps.h uses a channel ordering not the diagram id
         fptype_sv targetamp[ncolor] = { 0 };
         for( int icolC = 0; icolC < ncolor; icolC++ )
         {
@@ -202,6 +198,7 @@
           if( mgOnGpu::icolamp[iconfigC][icolC] ) targetamp[icolC] += jamp2_sv[icolC];
         }
 #if defined MGONGPU_CPPSIMD and defined MGONGPU_FPTYPE_DOUBLE and defined MGONGPU_FPTYPE2_FLOAT
+        const unsigned int iconfigC = mgOnGpu::channelId_to_iconfigC[channelId]; // coloramps.h uses a channel ordering not the diagram id
         fptype_sv targetamp2[ncolor] = { 0 };
         for( int icolC = 0; icolC < ncolor; icolC++ )
         {
diff --git a/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/model_handling.py b/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/model_handling.py
index 5a275b566..a735a80dc 100644
--- a/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/model_handling.py
+++ b/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/model_handling.py
@@ -1548,28 +1548,27 @@ class PLUGIN_OneProcessExporter(PLUGIN_export_cpp.OneProcessExporterGPU):
         # will be smaller than the true number of diagram. This is fine for color
         # but maybe not for something else.
         nb_diag = max(config[0] for config in config_subproc_map)
-        import math
-        ndigits = str(int(math.log10(nb_diag))+1)
         # Output which diagrams correspond ot a channel to get information for valid color
         lines = []
+        # Note: line for index 0 (no multi-channel is hardcoded in the template, so here we fill the array from index 1)
         for diag in range(1, nb_diag+1):
-            channelidf = diag
-            channelidc = channelidf - 1 # C convention 
             if diag in diag_to_iconfig:
                 iconfigf = diag_to_iconfig[diag]
-                iconfigftxt = '%i'%iconfigf
+                iconfigc = iconfigf - 1 # C convention 
+                text = "     %(iconfigc)i, // channelId=%(diag)i (diagram=%(diag)i) --> iconfig=%(iconfigf)i (f77 conv) and iconfigC=%(iconfigc)i (c conv)" 
             else:
-                iconfigf = 0
-                iconfigftxt = 'None'
-            text = '    %(iconfigf)NNNNi, // CHANNEL_ID=%(channelidf)-NNNNi i.e. DIAGRAM=%(diag)-NNNNi --> ICONFIG=%(iconfigftxt)s'.replace('NNNN',ndigits)
-            lines.append(text % {'diag':diag, 'channelidf':channelidf, 'iconfigf':iconfigf, 'iconfigftxt':iconfigftxt})
-        replace_dict['channelc2iconfig_lines'] = '\n'.join(lines)
+                iconfigc = -1
+                iconfigf = -1
+                text = "     -1, // channelId=%(diag)i (diagram=%(diag)i): Not consider as a channel of integration (presence of 4 point interaction?)" 
+            lines.append(text % {'diag': diag, 'iconfigc':  iconfigc, 'iconfigf':iconfigf})  
+
+        replace_dict['diag_to_channel'] = '\n'.join(lines)
 
         if self.include_multi_channel: # NB unnecessary as edit_coloramps is not called otherwise...
             multi_channel = self.get_multi_channel_dictionary(self.matrix_elements[0].get('diagrams'), self.include_multi_channel)
-            replace_dict['icolamp_lines'] = self.get_icolamp_lines(multi_channel, self.matrix_elements[0], 1)
+            replace_dict['is_LC'] = self.get_icolamp_lines(multi_channel, self.matrix_elements[0], 1)
             replace_dict['nb_channel'] = len(multi_channel)
-            replace_dict['nb_diag'] = max(config[0] for config in config_subproc_map)
+            replace_dict['nb_diag_plus_one'] = max(config[0] for config in config_subproc_map)+1
             replace_dict['nb_color'] = max(1,len(self.matrix_elements[0].get('color_basis')))
             
             misc.sprint(multi_channel)
@@ -1577,14 +1576,18 @@ class PLUGIN_OneProcessExporter(PLUGIN_export_cpp.OneProcessExporterGPU):
             #raise Exception
             
             # AV extra formatting (e.g. gg_tt was "{{true,true};,{true,false};,{false,true};};")
-            ###misc.sprint(replace_dict['icolamp_lines'])
-            split = replace_dict['icolamp_lines'].replace('{{','{').replace('};};','}').split(';,')
-            text=', // ICONFIG=%-NNNNi <-- CHANNEL_ID=%i'.replace('NNNN',ndigits)
-            for iconfigc in range(len(split)): 
-                ###misc.sprint(split[iconfigc])
-                split[iconfigc] = '    ' + split[iconfigc].replace(',',', ').replace('true',' true').replace('{','{ ').replace('}',' }')
-                split[iconfigc] += text % (iconfigc+1, iconfig_to_diag[iconfigc+1])
-            replace_dict['icolamp_lines'] = '\n'.join(split)
+            split = replace_dict['is_LC'].split(';,') 
+            misc.sprint(replace_dict['is_LC'])
+            for i in range(len(split)): 
+                misc.sprint(split[i])
+                split[i] = '    ' + split[i].replace(',',', ').replace('{{', '{')
+                misc.sprint(split[i])
+                if '};};' in split[i]:
+                   split[i] = split[i][:-4] + '}, // iconfigC=%i, diag=%i' % (i, iconfig_to_diag[i+1]) 
+                elif 'false' in split[i] or 'true' in split[i]:
+                    split[i] += ', // iconfigC=%i, diag=%i' % (i, iconfig_to_diag[i+1])
+                misc.sprint(split[i])
+            replace_dict['is_LC'] = '\n'.join(split)
             ff.write(template % replace_dict)
         ff.close()
 
diff --git a/epochX/cudacpp/CODEGEN/checkFormatting.sh b/epochX/cudacpp/CODEGEN/checkFormatting.sh
index a6db39255..a016a7958 100755
--- a/epochX/cudacpp/CODEGEN/checkFormatting.sh
+++ b/epochX/cudacpp/CODEGEN/checkFormatting.sh
@@ -37,7 +37,7 @@ function checkProcdir()
   cd $TOPDIR
   if [ ! -d $procdir ]; then echo "ERROR! Directory not found $TOPDIR/$procdir"; exit 1; fi
   # Define the list of files to be checked
-  files=$(\ls $procdir/src/*.cc $procdir/src/*.h $procdir/SubProcesses/*.cc $procdir/SubProcesses/*.h $procdir/SubProcesses/P*/check_sa.cc $procdir/SubProcesses/P*/CPPProcess.cc $procdir/SubProcesses/P*/CPPProcess.h $procdir/SubProcesses/P*/epoch_process_id.h $procdir/SubProcesses/P*/coloramps.h)
+  files=$(\ls $procdir/src/*.cc $procdir/src/*.h $procdir/SubProcesses/*.cc $procdir/SubProcesses/*.h $procdir/SubProcesses/P*/check_sa.cc $procdir/SubProcesses/P*/CPPProcess.cc $procdir/SubProcesses/P*/CPPProcess.h $procdir/SubProcesses/P*/epoch_process_id.h)
   if [ "$files" == "" ]; then echo "ERROR! No files to check found in directory $TOPDIR/$procdir"; exit 1; fi
   # Check each file
   status=0

@valassi
Copy link
Member Author

valassi commented Jun 29, 2024

Essentially I went back to my three step mapping instead of your one step mapping. Given how complex this is (see the fact that even export_cpp and export_v4 disagree due to fortran/cpp indexing in mg5amcnlo/mg5amcnlo#115), I really think this makes it clearer.

And the generated code diffs are these

git diff --no-ext-diff 22328317716dee0b12e99547d4405d7d55a33e21..fc1b5cd75824f44e99253bc82a0dfecf14ab0f81 susy_gg_t1t1.mad/SubProcesses/ > diff
...
diff --git a/epochX/cudacpp/susy_gg_t1t1.mad/SubProcesses/P1_gg_t1t1x/CPPProcess.cc b/epochX/cudacpp/susy_gg_t1t1.mad/SubProcesses/P1_gg_t1t1x/CPPProcess.cc
index 4f46a8c4f..e5f1721f3 100644
--- a/epochX/cudacpp/susy_gg_t1t1.mad/SubProcesses/P1_gg_t1t1x/CPPProcess.cc
+++ b/epochX/cudacpp/susy_gg_t1t1.mad/SubProcesses/P1_gg_t1t1x/CPPProcess.cc
@@ -1009,7 +1009,9 @@ namespace mg5amcCpu
     // Event-by-event random choice of color #402
     if( channelId != 0 ) // no event-by-event choice of color if channelId == 0 (fix FPE #783)
     {
-      const unsigned int iconfigC = mgOnGpu::channelId_to_iconfigC[channelId]; // coloramps.h uses a channel ordering not the diagram id
+      const unsigned int channelIdC = channelId - 1;                           // channelIdC_to_iconfig in coloramps.h uses the C array indexing starting at 0
+      const unsigned int iconfig = mgOnGpu::channelIdC_to_iconfig[channelIdC]; // map N_diagrams to N_config <= N_diagrams configs (fix LHE color mismatch #856: see also #826, #852, #853)
+      const unsigned int iconfigC = iconfig - 1;                               // icolamp in coloramps.h uses the C array indexing starting at 0
       fptype targetamp[ncolor] = { 0 };
       for( int icolC = 0; icolC < ncolor; icolC++ )
       {
@@ -1052,7 +1054,7 @@ namespace mg5amcCpu
     // - firstprivate: give each thread its own copy, and initialise with value from outside
 #define _OMPLIST0 allcouplings, allMEs, allmomenta, allrndcol, allrndhel, allselcol, allselhel, cGoodHel, cNGoodHel, npagV2
 #ifdef MGONGPU_SUPPORTS_MULTICHANNEL
-#define _OMPLIST1 , allDenominators, allNumerators, channelId, mgOnGpu::icolamp, mgOnGpu::channelId_to_iconfigC
+#define _OMPLIST1 , allDenominators, allNumerators, channelId, mgOnGpu::icolamp, mgOnGpu::channelIdC_to_iconfig
 #else
 #define _OMPLIST1
 #endif
@@ -1124,7 +1126,9 @@ namespace mg5amcCpu
       // Event-by-event random choice of color #402
       if( channelId != 0 ) // no event-by-event choice of color if channelId == 0 (fix FPE #783)
       {
-        const unsigned int iconfigC = mgOnGpu::channelId_to_iconfigC[channelId]; // coloramps.h uses a channel ordering not the diagram id
+        const unsigned int channelIdC = channelId - 1;                           // channelIdC_to_iconfig in coloramps.h uses the C array indexing starting at 0
+        const unsigned int iconfig = mgOnGpu::channelIdC_to_iconfig[channelIdC]; // map N_diagrams to N_config <= N_diagrams configs (fix LHE color mismatch #856: see also #826, #852, #853)
+        const unsigned int iconfigC = iconfig - 1;                               // icolamp in coloramps.h uses the C array indexing starting at 0
         fptype_sv targetamp[ncolor] = { 0 };
         for( int icolC = 0; icolC < ncolor; icolC++ )
         {
@@ -1135,7 +1139,6 @@ namespace mg5amcCpu
           if( mgOnGpu::icolamp[iconfigC][icolC] ) targetamp[icolC] += jamp2_sv[icolC];
         }
 #if defined MGONGPU_CPPSIMD and defined MGONGPU_FPTYPE_DOUBLE and defined MGONGPU_FPTYPE2_FLOAT
-        const unsigned int iconfigC = mgOnGpu::channelId_to_iconfigC[channelId]; // coloramps.h uses a channel ordering not the diagram id
         fptype_sv targetamp2[ncolor] = { 0 };
         for( int icolC = 0; icolC < ncolor; icolC++ )
         {
diff --git a/epochX/cudacpp/susy_gg_t1t1.mad/SubProcesses/P1_gg_t1t1x/coloramps.h b/epochX/cudacpp/susy_gg_t1t1.mad/SubProcesses/P1_gg_t1t1x/coloramps.h
index 1a931324b..28a4d6d32 100644
--- a/epochX/cudacpp/susy_gg_t1t1.mad/SubProcesses/P1_gg_t1t1x/coloramps.h
+++ b/epochX/cudacpp/susy_gg_t1t1.mad/SubProcesses/P1_gg_t1t1x/coloramps.h
@@ -1,50 +1,47 @@
 // Copyright (C) 2020-2024 CERN and UCLouvain.
 // Licensed under the GNU Lesser General Public License (version 3 or later).
 // Created by: A. Valassi (Dec 2022) for the MG5aMC CUDACPP plugin.
-// Further modified by: A. Valassi (2022-2024) for the MG5aMC CUDACPP plugin.
+// Further modified by: O. Mattelaer, A. Valassi (2022-2024) for the MG5aMC CUDACPP plugin.
 
 #ifndef COLORAMPS_H
 #define COLORAMPS_H 1
 
-#include <map>
-
-namespace mgOnGpu
+namespace mgOnGpu /* clang-format off */
 {
   // Summary of numbering and indexing conventions for the relevant concepts (see issue #826 and PR #852)
   // - Diagram number (no variable) in [1, N_diagrams]: all values are allowed (N_diagrams distinct values)
   //   => this number is displayed for information before each block of code in CPPProcess.cc
-  // - Channel number (CHANNEL_ID) in [0, N_diagrams]: not all values are allowed (N_config <= N_diagrams distinct values)
-  //   => this number (with indexing like ps/pdf output) is passed around as an API argument between cudacpp functions
-  //   0 is allowed to fallback to no multi-channel mode.
-  // - Channel number in C indexing: "IconfiC", this is the equivalent of the Fortran iconfig
-  //   iconfigC = iconfig -1
-  //   provides a continuous index [0, N_config-1] for array
-  //  iconfigC = ChannelId_to_iconfigC[channelId]
-  //NOTE: All those ordering are event by event specific (with the intent to have those fix within a vector size/wrap   
-  
-  // Map channelId to iconfigC
-  // This array has N_diagrams+1 elements, but only N_config <= N_diagrams valid values
-  // unvalid values are set to -1
-  // The 0 entry is a fall back to still write events even if no multi-channel is setup (wrong color selected in that mode) 
-    __device__ constexpr int channelId_to_iconfigC[7] = {
-     0, // channelId=0: This value means not multi-channel, color will be wrong anyway -> pick the first
-     -1, // channelId=1 (diagram=1): Not consider as a channel of integration (presence of 4 point interaction?)
-     0, // channelId=2 (diagram=2) --> iconfig=1 (f77 conv) and iconfigC=0 (c conv)
-     1, // channelId=3 (diagram=3) --> iconfig=2 (f77 conv) and iconfigC=1 (c conv)
-     2, // channelId=4 (diagram=4) --> iconfig=3 (f77 conv) and iconfigC=2 (c conv)
-     3, // channelId=5 (diagram=5) --> iconfig=4 (f77 conv) and iconfigC=3 (c conv)
-     4, // channelId=6 (diagram=6) --> iconfig=5 (f77 conv) and iconfigC=4 (c conv)
+  // - Channel number ("channelId" in C, CHANNEL_ID in F) in [1, N_diagrams]: not all values are allowed (N_config <= N_diagrams distinct values)
+  //   => this number (with F indexing as in ps/pdf output) is passed around as an API argument between cudacpp functions
+  //   Note: the old API passes around a single CHANNEL_ID (and uses CHANNEL_ID=0 to indicate no-multichannel mode, but this is not used in coloramps.h),
+  //   while the new API passes around an array of CHANNEL_ID's (and uses a NULL array pointer to indicate no-multichannel mode)
+  // - Channel number in C indexing: "channelIdC" = channelID - 1
+  //   => this number (with C indexing) is used as the index of the channelIdC_to_iconfig array below
+  // - Config number ("iconfig" in C, ICONFIG in F) in [1, N_config]: all values are allowed (N_config <= N_diagrams distinct values)
+  // - Config number in C indexing: "iconfigC" = iconfig - 1
+  //   => this number (with C indexing) is used as the index of the icolamp array below
+
+  // Map channelIdC (in C indexing, i.e. channelId-1) to iconfig (in F indexing)
+  // Note: iconfig=0 indicates invalid values, i.e. channels/diagrams with no single-diagram enhancement in the MadEvent sampling algorithm (presence of 4-point interaction?)
+  // This array has N_diagrams elements, but only N_config <= N_diagrams valid (non-zero) values
+  __device__ constexpr int channelIdC_to_iconfig[6] = { // note: a trailing comma in the initializer list is allowed
+    0, // CHANNEL_ID=1 i.e. DIAGRAM=1 --> ICONFIG=None
+    1, // CHANNEL_ID=2 i.e. DIAGRAM=2 --> ICONFIG=1
+    2, // CHANNEL_ID=3 i.e. DIAGRAM=3 --> ICONFIG=2
+    3, // CHANNEL_ID=4 i.e. DIAGRAM=4 --> ICONFIG=3
+    4, // CHANNEL_ID=5 i.e. DIAGRAM=5 --> ICONFIG=4
+    5, // CHANNEL_ID=6 i.e. DIAGRAM=6 --> ICONFIG=5
   };
 
   // Map iconfigC (in C indexing, i.e. iconfig-1) to the set of allowed colors
-  // This array has N_config <= N_diagrams elements    
-    __device__ constexpr bool icolamp[5][2] = {
-    {true, true}, // iconfigC=0, diag=2
-    {true, true}, // iconfigC=1, diag=3
-    {true, false}, // iconfigC=2, diag=4
-    {true, false}, // iconfigC=3, diag=5
-    {false, true}, // iconfigC=4, diag=6
-  };
+  // This array has N_config <= N_diagrams elements
+  __device__ constexpr bool icolamp[5][2] = { // note: a trailing comma in the initializer list is allowed
+    {  true,  true }, // ICONFIG=1 <-- CHANNEL_ID=2
+    {  true, false }, // ICONFIG=2 <-- CHANNEL_ID=3
+    {  true, false }, // ICONFIG=3 <-- CHANNEL_ID=4
+    { false,  true }, // ICONFIG=4 <-- CHANNEL_ID=5
+    { false,  true }, // ICONFIG=5 <-- CHANNEL_ID=6
+  }; /* clang-format on */
 
 }
 #endif // COLORAMPS_H

@valassi
Copy link
Member Author

valassi commented Jun 29, 2024

Essentially I went back to my three step mapping instead of your one step mapping

And this means I remove the first element of the channel-to-config array, to just keep the fortran mappings: essentially this array does the fortran-to-fortran mapping, then some variables handle the -1 translation to C indexing. There are Ndiag+1 (7 here) entries in Olivier's code and Ndiag in mine (6 here), note that only Ndiag mappings are needed, not Ndiag+1.

-  // Map channelId to iconfigC
-  // This array has N_diagrams+1 elements, but only N_config <= N_diagrams valid values
-  // unvalid values are set to -1
-  // The 0 entry is a fall back to still write events even if no multi-channel is setup (wrong color selected in that mode) 
-    __device__ constexpr int channelId_to_iconfigC[7] = {
-     0, // channelId=0: This value means not multi-channel, color will be wrong anyway -> pick the first
-     -1, // channelId=1 (diagram=1): Not consider as a channel of integration (presence of 4 point interaction?)
-     0, // channelId=2 (diagram=2) --> iconfig=1 (f77 conv) and iconfigC=0 (c conv)
-     1, // channelId=3 (diagram=3) --> iconfig=2 (f77 conv) and iconfigC=1 (c conv)
-     2, // channelId=4 (diagram=4) --> iconfig=3 (f77 conv) and iconfigC=2 (c conv)
-     3, // channelId=5 (diagram=5) --> iconfig=4 (f77 conv) and iconfigC=3 (c conv)
-     4, // channelId=6 (diagram=6) --> iconfig=5 (f77 conv) and iconfigC=4 (c conv)
...
+  // Map channelIdC (in C indexing, i.e. channelId-1) to iconfig (in F indexing)
+  // Note: iconfig=0 indicates invalid values, i.e. channels/diagrams with no single-diagram enhancement in the MadEvent sampling algorithm (presence of 4-point interaction?)
+  // This array has N_diagrams elements, but only N_config <= N_diagrams valid (non-zero) values
+  __device__ constexpr int channelIdC_to_iconfig[6] = { // note: a trailing comma in the initializer list is allowed
+    0, // CHANNEL_ID=1 i.e. DIAGRAM=1 --> ICONFIG=None
+    1, // CHANNEL_ID=2 i.e. DIAGRAM=2 --> ICONFIG=1
+    2, // CHANNEL_ID=3 i.e. DIAGRAM=3 --> ICONFIG=2
+    3, // CHANNEL_ID=4 i.e. DIAGRAM=4 --> ICONFIG=3
+    4, // CHANNEL_ID=5 i.e. DIAGRAM=5 --> ICONFIG=4
+    5, // CHANNEL_ID=6 i.e. DIAGRAM=6 --> ICONFIG=5
   };

valassi added a commit to valassi/madgraph4gpu that referenced this pull request Jul 1, 2024
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Jul 1, 2024
…t on Olivier's latest fix_826 commit d23e773

1) Note about Olivier's latest fix_826 commit d23e773

Olivier's 75c05c5 includes his initial 6 commits in fix_826:

git log upstream/master --oneline -n1
  0992927 (upstream/master, origin/color2, origin/actions) Merge pull request madgraph5#857 from valassi/tmad
git log --oneline 0992927..75c05c5
  75c05c5 Merge branch 'master_june24' into fix_826
  92a8284 better comment in coloramps
  2bcea76 trying to fix git issue
  63494ef change to Andrea convention of naming (but removing step variable)
  5b6d065 increase readibility and move from map to array
  41ddc38 fix a issue for omp compilation
  bed2e12 try to fix the segfault on issue 826

Olivier's d23e773 is then a merge of the latest upstream/master in 75c05c5, fixing the MG5AMC conflict by setting it to 74fd166c1

git show d23e773
  Merge: 75c05c5 0992927
  update this branch with andrea fix in master
  diff --cc MG5aMC/mg5amcnlo
   - Subproject commit 10378b3c0971e1a241fd9dc365e592c92d1f13ba
    -Subproject commit f274cab55d5d983c5612ca7ab3417ee796aa1a8c
   ++Subproject commit 74fd166c1e22bde2dfe01b2e001ac3b177628165

2) Note that, in MG5AMC, 74fd166c1 (obsolete branch gpucpp_826) is the same as 09c96dd17 (branch gpucpp):

git diff 74fd166c1 09c96dd17
  [NO DIFF]

git log --oneline e428e38c6..09c96dd17
  09c96dd17 (origin/gpucpp) allow for second exporter to have access to all variable used in the fortran exporter
  9abf6a3ad Merge pull request madgraph5#113 from valassi/valassi_volatile
  f274cab55 (ghav/valassi_volatile, valassi_volatile) Workaround for SIGFPE crashes in function rotxxx (madgraph5#855): add 'volatile' to prevent optimizations
  0b8678984 Merge pull request madgraph5#112 from valassi/valassi_uninitialised111
  18696c1cf Merge pull request madgraph5#110 from valassi/valassi_leak109
  4f8fbb7f3 (ghav/valassi_uninitialised111) Workaround for issue madgraph5#111 reported by valgrind (initialise goodjet array in function setclscales in reweight.f)
  f6d90fa58 (ghav/valassi_leak109, valassi_leak109) Fix memory leak madgraph5#109 in madevent_driver.f (close file dname.mg)
  f9f957918 (valgrind) Fix validity time check for UFO pickle (madgraph5#97)
  619f5db45 avoid that some parameter switch type when loading model

git log --oneline e428e38c6..74fd166c
  74fd166c1 (HEAD, origin/gpucpp_826, gpucpp_826) Merge remote-tracking branch 'origin/gpucpp' (PR madgraph5#113 for madgraph5#855 crash in rotxxx) into gpucpp_826
  9abf6a3ad Merge pull request madgraph5#113 from valassi/valassi_volatile
  f274cab55 (ghav/valassi_volatile, valassi_volatile) Workaround for SIGFPE crashes in function rotxxx (madgraph5#855): add 'volatile' to prevent optimizations
  e4d9df4ab Merge remote-tracking branch 'origin/gpucpp' (PRs madgraph5#110 and madgraph5#112 for issues madgraph5#109 and madgraph5#111) into gpucpp_826
  0b8678984 Merge pull request madgraph5#112 from valassi/valassi_uninitialised111
  18696c1cf Merge pull request madgraph5#110 from valassi/valassi_leak109
  4f8fbb7f3 (ghav/valassi_uninitialised111) Workaround for issue madgraph5#111 reported by valgrind (initialise goodjet array in function setclscales in reweight.f)
  f6d90fa58 (ghav/valassi_leak109, valassi_leak109) Fix memory leak madgraph5#109 in madevent_driver.f (close file dname.mg)
  10378b3c0 allow for second exporter to have access to all variable used in the fortran exporter
  f9f957918 (valgrind) Fix validity time check for UFO pickle (madgraph5#97)
  619f5db45 avoid that some parameter switch type when loading model

3) Note that color includes the following submodule updates, passing through 09c96dd17 to ba54a4153

git show --oneline upstream/master..color ../../MG5aMC/
  4b29496 [color] update MG5AMC to ba54a4153 in th egpuccp branch, with a minor fix in a comment for my icolamp patch
  Submodule MG5aMC/mg5amcnlo 99e064157..ba54a4153:
    > minor fix in a printout in my previous patch in export_cpp.py
  1c2a02d [color] update MG5AMC to 99e064157, fixing bug madgraph5#856 (and related ones) about the icolamp array in coloramps.h
  Submodule MG5aMC/mg5amcnlo 09c96dd17..99e064157:
    > In export_cpp.py fix bug madgraph5#114 in get_icolamp_lines, resulting in different icolamp arrays for F77 and CPP (see madgraph5#873)
  0a60262 [color] update MG5AMC to 09c96dd17: this is the latest gpucpp branch, now including Olivier's extra commit previously in gpucpp_826
  Submodule MG5aMC/mg5amcnlo 10378b3c0...09c96dd17:
    > allow for second exporter to have access to all variable used in the fortran exporter
    > Merge pull request madgraph5#113 from valassi/valassi_volatile
    > Merge pull request madgraph5#112 from valassi/valassi_uninitialised111
    > Merge pull request madgraph5#110 from valassi/valassi_leak109
    < allow for second exporter to have access to all variable used in the fortran exporter
  16ff942 try to fix the segfault on issue 826
  Submodule MG5aMC/mg5amcnlo f9f957918..10378b3c0:
    > allow for second exporter to have access to all variable used in the fortran exporter
  4b12e79 [color] temporarely downgrade back MG5AMC to the common base of gpucpp and gpucpp_826, to allow cherry-picking Olivier's fix_826 changes >
  Submodule MG5aMC/mg5amcnlo f274cab55..f9f957918 (rewind):
    < Workaround for SIGFPE crashes in function rotxxx (madgraph5#855): add 'volatile' to prevent optimizations
    < Merge pull request madgraph5#112 from valassi/valassi_uninitialised111
    < Merge pull request madgraph5#110 from valassi/valassi_leak109

=> Therefore I can simply merge origin/color into color2 and fix the MG5AMC conflict by setting it to ba54a4153 (valassi_icolamp114, before more recent changes)
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Jul 1, 2024
…eplacing madgraph5#873)

Fix conflicts:
	MG5aMC/mg5amcnlo
	epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/coloramps.h
	epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc
	epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/model_handling.py

In all four cases, simply take the code version from branch color.
In particular, fix the MG5AMC conflict by setting it to ba54a4153 (valassi_icolamp114 in mg5amcnlo/mg5amcnlo#115, before more recent changes)

Note: the content of this branch is now identical to color

git log color --oneline -n5
  93a547f (origin/color, color) [color] ** COMPLETE COLOR ** add a tmad/gitdifftmad.sh for easier diffs of tmad logs
  643466f [color] add a tput/gitdifftput.sh for easier diffs of tput logs
  46356d6 [color] rerun 30 tmad tests on itscrd90 - all as expected (failures in heft madgraph5#833, susy madgraph5#826 and also gqttq madgraph5#845 - but ggttgg madgraph5#856 is fixed)
  2194e83 [color] rerun 102 tput tests on itscrd90 - all ok (after fixing madgraph5#856 in tmad)
  b3046e1 [color] in .github/workflows/testsuite_oneprocess.sh, temporarely reenable bypasses for know issues madgraph5#826 in susy and madgraph5#872 in pp_tt012j - the CI tests should pass now

git diff 93a547f
  [NO DIFF]
@valassi valassi changed the title Fixes for iconfig-channel mappings in coloramps.h Fixes for iconfig-channel mappings in coloramps.h (obsolete, will be closed) Jul 1, 2024
@valassi
Copy link
Member Author

valassi commented Jul 1, 2024

This has conflicts in branch color (because it was modified to be agains fix_826). I opened a new PR #877 with no conflicts, based on branch color2 (which can be both against fix_826 or against master)

Eventually this will be closed and #877 merged, I guess

@valassi
Copy link
Member Author

valassi commented Jul 2, 2024

This is obsolete because it has merge conflicts. It has been replaced by #877 which is functionally equivalent (and includes improvements suggested by Olivier). Closing as unmerged, at commit 93a547f - I will probably delete/change the color branch.

@valassi valassi closed this Jul 2, 2024
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Jul 3, 2024
@valassi valassi deleted the color branch July 4, 2024 07:54
@valassi
Copy link
Member Author

valassi commented Jul 4, 2024

For the record I did delete the obsolete color branch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants