multi-channel implementation in cudacpp codegen plugin (pick it up from 311 branch cpp/gpu standalone) #342

Closed · oliviermattelaer opened this issue Jan 21, 2022 · 9 comments · Fixed by #465

@oliviermattelaer (Member) commented Jan 21, 2022

Hi guys,

I have code that can output the multi-channel information inside the "basic" gpu class.
The code (for gg>tt~) looks like this:

VVV1P0_1(w[0],w[1],cxtype(cIPC[0],cIPC[1]),0.,0.,w[4]);
# Amplitude(s) for diagram number 1
FFV1_0(w[3],w[2],w[4],cxtype(cIPC[2],cIPC[3]),&amp[0]);
if(channel_id == 1){multi_chanel_num += conj(amp[0])*amp[0];};
 multi_chanel_denom += conj(amp[0])*amp[0];
jamp[0] += +cxtype(0,1)*amp[0];
jamp[1] += -cxtype(0,1)*amp[0];
FFV1_1(w[2],w[0],cxtype(cIPC[2],cIPC[3]),cIPD[0],cIPD[1],w[4]);
# Amplitude(s) for diagram number 2
FFV1_0(w[3],w[4],w[1],cxtype(cIPC[2],cIPC[3]),&amp[0]);
if(channel_id == 2){multi_chanel_num += conj(amp[0])*amp[0];};
 multi_chanel_denom += conj(amp[0])*amp[0];
jamp[0] += -amp[0];
FFV1_2(w[3],w[0],cxtype(cIPC[2],cIPC[3]),cIPD[0],cIPD[1],w[4]);
# Amplitude(s) for diagram number 3
FFV1_0(w[4],w[2],w[1],cxtype(cIPC[2],cIPC[3]),&amp[0]);
if(channel_id == 3){multi_chanel_num += conj(amp[0])*amp[0];};
 multi_chanel_denom += conj(amp[0])*amp[0];
jamp[1] += -amp[0];

@valassi @roiser, I was hesitating to use std::norm(amp[0]). Any preference here?
(I still have to edit the template to initialize/use these variables, but this way the impact on memory should be quite minimal.)
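
For reference, here is a minimal scalar sketch of how the two running sums fit into the MadEvent single-diagram-enhancement algorithm (plain double accumulators for one event; the cudacpp code uses the vectorized cxtype_sv/fptype_sv types instead, and the final reweighting comment is my illustration of the algorithm, not code from the branch):

    #include <complex>

    using fptype = double;
    using cxtype = std::complex<fptype>;

    fptype multi_chanel_num = 0.;   // |amp|^2 of the diagram matching channel_id
    fptype multi_chanel_denom = 0.; // running sum of |amp|^2 over all diagrams

    void accumulate( const cxtype& amp, unsigned int diagram_id, unsigned int channel_id )
    {
      const fptype amp2 = ( std::conj( amp ) * amp ).real(); // |amp|^2
      if( channel_id == diagram_id ) multi_chanel_num += amp2;
      multi_chanel_denom += amp2;
    }

    // MadEvent then enhances the chosen channel by weighting the event with
    // |M|^2 * multi_chanel_num / multi_chanel_denom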

@valassi (Member) commented Jan 21, 2022

Hi @oliviermattelaer that's great news! :-)

I see how this connects to the MadEvent algorithm, nice and easy actually.

About std::norm, I would avoid that for the moment. I imagine those amp[0] are cxtype_sv, if I remember correctly. The way you write this explicitly should ensure they are properly vectorized (computed in parallel for the 4 events in one cxtype_v vector), while with std::norm I am not sure.
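
To illustrate the concern, here is a sketch with a hypothetical 4-wide SIMD complex type (the real cxtype_v differs in detail):

    struct fptype_v { double v[4]; };   // 4 events per vector
    struct cxtype_v { fptype_v r, i; }; // complex in SoA form: real and imaginary parts

    // Explicit conj(a)*a is purely element-wise arithmetic, which the compiler
    // can vectorize trivially: conj(a)*a = re^2 + im^2 per event.
    inline fptype_v norm2( const cxtype_v& a )
    {
      fptype_v out;
      for( int k = 0; k < 4; k++ )
        out.v[k] = a.r.v[k] * a.r.v[k] + a.i.v[k] * a.i.v[k];
      return out;
    }

std::norm, by contrast, is only defined for scalar std::complex<T>, so a vector type would need a dedicated overload anyway and nothing would be gained over the explicit form.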

Note that I have changed the code a bit in the meantime (and am still changing it), so for instance all those FFVs are now templated. The goal is eventually to split the kernel into smaller kernels.

Another option (which may also be an easier short-term solution, considering the issues of #341) is that you send me the code as it should look, and then I merge it both with my new templated functions and with the code generation. It sounds quite easy to do anyway; I just need the new signature of calculate_wavefunctions and that kind of detail.

Let me know, thanks again, we are progressing well!

valassi changed the title from "multi-channel implementation in cpp/cuda class" to "multi-channel implementation in cpp/cuda class (pick it up from 311 branch)" on Apr 28, 2022
@valassi (Member) commented Apr 28, 2022

I have just closed #360 about integrating with the new 311 branch.

The "picking up new features from the 311 branch" is mainly the multichannel stuff. This is discussed in this #342. I have just renamed it to mention the 311 branch:

  • Olivier has done the code in the 311 branch (in cudacpp)
  • We need to integrate that in codegen ("do a diff of madgraph/iolibs/export_cpp.py in the 270 and 311 installations, this is where to start")

valassi changed the title from "multi-channel implementation in cpp/cuda class (pick it up from 311 branch)" to "multi-channel implementation in cudacpp codegen plugin (pick it up from 311 branch cpp/gpu standalone)" on Apr 28, 2022
@valassi (Member) commented May 22, 2022

Starting to look at this.

I can generate the new code with my generate script in "--madgpu" mode (vector + me exporter gpu).

Not yet sure how easy or complex it will be. The problem is that my standalone_cudacpp plugin and Olivier's standalone_gpu have diverged quite a bit, so I need to port parts of the latter into the former.

As mentioned above, looking at what changed in export_cpp.py is a good idea. I hope I do not need to go further back in time than rev 983 in 311 (the merge with 270). Anyway, I will try to work only on branch 311. Some interesting changes:

bzr diff madgraph/iolibs/export_cpp.py -r 983..1009 --using tkdiff

There are also a few changes in templates I imagine...

@valassi (Member) commented May 22, 2022

Some changes are in

madgraph/iolibs/template_files/gpu/process_function_definitions.inc:%(sigmaKin_lines)s

I do have that % in my template .inc, but the final result is different in CPPProcess.cc. I should get something like this (from ggtt.gpumad), and I don't yet:

    // CUDA - using precomputed good helicities
    for (int ighel = 0; ighel < cNGoodHel[0]; ighel++ )
    {
      const int ihel = cGoodHel[ighel]; 
      calculate_wavefunctions(ihel, allmomenta, meHelSum[0], &multi_chanel_num,
          &multi_chanel_denom);
    }

I am missing the multi_chanel_num and multi_chanel_denom. Investigating.

@valassi (Member) commented May 22, 2022

OK, here comes something to do:

    def get_all_sigmaKin_lines(self, color_amplitudes, class_name):
        """Get sigmaKin_process for all subprocesses for Pythia 8 .cc file"""

        ret_lines = []
        if self.single_helicities:
            
            template = "__device__ void calculate_wavefunctions(int ihel, const fptype* allmomenta,fptype &meHelSum \n#ifndef __CUDACC__\n                                , const int ievt\n#endif\n %(multi_channel)s                               )\n{"
            
            if self.include_multi_channel:
                info = {'multi_channel': self.multichannel_var}  
            else:
                info = {'multi_channel': ''}

Above is Olivier's code. I checked that the plugin has self.include_multi_channel set correctly (for ggtt.mad, NOT for gg_tt standalone! Keep this in mind: we can only have multichannel in the .mad code generator... where we do have a standalone executable anyway).

What I need to do is modify get_all_sigmaKin_lines. This is a method that I rewrite from scratch in the plugin; I cannot reuse parts of the base class.
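
For reference, after substituting %(multi_channel)s the generated signature should end up looking roughly like this (the exact text of self.multichannel_var is not shown above, so the pointer parameters are inferred from the &multi_chanel_num/&multi_chanel_denom call site):

    __device__ void calculate_wavefunctions(int ihel, const fptype* allmomenta, fptype &meHelSum
    #ifndef __CUDACC__
                                    , const int ievt
    #endif
                                    , fptype* multi_chanel_num, fptype* multi_chanel_denom)
    {
      // ... helas calls, amp2 accumulation and jamp sums as in the snippet above ...
    }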

@valassi (Member) commented May 22, 2022

And this is the template that I must also change in parallel:

bzr diff madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc -r 983..1009
=== modified file 'madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc'
--- madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc    2020-11-23 16:03:28 +0000
+++ madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc    2022-03-09 10:16:59 +0000
@@ -24,13 +24,14 @@
 
       // Reset the "matrix elements" - running sums of |M|^2 over helicities for the given event
       fptype meHelSum[nprocesses] = { 0 }; // all zeros
+      %(madE_var_reset)s
 
 #ifdef __CUDACC__
       // CUDA - using precomputed good helicities
       for ( int ighel = 0; ighel < cNGoodHel[0]; ighel++ )
       {
         const int ihel = cGoodHel[ighel];
-        calculate_wavefunctions( ihel, allmomenta, meHelSum[0] );
+        calculate_wavefunctions( ihel, allmomenta, meHelSum[0] %(madE_caclwfcts_call)s);
       }
 #else
       // C++ - compute good helicities within this loop
@@ -39,7 +40,7 @@
       {
         if ( sigmakin_itry>maxtry && !sigmakin_goodhel[ihel] ) continue;
         // NB: calculate_wavefunctions ADDS |M|^2 for a given ihel to the running sum of |M|^2 over helicities for the given event
-        calculate_wavefunctions( ihel, allmomenta, meHelSum[0], ievt );
+        calculate_wavefunctions( ihel, allmomenta, meHelSum[0], ievt %(madE_caclwfcts_call)s);
         if ( sigmakin_itry<=maxtry )
         {
           if ( !sigmakin_goodhel[ihel] && meHelSum[0]>meHelSumLast ) sigmakin_goodhel[ihel] = true;
@@ -58,7 +59,9 @@
       // Set the final average |M|^2 for this event in the output array for all events
       for (int iproc = 0; iproc < nprocesses; ++iproc){
         allMEs[iproc*nprocesses + ievt] = meHelSum[iproc];
-      }
+      %(madE_update_answer)s
+     }
+
 
 #ifndef __CUDACC__
       if ( sigmakin_itry <= maxtry )

Essentially todo:

  • add the three madE_ variables to the template
  • (they should be defined in get_sigmaKin_lines, which I do NOT modify in the plugin)
  • make some modifications in get_all_sigmaKin_lines, however (see the sketch below for what the substituted template should roughly expand to)
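
As a sanity check, this is roughly what the CUDA branch of the template should expand to once the call-site variables are filled in (madE_var_reset and madE_caclwfcts_call are inferred from the ggtt.gpumad snippet above; the content of madE_update_answer is not shown in this thread, so it is omitted):

      // Reset the "matrix elements" and the multichannel sums for the given event
      fptype meHelSum[nprocesses] = { 0 }; // all zeros
      fptype multi_chanel_num = 0.;        // from %(madE_var_reset)s, presumably
      fptype multi_chanel_denom = 0.;

    #ifdef __CUDACC__
      // CUDA - using precomputed good helicities
      for ( int ighel = 0; ighel < cNGoodHel[0]; ighel++ )
      {
        const int ihel = cGoodHel[ighel];
        calculate_wavefunctions( ihel, allmomenta, meHelSum[0], &multi_chanel_num, &multi_chanel_denom );
      }
    #endif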

@valassi (Member) commented May 22, 2022

In parallel, this is the next thing I need to change:

bzr diff -r 983..1009 madgraph/iolibs/helas_call_writers.py
=== modified file 'madgraph/iolibs/helas_call_writers.py'
--- madgraph/iolibs/helas_call_writers.py       2022-01-07 14:01:30 +0000
+++ madgraph/iolibs/helas_call_writers.py       2022-03-09 10:46:59 +0000
@@ -1954,7 +1954,7 @@
 
     pass
 
-    def get_matrix_element_calls(self, matrix_element, color_amplitudes):
+    def get_matrix_element_calls(self, matrix_element, color_amplitudes, multi_channel_map=False):
         """Return a list of strings, corresponding to the Helas calls
         for the matrix element"""
 
@@ -1978,13 +1978,26 @@
         
         
         me = matrix_element.get('diagrams')
+
+
         matrix_element.reuse_outdated_wavefunctions(me)
 
+        misc.sprint(multi_channel_map)
+
         res = []
         # reset jamp:
         res.append('for(int i=0;i<%s;i++){jamp[i] = cxtype(0.,0.);}'
                    % len(color_amplitudes))
-        
+        diagrams = matrix_element.get('diagrams')
+        diag_to_config = {}
+        if multi_channel_map:
+            for config in sorted(multi_channel_map.keys()):
+                amp = [a.get('number') for a in \
+                                  sum([diagrams[idiag].get('amplitudes') for \
+                                       idiag in multi_channel_map[config]], [])]
+                diag_to_config[amp[0]] = config
+        misc.sprint(diag_to_config)
+        id_amp = 0
         for diagram in matrix_element.get('diagrams'):
              
             res.extend([ self.get_wavefunction_call(wf) for \
@@ -1992,14 +2005,18 @@
             res.append("# Amplitude(s) for diagram number %d" % 
                        diagram.get('number'))
             for amplitude in diagram.get('amplitudes'):
+                id_amp +=1
                 namp = amplitude.get('number')
                 amplitude.set('number', 1)
                 res.append(self.get_amplitude_call(amplitude))
+                # amp2
+                if id_amp in diag_to_config:
+                    res.append("if(channel_id == %i){multi_chanel_num += conj(amp[0])*amp[0];};" % diag_to_config[id_amp])
+                    res.append(" multi_chanel_denom += conj(amp[0])*amp[0];")
+                # jamp
                 for njamp, coeff in color[namp].items():
                     res.append("jamp[%s] += %samp[0];" % 
                          (njamp, export_cpp.OneProcessExporterGPU.coeff(*coeff)))
-                                                                                                        
-

@valassi (Member) commented May 22, 2022

I am progressing on WIP PR #465. But I am having a couple of technical problems, as described there.

One is internal: I think I need to allocate two arrays of nevt numerators and nevt denominators (maybe I could avoid this in the short term, but most likely it is needed for kernel splitting anyway).

The second one is about the fact that channel_id does not exist in the sigmaKin interface (and a fortiori in the bridge interfaces), so I have to add it in all those places. @oliviermattelaer, is this correct, or am I missing something that you already developed? Thanks, Andrea
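
In other words, something along these lines would be needed (signatures invented purely for illustration; the real interfaces were worked out in PR #465):

    __global__ void sigmaKin( const fptype* allmomenta,      // input: momenta for all events
                              fptype* allMEs,                // output: |M|^2 for all events
                              const unsigned int channel_id, // input: channel chosen by MadEvent
                              fptype* allNumerators,         // output: per-event multichannel numerators
                              fptype* allDenominators )      // output: per-event multichannel denominators
    {
      // each GPU thread handles one event ievt and fills allMEs[ievt],
      // allNumerators[ievt] and allDenominators[ievt]
    }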

@valassi (Member) commented Jun 8, 2022

Several issues have been identified and fixed in the meantime, as described in PR #465, which I am about to merge. A few other points remain about multichannel integration (e.g. #466), but this #342 is now largely done and can be closed.
