multi-channel implementation in cudacpp codegen plugin (pick it up from 311 branch cpp/gpu standalone) #342

Closed · oliviermattelaer opened this issue Jan 21, 2022 · 9 comments · Fixed by #465

@oliviermattelaer (Member) commented Jan 21, 2022

Hi guys,

I have code that can output the multi-channel information inside the "basic" gpu class.
The code (for gg>tt~) looks like this:

VVV1P0_1(w[0],w[1],cxtype(cIPC[0],cIPC[1]),0.,0.,w[4]);
# Amplitude(s) for diagram number 1
FFV1_0(w[3],w[2],w[4],cxtype(cIPC[2],cIPC[3]),&amp[0]);
if(channel_id == 1){multi_chanel_num += conj(amp[0])*amp[0];};
 multi_chanel_denom += conj(amp[0])*amp[0];
jamp[0] += +cxtype(0,1)*amp[0];
jamp[1] += -cxtype(0,1)*amp[0];
FFV1_1(w[2],w[0],cxtype(cIPC[2],cIPC[3]),cIPD[0],cIPD[1],w[4]);
# Amplitude(s) for diagram number 2
FFV1_0(w[3],w[4],w[1],cxtype(cIPC[2],cIPC[3]),&amp[0]);
if(channel_id == 2){multi_chanel_num += conj(amp[0])*amp[0];};
 multi_chanel_denom += conj(amp[0])*amp[0];
jamp[0] += -amp[0];
FFV1_2(w[3],w[0],cxtype(cIPC[2],cIPC[3]),cIPD[0],cIPD[1],w[4]);
# Amplitude(s) for diagram number 3
FFV1_0(w[4],w[2],w[1],cxtype(cIPC[2],cIPC[3]),&amp[0]);
if(channel_id == 3){multi_chanel_num += conj(amp[0])*amp[0];};
 multi_chanel_denom += conj(amp[0])*amp[0];
jamp[1] += -amp[0];

@valassi @roiser, I was hesitating to use std::norm(amp[0]). Any preference here?
(I still have to edit the template to initialize/use these variables, but this way the impact on memory should be quite minimal.)
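
For reference, here is a minimal scalar sketch of how the two running sums fit into the MadEvent single-diagram-enhancement algorithm (plain double accumulators for one event; the cudacpp code uses the vectorized cxtype_sv/fptype_sv types instead, and the final reweighting comment is my illustration of the algorithm, not code from the branch):

    #include <complex>

    using fptype = double;
    using cxtype = std::complex<fptype>;

    fptype multi_chanel_num = 0.;   // |amp|^2 of the diagram matching channel_id
    fptype multi_chanel_denom = 0.; // running sum of |amp|^2 over all diagrams

    void accumulate( const cxtype& amp, unsigned int diagram_id, unsigned int channel_id )
    {
      const fptype amp2 = ( std::conj( amp ) * amp ).real(); // |amp|^2
      if( channel_id == diagram_id ) multi_chanel_num += amp2;
      multi_chanel_denom += amp2;
    }

    // MadEvent then enhances the chosen channel by weighting the event with
    // |M|^2 * multi_chanel_num / multi_chanel_denom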

@valassi (Member) commented Jan 21, 2022

Hi @oliviermattelaer that's great news! :-)

I see how this connects to the MadEvent algorithm, nice and easy actually.

About std::norm, I would avoid that for the moment. I imagine those amp[0] are cxtype_sv, if I remember correctly. The way you write this explicitly should ensure they are properly vectorized (computed in parallel for the 4 events in one cxtype_v vector), while with std::norm I am not sure.
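
To illustrate the concern, here is a sketch with a hypothetical 4-wide SIMD complex type (the real cxtype_v differs in detail):

    struct fptype_v { double v[4]; };   // 4 events per vector
    struct cxtype_v { fptype_v r, i; }; // complex in SoA form: real and imaginary parts

    // Explicit conj(a)*a is purely element-wise arithmetic, which the compiler
    // can vectorize trivially: conj(a)*a = re^2 + im^2 per event.
    inline fptype_v norm2( const cxtype_v& a )
    {
      fptype_v out;
      for( int k = 0; k < 4; k++ )
        out.v[k] = a.r.v[k] * a.r.v[k] + a.i.v[k] * a.i.v[k];
      return out;
    }

std::norm, by contrast, is only defined for scalar std::complex<T>, so a vector type would need a dedicated overload anyway and nothing would be gained over the explicit form.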

Note that I have changed the code a bit in the meantime (and am still changing it), so for instance all those FFVs are now templated. The goal is eventually to split the kernel into smaller kernels.

Another option (which may also be an easier short-term solution, considering the issues of #341) is that you send me the code as it should look, and then I merge it both with my new templated functions and with the code generation. It sounds quite easy to do anyway; I just need the new signature of calculate_wavefunctions and that kind of detail.

Let me know, thanks again, we are progressing well!

valassi changed the title from "multi-channel implementation in cpp/cuda class" to "multi-channel implementation in cpp/cuda class (pick it up from 311 branch)" on Apr 28, 2022
@valassi (Member) commented Apr 28, 2022

I have just closed #360 about integrating with the new 311 branch.

The "picking up new features from the 311 branch" is mainly the multichannel stuff. This is discussed in this #342. I have just renamed it to mention the 311 branch:

  • Olivier has done the code in the 311 branch (in cudacpp)
  • We need to integrate that in codegen ("do a diff of madgraph/iolibs/export_cpp.py in the 270 and 311 installations, this is where to start")

valassi changed the title from "multi-channel implementation in cpp/cuda class (pick it up from 311 branch)" to "multi-channel implementation in cudacpp codegen plugin (pick it up from 311 branch cpp/gpu standalone)" on Apr 28, 2022
@valassi (Member) commented May 22, 2022

Starting to look at this.

I can generate the new code with my generate script in "--madgpu" mode (vector + me exporter gpu).

Not yet sure how easy or complex it will be. The problem is that my standalone_cudacpp plugin and Olivier's standalone_gpu have diverged quite a bit, so I need to port parts of the latter into the former.

As mentioned above, looking at what changed in export_cpp.py is a good idea. I hope I do not need to go further back in time than rev 983 in 311 (the merge with 270). Anyway, I will try to work only on branch 311. Some interesting changes:

bzr diff madgraph/iolibs/export_cpp.py -r 983..1009 --using tkdiff

There are also a few changes in templates I imagine...

@valassi (Member) commented May 22, 2022

Some changes are in

madgraph/iolibs/template_files/gpu/process_function_definitions.inc:%(sigmaKin_lines)s

I do have that % in my template .inc, but the final result is different in CPPProcess.cc. I should get something like this (from ggtt.gpumad), and I don't yet:

    // CUDA - using precomputed good helicities
    for (int ighel = 0; ighel < cNGoodHel[0]; ighel++ )
    {
      const int ihel = cGoodHel[ighel]; 
      calculate_wavefunctions(ihel, allmomenta, meHelSum[0], &multi_chanel_num,
          &multi_chanel_denom);
    }

I am missing the multi_chanel_num and multi_chanel_denom. Investigating.

@valassi (Member) commented May 22, 2022

OK, here comes something to do:

    def get_all_sigmaKin_lines(self, color_amplitudes, class_name):
        """Get sigmaKin_process for all subprocesses for Pythia 8 .cc file"""

        ret_lines = []
        if self.single_helicities:
            
            template = "__device__ void calculate_wavefunctions(int ihel, const fptype* allmomenta,fptype &meHelSum \n#ifndef __CUDACC__\n                                , const int ievt\n#endif\n %(multi_channel)s                               )\n{"
            
            if self.include_multi_channel:
                info = {'multi_channel': self.multichannel_var}  
            else:
                info = {'multi_channel': ''}

Above is Olivier's code. I checked that the plugin has self.include_multi_channel set correctly (for ggtt.mad, NOT for gg_tt standalone! Keep this in mind: we can only have multichannel in the .mad code generator... where we do have a standalone executable anyway).

What I need to do is modify get_all_sigmaKin_lines. This is a method that I rewrite from scratch in the plugin; I cannot reuse parts of the base class.
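
For reference, after substituting %(multi_channel)s the generated signature should end up looking roughly like this (the exact text of self.multichannel_var is not shown above, so the pointer parameters are inferred from the &multi_chanel_num/&multi_chanel_denom call site):

    __device__ void calculate_wavefunctions(int ihel, const fptype* allmomenta, fptype &meHelSum
    #ifndef __CUDACC__
                                    , const int ievt
    #endif
                                    , fptype* multi_chanel_num, fptype* multi_chanel_denom)
    {
      // ... helas calls, amp2 accumulation and jamp sums as in the snippet above ...
    }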

@valassi (Member) commented May 22, 2022

And this is the template that I must also change in parallel:

bzr diff madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc -r 983..1009
=== modified file 'madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc'
--- madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc    2020-11-23 16:03:28 +0000
+++ madgraph/iolibs/template_files/gpu/process_sigmaKin_function.inc    2022-03-09 10:16:59 +0000
@@ -24,13 +24,14 @@
 
       // Reset the "matrix elements" - running sums of |M|^2 over helicities for the given event
       fptype meHelSum[nprocesses] = { 0 }; // all zeros
+      %(madE_var_reset)s
 
 #ifdef __CUDACC__
       // CUDA - using precomputed good helicities
       for ( int ighel = 0; ighel < cNGoodHel[0]; ighel++ )
       {
         const int ihel = cGoodHel[ighel];
-        calculate_wavefunctions( ihel, allmomenta, meHelSum[0] );
+        calculate_wavefunctions( ihel, allmomenta, meHelSum[0] %(madE_caclwfcts_call)s);
       }
 #else
       // C++ - compute good helicities within this loop
@@ -39,7 +40,7 @@
       {
         if ( sigmakin_itry>maxtry && !sigmakin_goodhel[ihel] ) continue;
         // NB: calculate_wavefunctions ADDS |M|^2 for a given ihel to the running sum of |M|^2 over helicities for the given event
-        calculate_wavefunctions( ihel, allmomenta, meHelSum[0], ievt );
+        calculate_wavefunctions( ihel, allmomenta, meHelSum[0], ievt %(madE_caclwfcts_call)s);
         if ( sigmakin_itry<=maxtry )
         {
           if ( !sigmakin_goodhel[ihel] && meHelSum[0]>meHelSumLast ) sigmakin_goodhel[ihel] = true;
@@ -58,7 +59,9 @@
       // Set the final average |M|^2 for this event in the output array for all events
       for (int iproc = 0; iproc < nprocesses; ++iproc){
         allMEs[iproc*nprocesses + ievt] = meHelSum[iproc];
-      }
+      %(madE_update_answer)s
+     }
+
 
 #ifndef __CUDACC__
       if ( sigmakin_itry <= maxtry )

Essentially todo:

  • add the three madE_ variables to the template
  • (they should be defined in get_sigmaKin_lines, which I do NOT modify in the plugin)
  • make some modifications in get_all_sigmaKin_lines, however (see the sketch below for what the substituted template should roughly expand to)
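
As a sanity check, this is roughly what the CUDA branch of the template should expand to once the call-site variables are filled in (madE_var_reset and madE_caclwfcts_call are inferred from the ggtt.gpumad snippet above; the content of madE_update_answer is not shown in this thread, so it is omitted):

      // Reset the "matrix elements" and the multichannel sums for the given event
      fptype meHelSum[nprocesses] = { 0 }; // all zeros
      fptype multi_chanel_num = 0.;        // from %(madE_var_reset)s, presumably
      fptype multi_chanel_denom = 0.;

    #ifdef __CUDACC__
      // CUDA - using precomputed good helicities
      for ( int ighel = 0; ighel < cNGoodHel[0]; ighel++ )
      {
        const int ihel = cGoodHel[ighel];
        calculate_wavefunctions( ihel, allmomenta, meHelSum[0], &multi_chanel_num, &multi_chanel_denom );
      }
    #endif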

@valassi (Member) commented May 22, 2022

In parallel, this is the next thing I need to change:

bzr diff -r 983..1009 madgraph/iolibs/helas_call_writers.py
=== modified file 'madgraph/iolibs/helas_call_writers.py'
--- madgraph/iolibs/helas_call_writers.py       2022-01-07 14:01:30 +0000
+++ madgraph/iolibs/helas_call_writers.py       2022-03-09 10:46:59 +0000
@@ -1954,7 +1954,7 @@
 
     pass
 
-    def get_matrix_element_calls(self, matrix_element, color_amplitudes):
+    def get_matrix_element_calls(self, matrix_element, color_amplitudes, multi_channel_map=False):
         """Return a list of strings, corresponding to the Helas calls
         for the matrix element"""
 
@@ -1978,13 +1978,26 @@
         
         
         me = matrix_element.get('diagrams')
+
+
         matrix_element.reuse_outdated_wavefunctions(me)
 
+        misc.sprint(multi_channel_map)
+
         res = []
         # reset jamp:
         res.append('for(int i=0;i<%s;i++){jamp[i] = cxtype(0.,0.);}'
                    % len(color_amplitudes))
-        
+        diagrams = matrix_element.get('diagrams')
+        diag_to_config = {}
+        if multi_channel_map:
+            for config in sorted(multi_channel_map.keys()):
+                amp = [a.get('number') for a in \
+                                  sum([diagrams[idiag].get('amplitudes') for \
+                                       idiag in multi_channel_map[config]], [])]
+                diag_to_config[amp[0]] = config
+        misc.sprint(diag_to_config)
+        id_amp = 0
         for diagram in matrix_element.get('diagrams'):
              
             res.extend([ self.get_wavefunction_call(wf) for \
@@ -1992,14 +2005,18 @@
             res.append("# Amplitude(s) for diagram number %d" % 
                        diagram.get('number'))
             for amplitude in diagram.get('amplitudes'):
+                id_amp +=1
                 namp = amplitude.get('number')
                 amplitude.set('number', 1)
                 res.append(self.get_amplitude_call(amplitude))
+                # amp2
+                if id_amp in diag_to_config:
+                    res.append("if(channel_id == %i){multi_chanel_num += conj(amp[0])*amp[0];};" % diag_to_config[id_amp])
+                    res.append(" multi_chanel_denom += conj(amp[0])*amp[0];")
+                # jamp
                 for njamp, coeff in color[namp].items():
                     res.append("jamp[%s] += %samp[0];" % 
                          (njamp, export_cpp.OneProcessExporterGPU.coeff(*coeff)))
-                                                                                                        
-

@valassi (Member) commented May 22, 2022

I am progressing on WIP PR #465. But I am having a couple of technical problems, as described there.

One is internal: I think I need to allocate two arrays of nevt numerators and nevt denominators (maybe I could avoid this in the short term, but most likely it is needed for kernel splitting anyway).

The second one is about the fact that channel_id does not exist in the sigmaKin interface (and a fortiori in the bridge interfaces), so I have to add it in all those places. @oliviermattelaer, is this correct, or am I missing something that you already developed? Thanks, Andrea
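
In other words, something along these lines would be needed (signatures invented purely for illustration; the real interfaces were worked out in PR #465):

    __global__ void sigmaKin( const fptype* allmomenta,      // input: momenta for all events
                              fptype* allMEs,                // output: |M|^2 for all events
                              const unsigned int channel_id, // input: channel chosen by MadEvent
                              fptype* allNumerators,         // output: per-event multichannel numerators
                              fptype* allDenominators )      // output: per-event multichannel denominators
    {
      // each GPU thread handles one event ievt and fills allMEs[ievt],
      // allNumerators[ievt] and allDenominators[ievt]
    }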

@valassi (Member) commented Jun 8, 2022

Several issues have been identified and fixed in the meantime, as described in PR #465, which I am about to merge. A few other points remain about multichannel integration (e.g. #466), but this #342 is now largely done and can be closed.
