multi-channel implementation in cudacpp codegen plugin (pick it up from 311 branch cpp/gpu standalone) #342
Comments
Hi @oliviermattelaer that's great news! :-) I see how this connects to the MadEvent algorithm; nice and easy actually. About std::norm, I would avoid that for the moment. I imagine those amp[0] are cxtype_sv, if I remember well. The way you write this explicitly should ensure they are properly vectorized (computed in parallel for the 4 events in one cxtype_v vector), while with std::norm I am not sure. Note that I changed the code a bit in the meantime (and am still changing it), so for instance now all those FFVs are templated. The goal is to eventually split the kernel into smaller kernels. Another option (which may also be an easier short-term solution, considering the issues of #341) is that you send me the code showing how it should look, and then I merge it both with my new templated functions and with the code generation. It sounds quite easy to do anyway; I just need the new signature of calculate_wavefunctions and those kinds of details. Let me know, thanks again, we are progressing well!
I have just closed #360 about integrating with the new 311 branch. The "picking up new features from the 311 branch" is mainly the multichannel stuff, which is discussed in this issue, #342. I have just renamed it to mention the 311 branch:
Starting to look at this. I can generate the new code with my generate script in "--madgpu" mode (vector + me exporter gpu). Not yet sure how easy or complex it will be. The problem is that my standalone_cudacpp plugin and Olivier's standalone_gpu have diverged quite a bit, so I need to port parts of the latter into the former. As mentioned above, looking at what changed in export_cpp.py is a good idea. I hope I do not need to go back further in time than rev 983 in 311 (the merge with 270). Anyway, I will try to work only on branch 311. Some interesting changes
There are also a few changes in the templates, I imagine...
Some changes are in
I do have that % in my template .inc, but the final result is different in CPPProcess.cc. I should get something like this (from ggtt.gpumad), and I don't yet:
I am missing the multi_chanel_num and multi_chanel_denom. Investigating.
OK, here comes something to do:
Above is Olivier's code. I checked that the plugin has self.include_multi_channel set correctly (for ggtt.mad, NOT for gg_tt standalone! Keep this in mind: we can only have multichannel in the .mad code generator... where we do have a standalone executable anyway). What I need to do is modify get_all_sigmaKin_lines. This is a method that I rewrite from scratch in the plugin; I cannot reuse parts of the base class.
And this is the template that I must also change in parallel:
Essentially todo:
In parallel, this is the next thing I need to change:
I am progressing on WIP PR #465, but I am having a couple of technical problems, as described there. One is internal: I think I need to allocate two arrays, for nevt numerators and nevt denominators (maybe I could avoid this in the short term, but most likely it is needed for splitting kernels anyway). The second is that channel_id does not exist in the sigmaKin interface (and a fortiori in the bridge interfaces), so I have to add it in all those places. @oliviermattelaer, is this correct, or am I missing something that you already developed? Thanks, Andrea
Hi guys,
I have a code that can output the multi-channel information inside the "basic" gpu class.
The code (for gg>tt~) looks like this:
@valassi @roiser, I was hesitating to use std::norm(amp[0]). Any preference here?
(I still have to edit the template to initialize/use such a variable, but this way the impact on memory should be quite minimal.)