Split the sigmakin kernel into smaller kernels #310

valassi · 2021-12-09T15:52:34Z

This is just a placeholder to discuss the idea of implementing smaller kernels.

Several related pointers already exist:

Stefan's WIP PR "WIP - Split kernels and more" #242
issue "cuda graphs" #12 about CUDA graphs
issue "Cuda streams" #11 about CUDA streams
issue "xxx" function interface: further separation of data access and calculations? #175 about the internal APIs

One of the main prerequisites for smaller kernels is that each ixxx/oxxx function and each ffv function must be able to take pointers to large buffers holding many events and do the indexing itself. This is discussed, for instance, in a comment on #175. At present, only the ixxx/oxxx functions can locate an event in the input array; their output, and all inputs and outputs of the ffv functions, refer in CUDA to a single event. This is the first thing that must change to allow smaller kernels.
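To make the point above concrete, here is a minimal sketch in plain C++ (not CUDA) of functions that receive whole multi-event buffers and do their own indexing. Everything in it is an assumption for illustration: the names toyXxx, toyFfv, runToyXxxKernel and runToyFfvKernel are hypothetical, the flat [event][component] layout is not necessarily the layout used in the project, fptype is taken to be double, and the arithmetic is a placeholder with no physics content.

```cpp
#include <cstddef>

using fptype = double; // assumption: double precision

constexpr int np4 = 4; // assumption: four momentum components per event (one particle only)
constexpr int nw6 = 6; // assumption: six (real) wavefunction components per event

// Hypothetical stand-in for an xxx-like function: it receives the full
// multi-event buffers and the event index, and locates both its input
// and its output for that event itself.
inline void toyXxx( const fptype* allMomenta, fptype* allWavefunctions, std::size_t ievt )
{
  const fptype* p = allMomenta + ievt * np4;
  fptype* w = allWavefunctions + ievt * nw6;
  for( int iw6 = 0; iw6 < nw6; ++iw6 )
    w[iw6] = p[iw6 % np4] * ( iw6 + 1 ); // placeholder arithmetic, not physics
}

// Hypothetical stand-in for an ffv-like function: it reads its input from
// the shared multi-event wavefunction buffer instead of a single-event array.
inline void toyFfv( const fptype* allWavefunctions, fptype* allAmps, std::size_t ievt )
{
  const fptype* w = allWavefunctions + ievt * nw6;
  fptype sum = 0;
  for( int iw6 = 0; iw6 < nw6; ++iw6 ) sum += w[iw6];
  allAmps[ievt] = sum;
}

// Two small "kernels", written here as host loops over events.
// The wavefunction buffer lives outside both and is passed between them.
inline void runToyXxxKernel( const fptype* allMomenta, fptype* allWavefunctions, std::size_t nevt )
{
  for( std::size_t ievt = 0; ievt < nevt; ++ievt )
    toyXxx( allMomenta, allWavefunctions, ievt );
}

inline void runToyFfvKernel( const fptype* allWavefunctions, fptype* allAmps, std::size_t nevt )
{
  for( std::size_t ievt = 0; ievt < nevt; ++ievt )
    toyFfv( allWavefunctions, allAmps, ievt );
}
```

In a CUDA version, each loop would presumably become a separate kernel launch, with ievt derived from the thread index. The split only works because the intermediate wavefunctions are stored in a buffer covering all events rather than in per-thread local variables.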

valassi · 2021-12-11T10:26:16Z

Note a basic prototype idea in #313: it would be enough to add, for instance, bufferAccessWavefunction( fptype* buffer, int iw6 ), where the iw6 index is passed in the same way as ip4 is now.
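A rough sketch of what such an accessor could look like, in plain C++. This is not the code in #313: the function names carry a Sketch suffix to mark them as hypothetical, the explicit ievt parameter is an assumption standing in for the event index that CUDA code would obtain from the thread index, and the flat [event][component] layout with real-valued components is assumed only to keep the example short.

```cpp
#include <cstddef>

using fptype = double; // assumption: double precision

constexpr int nw6 = 6; // assumption: six components per wavefunction

// Hypothetical accessor: returns a reference to component iw6 of the
// wavefunction of event ievt inside a buffer that holds many events.
// Only this function knows the memory layout.
inline fptype& bufferAccessWavefunctionSketch( fptype* buffer, std::size_t ievt, int iw6 )
{
  return buffer[ievt * nw6 + iw6]; // assumed layout: [event][component]
}

// Hypothetical calculation that touches memory only through the accessor,
// so a layout change would not require changing this function.
inline void scaleWavefunctionSketch( fptype* buffer, std::size_t ievt, fptype factor )
{
  for( int iw6 = 0; iw6 < nw6; ++iw6 )
    bufferAccessWavefunctionSketch( buffer, ievt, iw6 ) *= factor;
}
```

The design point is the separation of data access from calculation: the calculation never computes an offset itself, so the same code can operate on a buffer for many events.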

valassi · 2022-01-20T18:55:53Z

Note that PR #328 is the first step towards this and contains several related comments.

valassi · 2022-02-03T10:41:59Z

The color algebra optimisation has a separate issue #155. Improving the timing measurements is in #372.

valassi · 2022-04-22T11:08:39Z

One of the main issues in splitting kernels is clarifying the relative roles of MEK and CPPProcess: who holds the intermediate data buffers? Who orchestrates the order of kernels? Who is allowed to contain process-specific code? I am discussing this largely in #356, specifically in the context of the running alphas issue #373 and draft PR #434.

This was referenced Dec 9, 2021

"xxx" function interface: further separation of data access and calculations? #175

Open

New buffer access in xxx (and ffv) functions - remove ipagV from C++ api #313

Merged

valassi mentioned this issue Feb 3, 2022

Improve timer infrastructure to allow finer granularity with split kernels #372

Open

valassi mentioned this issue Feb 3, 2022

Fix timing measurements in the bridge mode of check_sa #371

Open

This was referenced Apr 22, 2022

Running of alphas (QCD coupling) #373

Closed

AlphaS against latest master, including code generation and all processes generated #434

Merged
