
Explicit make targets for cuda, cppnone, cppsse4 etc (make cuda and cpp builds independent from each other) #602

Closed
valassi opened this issue Feb 27, 2023 · 4 comments · Fixed by #798


valassi commented Feb 27, 2023

This is just a reminder for myself in cudacpp.

In any normal build with a fixed AVX value, we currently build together the cuda and cpp gcheck/check.exe, gCPPProcess.o/CPPProcess.o, and a single runTest.exe.

There are many reasons why we should find a better way:

  • there is no way to build only cuda (while there is a simple way to disable cuda with CUDA_HOME=invalid); this may be interesting for instance when gcc fails for some large processes (see WIP: gg to ttgggg (2->6 process) #601)
  • my avxall tests/builds now essentially rebuild cuda five times, which is silly
  • runTest.exe is the same executable with different behavior depending on whether cuda exists or not (maybe use a runTest.sh script that calls two separate executables instead, or even three: common, cuda, cpp?)
  • eventually we will probably move to fat libraries
  • we need to repackage everything for full madevent integration of "launch" (Workplan (December 2022) towards a first alpha release #576)...

valassi commented Jun 5, 2023

I had forgotten this one. This is very much needed. Related to MR #681 (merged, decrease cpp build dependency on nvtx and curand) and #674 (separate madevent fortran/cuda/cpp builds) and #680 (avoid rebuilding cuda 5 times).

So this is very much a current issue.


valassi commented Jul 19, 2023

Note: in MR #723 fixing #725, I improved the separation of cpu and gpu namespaces (so now it is a bit safer to mix the two codes... though it is maybe a better idea not to do that anyway).

@valassi valassi changed the title Make cuda and cpp builds independent from each other Explicit make targets for cuda, cppnone, cppsse4 etc (make cuda and cpp builds independent from each other) Aug 17, 2023

valassi commented Aug 17, 2023

Hi @roiser @Jooorgen this issue #602 (together with #680) is almost spot on with the things that we just discussed. So, rather than opening a new one, I will document here our discussion. Also cc @hageboeck @zeniheisser @oliviermattelaer

The actual motivation of our discussion was adding a HIP target to the builds and increasing the control over it (in Jorgen's present implementation, HIP libraries are built only if a hip compiler exists but the cuda compiler does not). However, the strategy we discussed can/should be implemented even before HIP is introduced. (Therefore I suggest creating a MR on upstream/master that only deals with cpp and cuda... then adding a hip target will become trivial.)

OVERVIEW

The idea is that

  • make with no target should no longer do anything; it should give an error: you must specify a target
  • this would apply both in the cudacpp and the fortran makefiles

FIRST PART: cudacpp.mk

(Eventually, we might think of replacing our 'none', 'sse4' names etc by something more portable to ARM etc. For the moment, let's stick to those.)

(Also, we discussed whether to use specific x86 targets, or even the new v1/v2 of https://www.phoronix.com/news/GCC-11-x86-64-Feature-Levels, I would avoid this for now as we will not do any cross compilation of arm from x86 etc).

This means that for cudacpp.mk we should support these targets

  • make -f cudacpp.mk cuda
  • make -f cudacpp.mk cppnone
  • make -f cudacpp.mk cppsse4
  • make -f cudacpp.mk cppavx2
  • make -f cudacpp.mk cpp512y
  • make -f cudacpp.mk cpp512z

These different targets would build ONLY either cuda or a specific AVX build.

The AVX variable should disappear from the makefile, as it is absorbed into the target. (Presently we run make -f cudacpp.mk AVX=sse4 for instance, with no explicit target but with an AVX variable; this will become make -f cudacpp.mk cppsse4.)

The selection of the "best" AVX for a specific target should also disappear from the makefile (unless we think for instance of adding make -f cudacpp.mk cppbest that then invokes one of the five options).
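As a rough sketch of the dispatch (the target names come from the list above, while `buildcpp`, `buildcuda` and `AVXINTERNAL` are hypothetical internal names invented here for illustration, not the real makefile variables):

```make
# Illustrative sketch only: explicit per-backend targets in cudacpp.mk.
# The default goal fails, forcing the user to name a backend explicitly.
.DEFAULT_GOAL := usage

.PHONY: usage cuda cppnone cppsse4 cppavx2 cpp512y cpp512z
usage:
	$(error no target specified: use one of cuda cppnone cppsse4 cppavx2 cpp512y cpp512z)

# Each cpp<simd> target absorbs the old AVX=<simd> variable
# (buildcpp/buildcuda and AVXINTERNAL are hypothetical internal hooks).
cppnone cppsse4 cppavx2 cpp512y cpp512z:
	$(MAKE) buildcpp AVXINTERNAL=$(patsubst cpp%,%,$@)

cuda:
	$(MAKE) buildcuda
```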

Each build of cuda or one cpp would go to one directory. For the moment, let's keep USEBUILDDIR as optional and supported. Whether in the local directory (USEBUILDDIR=0) or in a backend-specific build dir (USEBUILDDIR=1), let's keep the lockfile preventing the mixing of builds:

  • .build.512y_d_inl0_hrd0_hasCurand however would become .build.cpp512y_d_inl0_hrd0_hasCurand or .build.cuda_d_inl0_hrd0_hasCurand (or maybe we can keep the 5 existing names the same, but we must add one for cuda!)
  • similarly, the build directories (previously build.512y_d_inl0_hrd0 included both cuda and cpp) would become build.cpp512y_d_inl0_hrd0 or build.cuda_d_inl0_hrd0

Since each directory would include either cuda or one cpp build, the runTest.exe executable would only include cuda or one cpp. That is to say, we would no longer have to worry about linking cuda and cpp in the same executable. (This seems to defeat the purpose of using separate gpu and cpu namespaces in #725, but actually I think that we should still consider the possibility of having heterogeneous applications or fat binaries in the future... so let's keep separate namespaces while also using separate build directories for cuda and cpp.)

The CI tests would also need to change. Up until now, in each job we were doing a make with no target that built both cuda and cpp, followed by a single runTest invocation where a single executable ran both in sequence. In the new CI, each job would need to run in sequence make cuda and, for instance, make cppavx2. Then two separate instances of runTest would need to be called. NB: in the CI, USEBUILDDIR=1 would be needed by default, as two separate builds are needed.

The make -f cudacpp.mk avxall option would need to be renamed, e.g. to make -f cudacpp.mk ALL (or all, as long as it is not the first, default, target: the first, default, target would need to throw an error!). Presently avxall calls 5 avx builds, ending up in 10 executables (5 avxs, plus essentially the same cuda rebuilt 5 times). The new ALL target would need to call 6 builds, i.e. five avx builds and one cuda build. Note, avxall presently uses USEBUILDDIR=1; this should also be used in the new ALL target. There are also some tricks implemented to allow parallel builds: these should be preserved, i.e. `make -f cudacpp.mk ALL -j` should still be feasible with no errors.
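One way to keep parallel builds working with the renamed target could be one phony helper per backend (a sketch; the ALL_<backend> helper names are invented here, not existing code):

```make
# Illustrative sketch: 'make -f cudacpp.mk ALL -j' runs the six
# independent per-backend sub-makes concurrently.
BACKENDS = cuda cppnone cppsse4 cppavx2 cpp512y cpp512z

.PHONY: ALL $(addprefix ALL_,$(BACKENDS))
ALL: $(addprefix ALL_,$(BACKENDS))

# Static pattern rule: ALL_cuda runs 'make cuda', ALL_cppnone runs
# 'make cppnone', and so on; each backend goes to its own build dir.
$(addprefix ALL_,$(BACKENDS)): ALL_%:
	$(MAKE) -f cudacpp.mk USEBUILDDIR=1 $*
```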

SECOND PART: fortran makefile

The way the fortran makefile calls cudacpp.mk can certainly be improved... but for the moment let's focus on just implementing the same stuff as above.

We now have (in different build directories) three targets: $(PROG)_fortran, $(PROG)_cuda and $(PROG)_cpp. I would simply split the last one into 5 variants for the different avxs. Note that this will mean the local directory plus 6 different build directories.

We should also replace the make -f makefile avxall by, for instance, make -f cudacpp.mk ALL with the same meaning as above. This would build all backends: I routinely need this for tests.

We now have three link targets: madevent_fortran_link, madevent_cuda_link and madevent_cpp_link. These only create a link to a SINGLE madevent file (all mg5amc scripts use this name), depending on the target. We must again split the cpp one into five versions, and keep the logic of doing a symlink.
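For example (a sketch following the existing madevent_cpp_link logic; the exact variable and directory names, such as CPPNONE_BUILDDIR, are assumptions):

```make
# Illustrative sketch: one link target per cpp backend, each pointing the
# single 'madevent' name expected by the mg5amc scripts at one specific build.
madevent_cppnone_link: $(CPPNONE_BUILDDIR)/$(PROG)_cppnone
	rm -f $(PROG)
	ln -s $(CPPNONE_BUILDDIR)/$(PROG)_cppnone $(PROG)

# ...and analogously madevent_cppsse4_link, madevent_cppavx2_link,
# madevent_cpp512y_link and madevent_cpp512z_link.
```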

In the runcard, we also need to add the same mechanism, and split the cpp backend into 5 different ones. (@oliviermattelaer, if you prefer, the runcard could have a simdnone, simdsse4 etc naming, as this is the only bit that is user-facing... but in the makefile I would stick to cpp for simplicity.)

The fortran makefile presently calls

$(CUDACPP_BUILDDIR)/.cudacpplibs:
	$(MAKE) -f $(CUDACPP_MAKEFILE)

Maybe this .cudacpplibs can disappear if we just internally use the various $(PROG)_cppnone etc? Otherwise, we need six different sections like

$(CUDA_BUILDDIR)/.cudalibs:
	$(MAKE) -f $(CUDACPP_MAKEFILE) cuda

$(CPPNONE_BUILDDIR)/.cppnonelibs:
	$(MAKE) -f $(CUDACPP_MAKEFILE) cppnone

Note that the build directories will have different names (so these names must all be discovered, as presently done for the single build dir).
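Alternatively, instead of six hand-written sections, the stanzas could be generated from a template (a sketch; the per-backend BUILDDIR_<backend> variables are an assumption and would have to be set by the discovery logic mentioned above):

```make
# Illustrative sketch: generate one .<backend>libs stanza per backend.
BACKENDS = cuda cppnone cppsse4 cppavx2 cpp512y cpp512z

# $(1) is the backend name; each generated stanza delegates to the
# matching explicit target in cudacpp.mk.
define LIBS_template
$$(BUILDDIR_$(1))/.$(1)libs:
	$$(MAKE) -f $$(CUDACPP_MAKEFILE) $(1)
endef

$(foreach b,$(BACKENDS),$(eval $(call LIBS_template,$(b))))
```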

Final point: note that $(PROG)_cppnone, the other 4 avx variants and the cuda one will ALSO build fortran anyway? Or maybe this can be avoided; to be seen.

CONCLUSIONS

@Jooorgen I assign this to you... but let me know if you need help!

Anyone else, comments welcome!

PS Renaming as I had missed one o in Jorgen's username...


valassi commented Mar 26, 2024

This is the main issue that is meant to be fixed in PR #798 (based on an earlier PR from Joorgen). It can be closed when that is merged. Thanks again @Jooorgen !
