
Explicit make targets for cuda, cppnone, cppsse4 etc (make cuda and cpp builds independent from each other) #602

Closed
valassi opened this issue Feb 27, 2023 · 4 comments · Fixed by #798


valassi commented Feb 27, 2023

This is just a reminder for myself in cudacpp.

In any normal build with a fixed AVX value, we currently build together the cuda and cpp gcheck/check.exe, gCPPProcess.o/CPPProcess.o, and a single runTest.exe.

There are many reasons why we should find a better way:

  • there is no way to build only cuda (while there is a simple way to disable cuda with CUDA_HOME=invalid); this may be interesting for instance when gcc fails for some large processes (see WIP: gg to ttgggg (2->6 process) #601)
  • my avxall tests/builds now essentially rebuild cuda five times, which is silly
  • runTest.exe is the same executable with different behavior depending on whether cuda exists or not (maybe use a runTest.sh script that calls two separate executables instead, or even three: common, cuda, cpp?)
  • eventually we will probably move to fat libraries
  • we need to repackage everything for full madevent integration of "launch" (Workplan (December 2022) towards a first alpha release #576)...

valassi commented Jun 5, 2023

I had forgotten this one. This is very much needed. Related to MR #681 (merged, decrease cpp build dependency on nvtx and curand) and #674 (separate madevent fortran/cuda/cpp builds) and #680 (avoid rebuilding cuda 5 times).

So this is very much a current issue.


valassi commented Jul 19, 2023

Note: in MR #723 fixing #725, I improved the separation of cpu and gpu namespaces (so now it is a bit safer to mix the two codes... though it is maybe a better idea not to do that anyway).

@valassi valassi changed the title Make cuda and cpp builds independent from each other Explicit make targets for cuda, cppnone, cppsse4 etc (make cuda and cpp builds independent from each other) Aug 17, 2023

valassi commented Aug 17, 2023

Hi @roiser @Jooorgen this issue #602 (together with #680) is almost spot on with the things that we just discussed. So, rather than opening a new one, I will document here our discussion. Also cc @hageboeck @zeniheisser @oliviermattelaer

The actual motivation of our discussion was adding a HIP target to the builds and increasing the control over it (in Jorgen's present implementation, HIP libraries are built only if a hip compiler exists but the cuda compiler does not). However, the strategy we discussed can/should be implemented even before HIP is introduced. (Therefore I suggest creating a MR on upstream/master that only deals with cpp and cuda... then adding a hip target will become trivial.)

OVERVIEW

The idea is that

  • make with no target should no longer do anything; it should give an error: you must specify a target
  • this would apply both in the cudacpp and the fortran makefiles

FIRST PART: cudacpp.mk

(Eventually, we might think of replacing our 'none', 'sse4' names etc by something more portable to ARM etc. For the moment, let's stick to those.)

(Also, we discussed whether to use specific x86 targets, or even the new v1/v2 of https://www.phoronix.com/news/GCC-11-x86-64-Feature-Levels, I would avoid this for now as we will not do any cross compilation of arm from x86 etc).

This means that for cudacpp.mk we should support these targets

  • make -f cudacpp.mk cuda
  • make -f cudacpp.mk cppnone
  • make -f cudacpp.mk cppsse4
  • make -f cudacpp.mk cppavx2
  • make -f cudacpp.mk cpp512y
  • make -f cudacpp.mk cpp512z

These different targets would build ONLY either cuda or a specific AVX build.

The AVX variable should disappear from the makefile, as it is absorbed into the target. (Presently we run make -f cudacpp.mk AVX=sse4 for instance, with no explicit target but with an AVX variable; this will become make -f cudacpp.mk cppsse4.)

The selection of the "best" AVX for a specific target should also disappear from the makefile (unless we think for instance of adding make -f cudacpp.mk cppbest that then invokes one of the five options).
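As a rough sketch of the dispatch (the target names come from the list above, while `buildcpp`, `buildcuda` and `AVXINTERNAL` are hypothetical internal names invented here for illustration, not the real makefile variables):

```make
# Illustrative sketch only: explicit per-backend targets in cudacpp.mk.
# The default goal fails, forcing the user to name a backend explicitly.
.DEFAULT_GOAL := usage

.PHONY: usage cuda cppnone cppsse4 cppavx2 cpp512y cpp512z
usage:
	$(error no target specified: use one of cuda cppnone cppsse4 cppavx2 cpp512y cpp512z)

# Each cpp<simd> target absorbs the old AVX=<simd> variable
# (buildcpp/buildcuda and AVXINTERNAL are hypothetical internal hooks).
cppnone cppsse4 cppavx2 cpp512y cpp512z:
	$(MAKE) buildcpp AVXINTERNAL=$(patsubst cpp%,%,$@)

cuda:
	$(MAKE) buildcuda
```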

Each build of cuda or one cpp would go to one directory. For the moment, let's keep USEBUILDDIR as optional and supported. Whether in the local directory (USEBUILDDIR=0) or in a backend-specific build dir (USEBUILDDIR=1), let's keep the lockfile preventing the mixing of builds:

  • .build.512y_d_inl0_hrd0_hasCurand however would become .build.cpp512y_d_inl0_hrd0_hasCurand or .build.cuda_d_inl0_hrd0_hasCurand (or maybe we can keep the 5 existing names the same, but we must add one for cuda!)
  • similarly, the build directories (previously build.512y_d_inl0_hrd0 included both cuda and cpp) would become build.cpp512y_d_inl0_hrd0 or build.cuda_d_inl0_hrd0

Since each directory would include either cuda or one cpp build, the runTest.exe executable would only include cuda or one cpp. That is to say, we would no longer have to worry about linking cuda and cpp in the same executable. (This seems to defeat the purpose of using separate gpu and cpu namespaces in #725, but actually I think that we should still consider the possibility of having heterogeneous applications or fat binaries in the future... so let's keep separate namespaces while also using separate build directories for cuda and cpp.)

The CI tests would also need to change. Up until now, in each job we were doing a make with no target that built both cuda and cpp, followed by a single runTest invocation where a single executable ran both in sequence. In the new CI, each job would need to run in sequence make cuda and, for instance, make cppavx2. Then two separate instances of runTest would need to be called. NB: in the CI, USEBUILDDIR=1 would be needed by default, as two separate builds are needed.

The make -f cudacpp.mk avxall option would need to be renamed, e.g. to make -f cudacpp.mk ALL (or all, as long as it is not the first, default, target: the first, default, target would need to throw an error!). Presently avxall calls 5 avx builds, ending up in 10 executables (5 avxs, plus essentially the same cuda rebuilt 5 times). The new ALL target would need to call 6 builds, i.e. five avx builds and one cuda build. Note, avxall presently uses USEBUILDDIR=1; this should also be used in the new ALL target. There are also some tricks implemented to allow parallel builds: these should be preserved, i.e. `make -f cudacpp.mk ALL -j` should still be feasible with no errors.
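One way to keep parallel builds working with the renamed target could be one phony helper per backend (a sketch; the ALL_<backend> helper names are invented here, not existing code):

```make
# Illustrative sketch: 'make -f cudacpp.mk ALL -j' runs the six
# independent per-backend sub-makes concurrently.
BACKENDS = cuda cppnone cppsse4 cppavx2 cpp512y cpp512z

.PHONY: ALL $(addprefix ALL_,$(BACKENDS))
ALL: $(addprefix ALL_,$(BACKENDS))

# Static pattern rule: ALL_cuda runs 'make cuda', ALL_cppnone runs
# 'make cppnone', and so on; each backend goes to its own build dir.
$(addprefix ALL_,$(BACKENDS)): ALL_%:
	$(MAKE) -f cudacpp.mk USEBUILDDIR=1 $*
```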

SECOND PART: fortran makefile

The way the fortran makefile calls cudacpp.mk can certainly be improved... but for the moment let's focus on just implementing the same stuff as above.

We now have (in different build directories) three targets: $(PROG)_fortran, $(PROG)_cuda and $(PROG)_cpp. I would simply split the last one into 5 variants for the different avxs. Note that this will mean the local directory plus 6 different build directories.

We should also replace the make -f makefile avxall by, for instance, make -f cudacpp.mk ALL with the same meaning as above. This would build all backends: I routinely need this for tests.

We now have three link targets: madevent_fortran_link, madevent_cuda_link and madevent_cpp_link. These only create a link to a SINGLE madevent file (all mg5amc scripts use this name), depending on the target. We must again split the cpp one into five versions, and keep the logic of doing a symlink.
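For example (a sketch following the existing madevent_cpp_link logic; the exact variable and directory names, such as CPPNONE_BUILDDIR, are assumptions):

```make
# Illustrative sketch: one link target per cpp backend, each pointing the
# single 'madevent' name expected by the mg5amc scripts at one specific build.
madevent_cppnone_link: $(CPPNONE_BUILDDIR)/$(PROG)_cppnone
	rm -f $(PROG)
	ln -s $(CPPNONE_BUILDDIR)/$(PROG)_cppnone $(PROG)

# ...and analogously madevent_cppsse4_link, madevent_cppavx2_link,
# madevent_cpp512y_link and madevent_cpp512z_link.
```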

In the runcard, we also need to add the same mechanism, and split the cpp backend into 5 different ones. (@oliviermattelaer, if you prefer, the runcard could have a simdnone, simdsse4 etc naming, as this is the only bit that is user-facing... but in the makefile I would stick to cpp for simplicity.)

The fortran makefile presently calls

$(CUDACPP_BUILDDIR)/.cudacpplibs:
	$(MAKE) -f $(CUDACPP_MAKEFILE)

Maybe this .cudacpplibs can disappear if we just internally use the various $(PROG)_cppnone etc? Otherwise, we need six different sections like

$(CUDA_BUILDDIR)/.cudalibs:
	$(MAKE) -f $(CUDACPP_MAKEFILE) cuda

$(CPPNONE_BUILDDIR)/.cppnonelibs:
	$(MAKE) -f $(CUDACPP_MAKEFILE) cppnone

Note that the build directories will have different names (so these names must all be discovered, as presently done for the single build dir).
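Alternatively, instead of six hand-written sections, the stanzas could be generated from a template (a sketch; the per-backend BUILDDIR_<backend> variables are an assumption and would have to be set by the discovery logic mentioned above):

```make
# Illustrative sketch: generate one .<backend>libs stanza per backend.
BACKENDS = cuda cppnone cppsse4 cppavx2 cpp512y cpp512z

# $(1) is the backend name; each generated stanza delegates to the
# matching explicit target in cudacpp.mk.
define LIBS_template
$$(BUILDDIR_$(1))/.$(1)libs:
	$$(MAKE) -f $$(CUDACPP_MAKEFILE) $(1)
endef

$(foreach b,$(BACKENDS),$(eval $(call LIBS_template,$(b))))
```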

Final point: note that $(PROG)_cppnone, the other 4 avx variants and the cuda one will ALSO build fortran anyway? Or maybe this can be avoided; to be seen.

CONCLUSIONS

@Jooorgen I assign this to you... but let me know if you need help!

Anyone else, comments welcome!

PS Renaming as I had missed one o in Jorgen's username...


valassi commented Mar 26, 2024

This is the main issue that is meant to be fixed in PR #798 (based on an earlier PR from Joorgen). It can be closed when that is merged. Thanks again @Jooorgen !
