Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate clang-format and other code beautifiers -> implement .clang-format #49

Closed
valassi opened this issue Nov 11, 2020 · 14 comments
Closed
Assignees

Comments

@valassi
Copy link
Member

valassi commented Nov 11, 2020

Hi @roiser @hageboeck @oliviermattelaer

as discussed with Stefan and Stephan this morning, it could be useful to set up some code beautifier for our code, and eventually if possible for the code autogenerated by MadGraph.

Stephan and Stefan suggested clang-format which seems a pretty standard solution these days. Otherwise some google tools also exist.

For reference, in the past I used a combination of uncrustify and emacs, but this was more than 5 years ago and better solutions certainly exist: https://gitlab.cern.ch/lcgcool/cool/-/blob/master/src/config/cmt/coolCppClean

We could also ask LHCb who certainly have something similar a s part of their post-commit hooks.

This could contain things like indentation, line length, removal of trailing spaces and of tabs, but also formatting choices like "(x" vs "( x" or inline-{ vs newline-{ and so on.

In my opinion, one of the main things to check is that the result of the tool is invariant under a new execution of the tool (which is sometimes not at all obvious to reach). We can then choose whether to use the tool manually, or use it to reject commits in post-commit hooks, or use it to fix commits in post-commit hooks in our madgraph4gpu.

For MadGraph itself, it would be nice to include this as a final step in the automatic code generation step if possible.

Just dumping a few ideas here for discussion. (I know Stephan is interested in looking at some options).

@lfield
Copy link
Contributor

lfield commented Nov 11, 2020

I think this is a very good idea and adding it as a post-commit hook is a good way to ensure it runs uniformly. It can be annoying if you are doing the right thing by using one but then have to clean up others code. In the past I used cpplint. I don't think it matters which one you chose, just that we choose one. The only issue I see is that the non-standard CUDA syntax may not be recognized, so that needs to be investigated. Whether you force this to work before merging is a policy decision but I don't see any reason why not.

@roiser
Copy link
Member

roiser commented Nov 12, 2020

@lfield has a good point, I guess we should check if e.g. the triple angular brackets work with the tool.

For what concerns the tools itself and also the syntax we decide on I'm quite indifferent. The only thing I would ask to consider is that the tool works with as many editors as possible that we may be using as developers and can be easily integrated. E.g. for atom I find this quite useful.

@lfield
Copy link
Contributor

lfield commented Nov 25, 2020

This seems interesting.
https://github.com/motine/cppstylelineup

@lfield
Copy link
Contributor

lfield commented Nov 25, 2020

clang-format and clang-tidy ?

@roiser roiser added the Epic Logical group of issues. If you don't use zenhub.com (automatically attached) you may ignore it :-) label Nov 26, 2020
@roiser roiser removed the Epic Logical group of issues. If you don't use zenhub.com (automatically attached) you may ignore it :-) label Dec 10, 2020
@valassi
Copy link
Member Author

valassi commented Mar 29, 2021

I am having a look at this again because I am comparing epoch2 and epoch1 (issue #139) and there are many formatting-related changes. I found that what I had done in the past is actually documented here https://twiki.cern.ch/twiki/bin/view/Persistency/CoolCodeConventions.

But indeed it seems better to look at more modern things like clang-format. I am trying to see what LHCb is doing. I think this looks like the current status, with links to presentations from 2018: https://lhcb-core-doc.web.cern.ch/lhcb-core-doc/Development.html#formatting-code-for-lhcb . This is using clang-format for C++ and yapf for python. I have not managed to use lb-format (because of various python2/python3 issues) from my LHCb environment, so I have not reverse-followed wjhat it is doing. Anyway, after source /cvmfs/lhcb.cern.ch/lib/LbLogin.sh, lb-format points to /cvmfs/lhcb.cern.ch/lib/lhcb/LBSCRIPTS/LBSCRIPTS_v9r2p6/InstallArea/scripts/lb-format. (On login for an LHCb account instead this is /cvmfs/lhcb.cern.ch/lib/var/lib/LbEnv/2003/stable/linux-64/bin/lb-format).

Independently, I browsed through the code, I imagine this could be the clang format it uses: https://gitlab.cern.ch/lhcb-core/LbDevTools/-/blob/master/LbDevTools/data/default.clang-format . Or alternatively (less likely) https://gitlab.cern.ch/lhcb-core/dev-tools/lb_dev_formatting/-/blob/master/.clang-format . These all look based on that from Gaudi: https://gitlab.cern.ch/gaudi/Gaudi/-/blob/master/.clang-format

As described in https://gitlab.cern.ch/gaudi/Gaudi/-/merge_requests/872 and https://gaudi.web.cern.ch/gaudi/releases/v32r0, Gaudi v32r0 used clang-format from the SFT cvmfs on /cvmfs/sft.cern.ch/lcg/releases/clang/8.0.0-ed577/x86_64-slc6/bin/clang-format (see https://gitlab.cern.ch/clemenci/Gaudi/-/blob/901d1f83bde95bad9b04c182b316b1d8916bb6b4/cmake/gaudi-clang-format-8)

This is the documentation from clang-format: https://clang.llvm.org/docs/ClangFormat.html

All this considered, I will try to just make a standalone clang-format script using the SFT clang8 installation oin cvmfs an dthe Gaudi format definition, and see what this gives...

@valassi
Copy link
Member Author

valassi commented Mar 30, 2021

I have merged PR #142.

It mainly contains a wrapper to find clang-format version 8 from cvmfs and a proposed style file, plus some other utilities.

It also contains earlier scripts I had developed (for the COOL project) based on uncrustify and emacs.

I suggest that clang-format should eventually become our reference tool.


A few more details:

  • I suggest using clang-format in the future to enforce coding conventions. This may even become a prerequisite for PRs through the CI. I suggest doing this from epoch3 onwards: essentially, we could format the MG generated code before committing the first version of epoch3. Eventually, this may even become included in upstream MG5aMC?

  • The version of clang-format is important. I suggest using version 8, the latest one installed on the SFT cvmfs. I have added a wrapper for that, mg-clang-format. The doc for version 8 is https://releases.llvm.org/8.0.0/tools/clang/docs/ClangFormat.html

  • The style file I committed is the best I could do so far to reflect my own preferences... It is based on the google style, with a few modifications. It is more spacious than compact (spaces in parenthesis, braces on newlines...) but allows one-liners. I set a column limit to 130, which is very large without being too much: for reference, google uses 80, Gaudi uses 120. Maybe eventually we could shrink back to 80.

  • For the moment I have not committed any files modified with clang-format, yet. We can always change the style file later if we agree on other suggestions... However I will probably start using this formatting on my eemumu work, also to ease the merging of epoch2 and epoch1 (issue Move latest eemumu developments from epoch1 to epoch2 ("merge" epoch2 into epoch1) #139, which a was bit the motivation to do this now).

  • I did quite a few tests. Mainly I checked that the result is stable. I added a few tools I used for tests. If others prefer, there are also online configurators, like https://zed0.co.uk/clang-format-configurator/, but I did not really use them.

  • It is easy to include this in emacs, I have added a slightly modified emacs file. The doc also says how to integrate it with vim and other editors.

  • I also added a few scripts for a tool I had developed 10 years ago, using uncrustify and emacs. I made a few fixes and upgrades. I would keep it in the repo, but I do not propose to use that. I just found it useful to do some tests and comparisons. One of the useful features of uncrustify is that it seems to have many more noop capabilities than clang-format. With clang-format, it is next to impossible to not modify the code one way or another and leave the existing style (see https://stackoverflow.com/a/33504513), while with uncrustify it was easy to just fix selected issues while leaving a lot of freedom.


Voila, that's all for now. What I suggest to keep as a todo:

  • please give your feedback if there are features you would very much like to change (by the way I might also suggest new changes, as I have only tested a limited part of the code)
  • we should use this in production before epoch3 and from then onwards
  • we should configure this in the CI
  • maybe this could even be ported upstream so that MG5aMC produces nicely formatted code out of the box?

@valassi valassi self-assigned this Mar 30, 2021
@valassi valassi changed the title Investigate code beautifier options Investigate clang-format and other code beautifiers Mar 30, 2021
@valassi
Copy link
Member Author

valassi commented Mar 30, 2021

The style file is https://github.com/madgraph5/madgraph4gpu/blob/master/.clang-format

Note, I did not copy all of the google style, because it explicitly derives from that. I have added a copy of the version 9 google format here, anyway: https://github.com/madgraph5/madgraph4gpu/blob/master/tools/mg-clang-format/.clang-format.google

@valassi
Copy link
Member Author

valassi commented Oct 21, 2021

While I did spend time investigating clang-format and even created a candidate style file, I was never really happy with the result. This is because clang-format does not allow a no-op mode, where you only selectively change the few features you want to change.

I have now completed the python code generator for vectorized code in issue #244 and all related PRs. One important side effect of this work is that the auto-generated code is now nicely formatted and indented. The code generator IS also a code beautifier, and is very precisely programmable, much more precisely than clang-format or uncrustify. This means that we should not need clang-format, uncrustify or other tools (issue #49).

The way this was implemented was to disable the "CPPFileWriter" in madgraph, using the non-formatted FileWriter, and then adding formatting in all python fragments that write C++ fragments. The plugin still needs some cleanup, but personally I think that this is a good solution.

Note that, before merging the final epochX3, I did check with the LHCb-style emacs M-x clean that all .h and .cc files were stable, ie formatted according to those LHCb guidelines. See 5f4c2b9 . This is only basic indentation, but does not check if you leave spaces before/after parenthesis or which style of braces (inline, next line etc) you use. This is done on a case by case basis.

I am closing this issue. The only thing that could remain to be done is to remove the .clang-format file, but for the moment I would leave it there.

@valassi valassi closed this as completed Oct 21, 2021
@valassi
Copy link
Member Author

valassi commented Feb 2, 2022

Reopening this, as @roiser has mentioned this again.

I repeat my conclusion "While I did spend time investigating clang-format and even created a candidate style file, I was never really happy with the result. This is because clang-format does not allow a no-op mode, where you only selectively change the few features you want to change."

To add some content, I especially dislike the idea of having to apply the same exact rules everywehere. Example, the AOSOA formulas are much more readable as "buffer[ipagRnparfnp4neppR + iparfnp4neppR + ip4neppR + ieppR];" then putting either a space everywhere or no space anywhere. The clang-format tool has no flexibility in this. I would avoid it.

return buffer[ipagR*nparf*np4*neppR + iparf*np4*neppR + ip4*neppR + ieppR]; // AOSOA[ipagR][iparf][ip4][ieppR]

The automatic code generation IS formatted through the python generation, but essentially it is formatted according to my own taste - which I understand is a totally personal choice and others may have other ideas. Maybe if you have specific suggestions on how to format some code differently, lets discuss it?

But I vote no for clang-format.

@roiser
Copy link
Member

roiser commented Feb 4, 2022

This thread can become very easily long, I would therefore just put my comment in here and leave it there.

I think if we want to go in the direction of code formatting we have to depend on a set of rules which need to be reproducible for all the code. In my case if I would leave code non auto-formatted I'm sure it will become incoherent in no time and my main aim of having a code formatting tool would be to have all code set out in a predictable way, which I think can only be done via a tool for which we setup a set of rules and let it do its job.

I think this somewhat conflicts with the statements above but as this should not slow us down in code development, as said, I would leave it there and stop discussing this.

@valassi
Copy link
Member Author

valassi commented Feb 4, 2022

Hi Stefan, thanks for your patience :-)

I completely agree, lets leave it aside for now, but maybe for later we can think about it, with clang format or maybe another tool.

Just to try and pinpoint some examples, take arithmetic operators for "a=b+c" (just to keep it simple, even if it is a silly example). Two approaches are possible about what you want to achieve:

  1. You want that ALL occurrences of "a=b+c" are ALWAYS formatted in the same way, for instance either no-space "a=b+c" or spaced "a = b + c". Then clang-format is your friend.
  2. You want the flexibility to have the human (and subjective) appreciation that sometimes it makes sense to write "a=b+c" and sometimes it makes sense to write "a = b + c". Then clang-format is not your friend.

I want option 2, and IIUC you want option 1. I think that my goal is to achieve "readable" code (subjective definition), and "rules" and tools are there to help, they are a means, not a goal. My goal is certainly not "respecting the rule" for the sake of respecting the rule.

Phrasing this otherwise: there are tools other than clang-format that allow you to impose only specific rules (eg indentation), while leaving other bits of the code untouched. This is a mixed world where for some things you have a strict rule (option 1) and for other things you have option 2 (up to the developer to use his/her judgement). I would like something like that (I am using something like that, already).

Anyway, lets do the coding now, and maybe rediscuss this later on...!

@roiser
Copy link
Member

roiser commented Feb 8, 2022

Thanks @valassi, also from our discussion yesterday in the Madgraph meeting I gathered that we could try to look into "some" config of clang-format and try to apply it. Also with exceptions in the source code where formatting doesn't make sense. --> Low prio !!

@valassi
Copy link
Member Author

valassi commented Feb 24, 2022

Ok I now have a full prototype of .clang-format in WIP PR #388.

I would say that the .clang-format is practically final for my tastes (maybe a few tweaks here an dthere). I need to backport all to code generation, regenerate all processes, and then also integrate clang-format itself into code generation.

For the latter point, I plan to use clang-format to CHECK that generated code is already well formatted, NOT TO FORMAT IT. It may seem an overkill, but it looks more solid to me. We do not rely on clang format in code generation, this way.

@valassi
Copy link
Member Author

valassi commented Mar 3, 2022

I have completed PR #388 and I am about to merge it.

Fron now on, epochX/cudacpp code generation includes a .clang-format and checks that the auto generated code conforms with it. Note that code generation does not apply clang-format: it produces code that is already supposed to comply.

The current .clang-format is here
https://github.com/madgraph5/madgraph4gpu/blob/09e482e30f473257fcceba8325ed16c525e0e084/epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/.clang-format

Closing thi sissue as completed.

@valassi valassi closed this as completed Mar 3, 2022
@valassi valassi changed the title Investigate clang-format and other code beautifiers Investigate clang-format and other code beautifiers -> implement .clang-format Mar 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants