Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add -Denable_marchnative=true option #6029

Closed
wants to merge 4 commits into from

Conversation

rurban
Copy link
Contributor

@rurban rurban commented Apr 13, 2021

Valid for gcc-compat on Intel. Not valid on arm or other archs.

And pass CFLAGS and CXXFLAGS to cudd configure.
Don't use that on releases, just privately or on known CPU HW.
Added a doc section Maximum Performance in compilation-and-development

10% faster than -O3.
make test cbmc-CORE went from 84.02.sec with -O3 to 75.92 sec with -O2 -march=native

test regression: SIMD1
no body for function __builtin_ia32_vec_ext_v4si
main.c function main
[main.assertion.1] line 12 assertion val1 == 0x1234: FAILURE

See #6028

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@rurban rurban requested a review from a team as a code owner April 13, 2021 10:17
Copy link
Contributor

@TGWDB TGWDB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is a mixture of at least four different things.

  1. Adding the titular option related to native architecture optimisations. This mostly looks ok although I'd like to see some more details on the data/improvement on more than one run on one system.
  2. There are changes to flag passing (for CaDiCaL) which seems like a bugfix? Can this also be checked to see if this is only cmake, or also a make issue?
  3. Improvements to output during cmake build.
  4. A modification related to location and changes to version.cpp that is not clear needs to be fixed. Can this be explained/justified?

There may also be some changes to formatting that are cosmetic and unrelated?

I would like these to at least be clearly explained and separated, if not split into different PRs.

@rurban
Copy link
Contributor Author

rurban commented Apr 13, 2021

Adding the titular option related to native architecture optimisations. This mostly looks ok although I'd like to see some more details on the data/improvement on more than one run on one system.

I did several runs, compared to -O3. Before I only benchmarked private code, now I've benchmarked against cbmc-CORE.

There are changes to flag passing (for CaDiCaL) which seems like a bugfix? Can this also be checked to see if this is only cmake, or also a make issue?

We needed to pass extra compiler options to all extra compiled backends, such as cudd or CaDiCaL. cudd already has -O3, but for -march=native we'd need to add these lines. More so for -flto later. You won't get any SIMD vectorization without.

Improvements to output during cmake build.

Only for this feature.

A modification related to location and changes to version.cpp that is not clear needs to be fixed. Can this be explained/justified?

Sorry, I've split the version fix into a seperate branch. I forgot if I already submitted that as extra PR already, if not will do. There was some denial, that it is not broken.

@codecov
Copy link

codecov bot commented Apr 13, 2021

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (1068770) 79.08% compared to head (2e76918) 74.11%.
Report is 5 commits behind head on develop.

❗ Current head 2e76918 differs from pull request most recent head 0d22049. Consider uploading reports for the commit 0d22049 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6029      +/-   ##
===========================================
- Coverage    79.08%   74.11%   -4.98%     
===========================================
  Files         1696     1444     -252     
  Lines       196505   157389   -39116     
===========================================
- Hits        155408   116646   -38762     
+ Misses       41097    40743     -354     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rurban rurban force-pushed the march-native branch 4 times, most recently from 8022c13 to b567533 Compare August 27, 2021 06:57
@rurban rurban force-pushed the march-native branch 3 times, most recently from 086aa46 to cb5abcf Compare September 23, 2021 05:49
@rurban rurban force-pushed the march-native branch 2 times, most recently from a263011 to a58921d Compare April 7, 2022 11:15
@rurban rurban force-pushed the march-native branch 6 times, most recently from 1cdcb65 to a30e6d8 Compare October 4, 2023 06:04
@rurban rurban force-pushed the march-native branch 5 times, most recently from 6cb69cb to 6870815 Compare October 11, 2023 06:04
@rurban rurban force-pushed the march-native branch 5 times, most recently from 769c438 to fa2a35b Compare October 19, 2023 06:04
@rurban rurban force-pushed the march-native branch 3 times, most recently from 70b918c to f220ef0 Compare October 26, 2023 06:08
@rurban rurban force-pushed the march-native branch 2 times, most recently from 94fc848 to cb73fbc Compare November 10, 2023 08:50
rurban and others added 4 commits December 20, 2023 08:04
LTO and -O3 give a good performance improvement.
I would not recommend -O3 with bad compiler versions though, such
as gcc-9.

WIP: GLOBAL properties are not enough, add the IPO property to all
targets. Which leads to various problems still.
See the next commit to fix yyalloc name collisions.
This is for CMake only as we haven't currently got an LTO set-up for
Makefile builds.
not valid on arm or other archs. Only for gcc-compatible compilers.
30% faster on small samples than -flto
@rurban
Copy link
Contributor Author

rurban commented Dec 21, 2023

Will change lto fixes only

@rurban rurban closed this Dec 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants