speed up slow build times for ./configure based projects #555
Conversation
This is the import time profile after all optimizations. As you can see, most of the time on the benchbuild side is spent on importing packages related to the database. I am not too familiar with all the code, but to me it seems that this can't really be reduced, since the database is a core feature of benchbuild.
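For reference, import-time profiles like the one discussed here can be produced with CPython's built-in `-X importtime` flag, which reports self and cumulative microseconds per imported module. A minimal sketch (the `import json` target is just a placeholder for the module you want to profile):

```python
import subprocess
import sys

# Run a child interpreter with -X importtime: CPython then writes a
# per-import timing report to stderr, one line per imported module.
result = subprocess.run(
    [sys.executable, "-X", "importtime", "-c", "import json"],
    capture_output=True,
    text=True,
    check=True,
)

# Each report line looks like:
#   import time: <self us> | <cumulative us> | <module name>
for line in result.stderr.splitlines()[-3:]:
    print(line)
```

Sorting the stderr lines by the cumulative column quickly surfaces the most expensive imports.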
Thanks for working on this!
I never thought about it for long, but I think the sqlalchemy support can be removed "fairly" easily. The only real deep connection to the rest of benchbuild should be in The rest should work as you implemented it. Earlier versions of benchbuild relied on the standard pickle module, which required all necessary imports to be present in the unpickling Python interpreter.
I had a look again and it's definitely pretty easy to disable sqlalchemy support as well, thanks for your input. My idea is to add a switch for disabling the database in the benchbuild config. I am going to clean up my code so far and then push that into this PR as well.
Everything necessary should be imported when loading (unpickling) either the project or the compiler.
Before, we always used the sqlite in-memory database to "disable" the database. Now, there is a proper configuration option for this. The advantage is that this can lead to significant performance improvements when benchbuild is invoked often, for example, in ./configure based projects. These kinds of projects invoke the wrapped compiler often to determine whether certain features are available.
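The shape of such a switch can be sketched as follows. Note that `CONFIG` and `persist_run` are hypothetical stand-ins, not benchbuild's real API; the point is that one flag gates all database work, so the many short compiler-wrapper invocations of a ./configure run never pay the import cost of the database layer:

```python
# Hypothetical names: CONFIG and persist_run are illustrative stand-ins
# for benchbuild's settings object and its run-persistence code.
CONFIG = {"db": {"enabled": False}}

def persist_run(run_info):
    """Record a run in the database, or do nothing if the db is disabled."""
    if not CONFIG["db"]["enabled"]:
        return None  # skip both the sqlalchemy import and the transaction
    import sqlalchemy  # deferred import: only paid when the db is enabled
    # ... open a session and insert run_info ...
    raise NotImplementedError

print(persist_run({"command": "cc", "exit_code": 0}))  # None
```

Deferring the `sqlalchemy` import into the guarded branch means a disabled database costs neither import time nor a transaction.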
… without module name "pylint: Command line or configuration file:1: UserWarning: Specifying exception names in the overgeneral-exceptions option without module name is deprecated and support for it will be removed in pylint 3.0. Use fully qualified name (maybe 'builtins.Exception' ?) instead."
Force pushing really wasn't the call; anyway, the damage is done now. Adding the configuration option and the respective if-guards for db/transaction code was pretty straightforward. Below is also the import time profile after all these optimizations. For the VaRA-Tool-Suite, we are now at a ~4-5x faster invocation time for the wrapped compiler.
Ignore the CI errors in the doc steps. This is the ancient package 'pheasant' being out-of-date for Python 3.10. I will replace it with sphinx in a future PR.
- use C yaml loader
- avoid unnecessary parsing in init_from_env()
Ok. There was one last thing that was looking weird in the profile, and that was the amount of time spent in benchbuild.settings to parse the config. The reason for that is that yaml.load() is slow, because yaml seems to be a complex language to parse. I was able to reduce the time necessary, though, by switching to the C-based yaml loader and by removing unnecessary loads in init_from_env(). This is the current state now. I don't see anything else weirdly taking a lot of time in the profile anymore. If you don't have any more ideas either, we can merge this from my side.
the parent class `yaml.CSafeLoader` is implemented in C
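Switching to the C loader is a one-line change in PyYAML, assuming PyYAML was built against libyaml; a `getattr` fallback keeps the code working when it was not:

```python
import yaml

# Use the libyaml-backed C loader when PyYAML was compiled against
# libyaml; otherwise fall back to the pure-Python SafeLoader.
Loader = getattr(yaml, "CSafeLoader", yaml.SafeLoader)

doc = yaml.load("plugins:\n  projects: []\n  experiments: []\n", Loader=Loader)
print(doc)  # {'plugins': {'projects': [], 'experiments': []}}
```

The C loader parses the same YAML subset as `SafeLoader`, typically several times faster.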
Codecov Report
Patch coverage:
Additional details and impacted files

@@            Coverage Diff             @@
##           master     #555      +/-   ##
==========================================
- Coverage   52.71%   50.01%   -2.70%
==========================================
  Files         120      120
  Lines        8265     8305      +40
  Branches     1038     1063      +25
==========================================
- Hits         4357     4154     -203
- Misses       3716     3960     +244
+ Partials      192      191       -1

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
I have been investigating https://github.com/se-sic/VaRA/issues/1003. Concretely, I profiled the import time of the script `benchbuild/res/wrapping/run_compiler.py.inc`. I found that, at least for the VaRA-Tool-Suite, a substantial amount of time is being spent on importing plugins in the form of other projects and experiments which are not actually being used. So the main idea is to prevent that by overriding the benchbuild configuration through environment variables. Together with some other changes on VaRA's side, I managed to achieve about a 2x performance improvement in my tests.

The other thing that might give a speed improvement in the future is to use `importlib.metadata` to query the installed benchbuild version instead of `pkg_resources`. `pkg_resources` takes almost 100 ms on my machine to import, while `importlib.metadata` is negligible. 1ae6fdb won't do anything at the moment, though, because there are at least two third-party packages that also use `pkg_resources` to set `__version__`.

In my opinion, the change to `benchbuild/res/wrapping/run_compiler.py.inc` should be completely transparent to the user, and the necessary imports are still made when unpickling the project/compiler. @vulder @boehmseb @simbuerg What do you think about this? Am I missing some detail where this might not work in certain cases?
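The `importlib.metadata` replacement mentioned above can be sketched like this; `installed_version` is an illustrative helper name, not part of benchbuild:

```python
from importlib import metadata  # stdlib since Python 3.8, cheap to import

def installed_version(dist_name: str) -> str:
    """Query an installed distribution's version without pkg_resources."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return "unknown"

# The pkg_resources equivalent, for comparison (importing pkg_resources
# alone can take on the order of 100 ms):
#   import pkg_resources
#   pkg_resources.get_distribution(dist_name).version
```

Because `importlib.metadata` reads distribution metadata lazily, the import itself stays cheap, which matters when the wrapped compiler is invoked hundreds of times by a ./configure run.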