[enhancement] Improve performance of module backend when detecting module conflicts #3263

Merged: 3 commits, Sep 23, 2024


brandongc
Contributor

Running a typical subset of NERSC's test suite (~500 tests) passes only ~30 unique values to this routine, and the result of the call for a given input is static. Calls to the module command can take a surprisingly long time to return.

This change reduces the time for

reframe -C ./nersc-config.py -c checks --dry-run --mode checkout

from ~735s to <600s.

@brandongc
Contributor Author

It looks like some versions of Python do not accept the decorator syntax.

Also, technically this introduces a potential memory leak, as the cache is global and keyed on all of the arguments to the function, including self. It doesn't look like this class gets instantiated more than once in normal use, but it could be an issue.
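
To illustrate the concern, here is a minimal sketch. The `Backend` class is a hypothetical stand-in, not ReFrame's `LModImpl`, and the method body is a placeholder for the real `module show` call. It suggests that a class-level `lru_cache` keeps an instance alive after the last external reference is dropped, until the cache is cleared:

```python
import functools
import gc
import weakref


class Backend:  # hypothetical stand-in, not ReFrame code
    @functools.lru_cache(maxsize=None)
    def conflicted_modules(self, module):
        # Placeholder for an expensive `module show` invocation
        return (f'conflicts-of-{module}',)


backend = Backend()
backend.conflicted_modules('gcc')
ref = weakref.ref(backend)

# Drop the only external reference to the instance
del backend
gc.collect()

# The cache key contains `self`, so the instance is still reachable
still_alive = ref() is not None

# Clearing the shared cache releases the instance
Backend.conflicted_modules.cache_clear()
gc.collect()
freed_after_clear = ref() is None

print(still_alive, freed_after_clear)
```

With a single long-lived backend object this is harmless in practice, which matches the observation above.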

A less concise but straightforward approach to achieve the same effect:

 import abc
 import os
 import re
-import functools
 from collections import OrderedDict

 import reframe.core.fields as fields
@@ -947,6 +946,7 @@ class LModImpl(TMod4Impl):
                               'this Lmod installation')

         self._extra_module_paths = []
+        self._conflicted_modules_cache = {}

     def name(self):
         return 'lmod'
@@ -978,18 +978,19 @@ class LModImpl(TMod4Impl):

         return ret

-#    @functools.lru_cache(maxsize=None)
     def conflicted_modules(self, module):
-        fp = "/global/u1/c/cookbg/reframe-nersc-tests/conflict_module_args.txt"
-        with open(fp, "a") as f:
-            f.write(str(module) + "\n")
         if module.collection:
             # Conflicts have no meaning in module collection. The modules
             # system will take care of these when restoring a module
             # collection
             return []

-        output = self.execute_with_path('show', str(module), path=module.path)
+        module_name = str(module)
+        if module_name in self._conflicted_modules_cache.keys():
+            output = self._conflicted_modules_cache[module_name]
+        else:
+            output = self.execute_with_path('show', module_name, path=module.path)
+            self._conflicted_modules_cache[module_name] = output

         # Lmod accepts both Lua and and Tcl syntax
         # The following test allows incorrect syntax, e.g., `conflict
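
The pattern in the diff above can be tried in isolation with a small, self-contained sketch. Everything here is assumed for illustration: `FakeModulesBackend`, `_run_show` and the `calls` counter are hypothetical names, and the slow command is simulated rather than executed.

```python
class FakeModulesBackend:
    """Hypothetical stand-in for a modules backend; not ReFrame code."""

    def __init__(self):
        self._show_cache = {}   # module name -> cached `show` output
        self.calls = 0          # number of simulated expensive invocations

    def _run_show(self, name):
        # Placeholder for the slow `module show <name>` command
        self.calls += 1
        return f'conflict("{name}-other")'

    def show(self, name):
        # Return the cached output if present; otherwise compute and store it
        try:
            return self._show_cache[name]
        except KeyError:
            output = self._show_cache[name] = self._run_show(name)
            return output


backend = FakeModulesBackend()
for name in ['gcc', 'cray-mpich', 'gcc', 'gcc']:
    backend.show(name)

# Four lookups, but only two distinct names reach the expensive path
print(backend.calls)
```

Because the dictionary is an instance attribute, it is released together with the instance, which sidesteps the global-cache concern raised earlier.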

reframe/core/modules.py (outdated, resolved)
The variant of the decorator without arguments was added in Python 3.8.
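
As a small sketch of the difference: the call form of the decorator, with an explicit `maxsize`, is also accepted by Python versions older than 3.8, whereas the bare `@functools.lru_cache` form requires 3.8 or later. The `square` function is a hypothetical example only.

```python
import functools


@functools.lru_cache(maxsize=None)  # call form: also valid before Python 3.8
def square(x):
    return x * x


square(4)
square(4)  # second call is served from the cache
info = square.cache_info()
print(info.hits, info.misses)
```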
@vkarak changed the title from "[feat] Cache conflicted_modules results" to "[enhancement] Improve performance of module backend when detecting module conflicts" on Sep 23, 2024
@vkarak merged commit f63e040 into reframe-hpc:develop on Sep 23, 2024
22 checks passed
3 participants