Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix match function call slowness at scale #62295

Merged
merged 5 commits into from
Sep 22, 2022

Conversation

nicholasmhughes
Copy link
Collaborator

What does this PR do?

This PR changes match functions from loader access to direct imports of the matchers in order to yield much faster results.

What issues does this PR fix or reference?

Fixes: #62283

Previous Behavior

See issue for details.

New Behavior

See issue for details.

Merge requirements satisfied?

[NOTICE] Bug fixes or features added to Salt require tests.

Commits signed with GPG?

Yes

Please review Salt's Contributing Guide for best practices.

See GitHub's page on GPG signing for more information about signing commits with GPG.

@nicholasmhughes nicholasmhughes requested a review from a team as a code owner July 6, 2022 22:39
@nicholasmhughes nicholasmhughes requested review from whytewolf and removed request for a team July 6, 2022 22:39
whytewolf
whytewolf previously approved these changes Jul 6, 2022
@nicholasmhughes
Copy link
Collaborator Author

Question for the class...

Why would log.debug() be a (somewhat relatively) expensive operation when not in DEBUG and using string interpolation with standard types?

Removing some debug lines in this branch further reduces run time from:

$ python match_test.py -n 100000
6.784708013990894

to:

$ python match_test.py -n 100000
1.6946497799362987
Diff for reference
diff --git a/salt/matchers/compound_match.py b/salt/matchers/compound_match.py
index f2be8a5b57..a4068932d3 100644
--- a/salt/matchers/compound_match.py
+++ b/salt/matchers/compound_match.py
@@ -43,7 +43,7 @@ def match(tgt, opts=None, minion_id=None):
     if not isinstance(tgt, str) and not isinstance(tgt, (list, tuple)):
         log.error("Compound target received that is neither string, list nor tuple")
         return False
-    log.debug("compound_match: %s ? %s", minion_id, tgt)
+    #log.debug("compound_match: %s ? %s", minion_id, tgt)
     ref = {
         "G": grain_match.match,
         "P": grain_pcre_match.match,
@@ -119,7 +119,7 @@ def match(tgt, opts=None, minion_id=None):
             results.append(str(glob_match.match(word, opts, minion_id)))
 
     results = " ".join(results)
-    log.debug('compound_match %s ? "%s" => "%s"', minion_id, tgt, results)
+    #log.debug('compound_match %s ? "%s" => "%s"', minion_id, tgt, results)
     try:
         return eval(results)  # pylint: disable=W0123
     except Exception:  # pylint: disable=broad-except
diff --git a/salt/matchers/grain_match.py b/salt/matchers/grain_match.py
index 0dd541b3e2..f71334183d 100644
--- a/salt/matchers/grain_match.py
+++ b/salt/matchers/grain_match.py
@@ -17,7 +17,7 @@ def match(tgt, delimiter=DEFAULT_TARGET_DELIM, opts=None, minion_id=None):
     if not opts:
         opts = __opts__
 
-    log.debug("grains target: %s", tgt)
+    #log.debug("grains target: %s", tgt)
     if delimiter not in tgt:
         log.error("Got insufficient arguments for grains match statement from master")
         return False
diff --git a/salt/utils/data.py b/salt/utils/data.py
index 0b3634d842..002199527d 100644
--- a/salt/utils/data.py
+++ b/salt/utils/data.py
@@ -972,12 +972,12 @@ def subdict_match(
         else:
             matchstr = delimiter.join(splits[idx:])
             match = traverse_dict_and_list(data, key, {}, delimiter=delimiter)
-        log.debug(
-            "Attempting to match '%s' in '%s' using delimiter '%s'",
-            matchstr,
-            key,
-            delimiter,
-        )
+        #log.debug(
+        #    "Attempting to match '%s' in '%s' using delimiter '%s'",
+        #    matchstr,
+        #    key,
+        #    delimiter,
+        #)
         if match == {}:
             continue
         if isinstance(match, dict):

@waynew
Copy link
Contributor

waynew commented Jul 11, 2022

Question for the class...

Why would log.debug() be a (somewhat relatively) expensive operation when not in DEBUG and using string interpolation with standard types?

Without digging into the code... my recollection is that in the Python logging module that it should not do string interpolation if the logging is not enabled for that level. In theory it should bail out. You could also check the time with log calls that aren't using any interpolation. But I'm pretty sure that what's happening is just that it's incurring the cost of the function calls, which is non-zero (and in fact, one of the most expensive things to do in Python).

@nicholasmhughes
Copy link
Collaborator Author

I dug into this a little further last week and benchmarked the salt logging class against the default in a super simple test case.

nick_py_logger.py
import argparse
import logging
import timeit

log = logging.getLogger(__name__)


def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("-n", type=int, default=1000)
    args = parser.parse_args()

    one = "monkey"
    two = ["donkey"]

    print(
        timeit.timeit(
            lambda: log.debug("stuff: %s ? %s", one, two),
            number=args.n,
        )
    )


if __name__ == "__main__":
    main()
$ python nick_py_logger.py -n 100000
0.01669288403354585
nick_salt_logger.py
import argparse
import logging
import timeit

from salt._logging.impl import (SaltLoggingClass, PROFILE, TRACE, QUIET, GARBAGE, setup_temp_handler, get_temp_handler)


if logging.getLoggerClass() is not SaltLoggingClass:

    # Import pip._internal which itself will install it's own custom logging handler
    # we want to override that handler with ours
    try:
        import pip._internal.utils._log as pip_log_module  # pylint: disable=no-name-in-module,import-error
    except ImportError:
        pip_log_module = None

    logging.setLoggerClass(SaltLoggingClass)
    logging.addLevelName(QUIET, "QUIET")
    logging.addLevelName(PROFILE, "PROFILE")
    logging.addLevelName(TRACE, "TRACE")
    logging.addLevelName(GARBAGE, "GARBAGE")
    if pip_log_module is not None:
        # Let's make newer versions of pip work by patching SaltLoggingClass to
        # add a verbose method which is what pip expects
        SaltLoggingClass.verbose = SaltLoggingClass.debug

    if not logging.root.handlers:
        # No configuration to the logging system has been done so far.
        # Set the root logger at the lowest level possible
        logging.root.setLevel(GARBAGE)

        # Add a permanent null handler so that we never get messages like:
        #   No handlers could be found for logger 'foo'
        setup_temp_handler()
        logging.root.addHandler(get_temp_handler())


# Now that we defined the default logging logger class, we can instantiate our logger
# DO NOT MOVE THIS
log = logging.getLogger(__name__)


def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("-n", type=int, default=1000)
    args = parser.parse_args()

    one = "monkey"
    two = ["donkey"]

    print(
        timeit.timeit(
            lambda: log.debug("stuff: %s ? %s", one, two),
            number=args.n,
        )
    )


if __name__ == "__main__":
    main()
$ python nick_salt_logger.py -n 100000
0.6871055489173159

40x slower just seems excessive for a wrapper that adds some basic info and then offloads to the default logging class method to do the bulk of the work.

@OrangeDog
Copy link
Contributor

OrangeDog commented Jul 12, 2022

In general, Python is slow. Repeating the level check before adding "some basic info" may speed it up a bit, but it's hard to predict.

From a quick look, there's a lot of string formatting in there, context lookups, byte decoding, all to create the record before the level is checked.

@Ch3LL
Copy link
Contributor

Ch3LL commented Jul 14, 2022

ping @cro any chance you can review here. I know you initially added the feature to allow matchers to be loaded by Salt's loader.

@Ch3LL Ch3LL requested review from cro and garethgreenaway July 14, 2022 17:16
@nicholasmhughes nicholasmhughes added the Sulfur v3006.0 release code name and version label Jul 15, 2022
@cro
Copy link
Contributor

cro commented Jul 20, 2022

This will prevent the loading of any custom matchers completely, only the matchers hardcoded will be permitted. If any Salt users depend on custom matchers (like overloading the glob matcher with different semantics that fit their environment better, and letting the loader system distribute the module), this PR will break their setup.

I'm not specifically against the idea of importing them directly for a performance boost, is there a way to do this while keeping the ability to load custom matchers?

@nicholasmhughes
Copy link
Collaborator Author

The only way I can think of to solve that problem is to create __matchers__ as global dunder and handle it like we do with __salt__. That way, it'd only be loaded up once instead of during each call to the function. Thoughts on this approach?

I mocked up something this morning to test out the idea, and it seems to work. However, pytest-salt-factories will probably have to recognize __matchers__ as a legitimate dunder in order to ease testing of the functions with this new setup. @s0undt3ch , can you confirm?

@nicholasmhughes
Copy link
Collaborator Author

👆 @cro @garethgreenaway @s0undt3ch

@cro
Copy link
Contributor

cro commented Aug 4, 2022

have to recognize matchers as a legitimate dunder in order to ease testing of the functions with this new setup.
@s0undt3ch , can you confirm?

This will be helpful, and I thought it had already been done, but checking the code reveals it's not there. It's not hard to do, but it is a fair amount of code to modify.

@s0undt3ch
Copy link
Collaborator

The only way I can think of to solve that problem is to create __matchers__ as global dunder and handle it like we do with __salt__. That way, it'd only be loaded up once instead of during each call to the function. Thoughts on this approach?

I mocked up something this morning to test out the idea, and it seems to work. However, pytest-salt-factories will probably have to recognize __matchers__ as a legitimate dunder in order to ease testing of the functions with this new setup. @s0undt3ch , can you confirm?

Yes, we would need to update salt-factories.

@s0undt3ch
Copy link
Collaborator

I'd prefer not to create another dunder global if we can avoid it.

@Ch3LL
Copy link
Contributor

Ch3LL commented Aug 4, 2022

@s0undt3ch can you elaborate why you would not want to do this? This would ensure we are not invoking the loader on each matcher call, which would mean more speed.

@nicholasmhughes
Copy link
Collaborator Author

@s0undt3ch , do you have an alternative? ☝️

@nicholasmhughes
Copy link
Collaborator Author

All, I moved to storing the matchers in __context__, so they should only be loaded once. It's not quite as fast... but should allow for the matcher overrides to continue as before. Please let me know if this method is acceptable.

@nicholasmhughes
Copy link
Collaborator Author

re-run pr-freebsd-131-amd64-py3-pytest

@nicholasmhughes
Copy link
Collaborator Author

re-run pr-centos-7-x86_64-py3-tcp-pytest

@cro
Copy link
Contributor

cro commented Sep 21, 2022

Thank you @nicholasmhughes ! I'm fine with that, good idea.

whytewolf
whytewolf previously approved these changes Sep 21, 2022
@Ch3LL
Copy link
Contributor

Ch3LL commented Sep 21, 2022

There is a recent merge conflict that needs to be resolved.

@Ch3LL Ch3LL added the merge-conflict PR has a merge conflict label Sep 21, 2022
@nicholasmhughes
Copy link
Collaborator Author

Fixed

@nicholasmhughes
Copy link
Collaborator Author

re-run pr-ubuntu-2004-amd64-py3-pycryptodome-pytest

@nicholasmhughes
Copy link
Collaborator Author

re-run pr-ubuntu-2004-amd64-py3-pycryptodome-pytest

@Ch3LL Ch3LL merged commit 981d861 into saltstack:master Sep 22, 2022
@nicholasmhughes nicholasmhughes deleted the fix-matcher-slowness branch September 22, 2022 19:32
frebib pushed a commit to frebib/salt that referenced this pull request Oct 7, 2022
This PR changes match functions from loader access to direct imports of
the matchers in order to yield much faster results.

(cherry picked from commit ce20510)

Backported-from: saltstack#62295
Fixes: saltstack#62283
Signed-off-by: Joe Groocock <jgroocock@cloudflare.com>
@cmcmarrow
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Sulfur v3006.0 release code name and version
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] match function calls are slow at scale
10 participants