
Performance degradation after creating many classes with the same name #826

Closed
MrCocoDev opened this issue Jun 15, 2021 · 28 comments
Comments

@MrCocoDev

I suspect this is heavily related to #575, but my analysis suggests the issue showed up in 19.2.0 (2019-10-01), and that issue was created before the release went live.

I made a simple test harness to demonstrate the issue:

import attr


def create_class():
    @attr.s
    class MyClass(object):
        access_token = attr.ib()
        expires = attr.ib()
        refresh_token = attr.ib()
        user = attr.ib()
    
    return MyClass(
        access_token="string",
        expires=0,
        refresh_token="string",
        user="string",
    )
%timeit create_class()

I ran this over attrs versions: 18.2, 19.1, 19.2, 20.1, 21.1, and 21.2.

I collected the outputs:

[
    # Version num, timeit output, whether or not timeit warned for caching
    (18.2, "358 µs ± 4.47 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)", False),
    (19.1, "356 µs ± 1.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)", False),
    (19.2, "22 ms ± 9.5 ms per loop (mean ± std. dev. of 7 runs, 1000 loops each)", "The slowest run took 4.71 times longer than the fastest"),
    (20.1, "22.4 ms ± 9.85 ms per loop (mean ± std. dev. of 7 runs, 1000 loops each)", "The slowest run took 4.74 times longer than the fastest"),
    (21.1, "22.9 ms ± 9.74 ms per loop (mean ± std. dev. of 7 runs, 1000 loops each)", "The slowest run took 5.12 times longer than the fastest"),
    (21.2, "23.4 ms ± 10.5 ms per loop (mean ± std. dev. of 7 runs, 1000 loops each)", "The slowest run took 4.91 times longer than the fastest"),
]

This suggests that something that went along with the 19.2 release is causing a performance leak. We noticed the issue because our CPU metrics started to spike when we upgraded attrs from version 18.2 directly to 21.2.

@euresti
Contributor

euresti commented Jun 15, 2021

Looks like this change caused it (using git bisect):

eda9f2de6f7026039c3eee80297f0c95382d51d8 is the first bad commit
commit eda9f2de6f7026039c3eee80297f0c95382d51d8
Author: David Euresti <david@euresti.com>
Date:   Thu Aug 1 03:42:13 2019 -0700

    Make unique_filename not depend on repr. (#560)
    
    * Make unique_filename not depend on repr.
    
    For debugging purposes, when a method is created it needs to be put
    into a "filename".  This filename was generated using `repr(attrs)`
    however this meant that the filename could change every time the
    program was loaded depending on the whims of repr.  (In particular
    free functions and lambdas would include ids)
    
    This solves the problem by changing the name to be <attrs generated
    {method} {module}.{qualname}> If it should happen that the name is
    taken then -2 will be appended and incremented until a free filename
    is found.
    
    * Add tests
    
    * Fix tests and changelog

 changelog.d/560.change.rst |  2 ++
 src/attr/_make.py          | 66 ++++++++++++++++++++++++++++++++--------------
 tests/test_dunders.py      | 46 ++++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+), 20 deletions(-)
 create mode 100644 changelog.d/560.change.rst

@hynek
Member

hynek commented Jun 15, 2021

That’s highly unfortunate and we probably should finally make benchmarks part of our workflow.

If someone is really good at profiling, I’m all eyes.


That said, I don’t think #575 is related to your benchmark (it talks only about creation of classes, not instantiation), and I also think it was more of a general observation about a growing code base, not a regression at a specific point in time.

@MrCocoDev
Author

It does look like this section here would take progressively longer with similarly named classes:

eda9f2d#diff-ca2708dc9bf685ad676dfaa594561d8fc5c18ae39a7a5bc517049ef3fce09541R1002
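The linked section boils down to a probing loop over linecache keys. A simplified sketch of why it degrades (illustrative only, not the actual attrs source; `unique_filename_sketch` is a hypothetical name):

```python
import linecache


def unique_filename_sketch(func_name, module, qualname):
    """Probe <name>, <name>-2, <name>-3, ... until an unused linecache
    key is found.  With N same-named classes already registered, the
    next creation probes all N entries, so total work grows quadratically."""
    count = 1
    while True:
        suffix = "" if count == 1 else "-{0}".format(count)
        candidate = "<attrs generated {0} {1}.{2}{3}>".format(
            func_name, module, qualname, suffix
        )
        if candidate not in linecache.cache:
            # Reserve the slot with a dummy entry, as attrs does.
            linecache.cache[candidate] = (1, None, ("",), candidate)
            return candidate
        count += 1


first = unique_filename_sketch("init", "mod", "MyClass")
second = unique_filename_sketch("init", "mod", "MyClass")
# first  == "<attrs generated init mod.MyClass>"
# second == "<attrs generated init mod.MyClass-2>"
```

Each generated method (`__init__`, `__repr__`, ...) runs this probe, which is why same-named classes created in a loop get progressively slower.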

@hynek
Member

hynek commented Jun 15, 2021

OMG @euresti thank you so much – that was fast!

@hynek
Member

hynek commented Jun 15, 2021

Am I squinting correctly that this is caused by an edge case where a lot of classes with the same fully-qualified name are created?

Is that really the case in your prod or are we chasing a red herring here possibly?

@euresti
Contributor

euresti commented Jun 15, 2021

But is this issue actually real? It only happens in timeit because we're creating a class inside a method in a big loop. I don't think this is super common behavior.

If you do this:

import attr


@attr.s
class MyClass(object):
    access_token = attr.ib()
    expires = attr.ib()
    refresh_token = attr.ib()
    user = attr.ib()


def create_class():
    return MyClass(
        access_token="string",
        expires=0,
        refresh_token="string",
        user="string",
    )

The problem is gone.

Or if you really want to create a different class every time you can use make_class and give each a different name:

import attr
import secrets

def create_class():
    my_class = attr.make_class(
        f"MyClass{secrets.token_hex()}",
        dict(
            access_token=attr.ib(),
            expires=attr.ib(),
            refresh_token=attr.ib(),
            user=attr.ib(),
        ),
    )

    return my_class(
        access_token="string",
        expires=0,
        refresh_token="string",
        user="string",
    )

@MrCocoDev
Author

Am I squinting correctly that this is caused by an edge case where a lot of classes with the same fully-qualified name are created?

Is that really the case in your prod or are we chasing a red herring here possibly?

It was really the case in our production environment until today when I finally tracked down the cause of the issue! 😅

@MrCocoDev
Author

It seems like it would be easy enough to support this, even though I agree that it is almost always avoidable. Instead of appending -1, -2, the suffix could be a hash of the count, or something similarly random. Presumably that is okay? Albeit ugly.

@euresti
Contributor

euresti commented Jun 15, 2021

Possible, though you might have to deal with multithreading issues then.
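For context, the standard way to make such a counter thread-safe is an explicit lock around the read-increment-write. A sketch (illustrative names, not attrs code):

```python
import threading

_counts = {}
_counts_lock = threading.Lock()


def next_count(key):
    # The lock makes the read-increment-write atomic, so two threads
    # creating same-named classes can never observe the same count.
    with _counts_lock:
        count = _counts.get(key, 0) + 1
        _counts[key] = count
        return count
```

The trade-off is a small serialization point on every class creation.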

@MrCocoDev
Author

It seems like a SafeUUID (https://docs.python.org/3/library/uuid.html#uuid.SafeUUID) would fit the bill then? The same while True loop structure will work to handle the small chance of collisions.
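A UUID-based name would look roughly like this (a sketch; `unique_filename_uuid` is a hypothetical helper, not part of attrs). A random 128-bit suffix makes collisions so unlikely that a retry loop would almost never fire, but the filename is no longer reproducible across program runs, which matters for the stable-filename requirement mentioned later in the thread:

```python
import uuid


def unique_filename_uuid(func_name, module, qualname):
    # A fresh uuid4 per class avoids the O(N) probing loop entirely,
    # at the cost of filenames that change on every run.
    return "<attrs generated {0} {1}.{2}-{3}>".format(
        func_name, module, qualname, uuid.uuid4().hex
    )
```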

@euresti
Contributor

euresti commented Jun 15, 2021

I forgot to mention: the other requirement is that future runs generate the same "filename". This is to support tools that look at the stack trace.

@MrCocoDev
Author

I was wondering if that was going to be a requirement... That does complicate the issue a little. I will think on this a little.

@MrCocoDev
Author

So, this isn't pretty... but I think something along this line would work:

import linecache
import uuid

# This should probably be a calculated value to balance the weight of
# scanning keys versus looping over generated names
N = 1000
unique_id = uuid.uuid4()
extra = ""
count = 1
file_prefix = f"<attrs generated {func_name} {module}.{cls}"

while True:
    unique_filename = f"{file_prefix}{extra}>"

    # To handle concurrency we essentially "reserve" our spot in
    # the linecache with a dummy line.  The caller can then
    # set this value correctly.
    cache_line = (1, None, (str(unique_id),), unique_filename)
    if (
        linecache.cache.setdefault(unique_filename, cache_line) == cache_line
    ):
        return unique_filename

    # Looks like this spot is taken. Try again.
    if count < N:
        count += 1
    else:
        # We have to sort to avoid potential infinite loops if, for
        # example, a `MyClass-1` entry was created by a joker
        highest_existing = sorted(
            (
                x
                for x in linecache.cache.keys()
                if x.startswith(file_prefix)
            ),
            key=lambda cache_key: int(cache_key[len(file_prefix) + 1:-1] or 0),
        )[-1]
        # The `or 0` guards the un-suffixed base name, whose slice is ""
        count = int(highest_existing[len(file_prefix) + 1:-1] or 0) + 1

    extra = "-{0}".format(count)

Other than keeping an internal cache of unique names per class, which would bloat in the more common case of many distinct attr.s classes, I am not sure how to reduce the name-generation time, which remains consistent across initializations.

@MrCocoDev
Author

Perhaps the easier solution is to just warn against using attr.s classes in a local scope? It seems innocuous enough to do:

def my_function():
    @attr.s
    class ConvenientDataShape:
         thing = attr.ib()
    return ConvenientDataShape(thing="thing")

That is essentially the code that broke our system. It was written in 2017 and in all that time every engineer who looked at it must have thought it looked fine enough to leave it.

@theY4Kman

I mean, if the requirement is only that future runs generate the same filename, and we aren't guarding against multiple versions of attrs installed at the same time all fighting for the same spots in the line cache, why not use a counter?

import linecache

_unique_filename_counts = {}


def _generate_unique_filename(cls, func_name):
    """
    Create a "filename" suitable for a function being generated.
    """
    cls_name = getattr(cls, "__qualname__", cls.__name__)
    key = (func_name, cls.__module__, cls_name)

    count = _unique_filename_counts.get(key, 0) + 1
    _unique_filename_counts[key] = count

    extra = "-{0}".format(count) if count > 1 else ""
    unique_filename = "<attrs generated {0} {1}.{2}{3}>".format(*key, extra)

    # Reserve the linecache slot; the generated source lines are
    # filled in by the caller afterwards.
    cache_line = (1, None, ("",), unique_filename)
    linecache.cache[unique_filename] = cache_line

    return unique_filename

The current implementation appears to be akin to iterating every item in a list to calculate the length, instead of just keeping a tally every time the list changes.


Creating a new class in every function call (especially one on a hot enough path to expose this perf problem) doesn't particularly seem wise, but it also seems entirely unexpected for these two snippets

def do_it():
    class Response:
        pass

    return Response()


def do_it_attr():
    @attr.s
    class Response:
        pass

    return Response()

... to differ in time by orders of magnitude

In [2]: %timeit do_it()
3.9 µs ± 28.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [3]: %timeit do_it_attr()
The slowest run took 4.69 times longer than the fastest. This could mean that an intermediate result is being cached.
12.6 ms ± 5.47 ms per loop (mean ± std. dev. of 7 runs, 1000 loops each)

@euresti
Contributor

euresti commented Jun 16, 2021

Unless I'm missing something

    count = _unique_filename_counts.get(key, 0) + 1
    _unique_filename_counts[key] = count

isn't safe if 2 threads are creating classes though.

@hynek hynek changed the title Performance degradation after creating many attr.s classes Performance degradation after creating many classes with the same name Jun 16, 2021
@hynek
Member

hynek commented Jun 16, 2021

Just to sum up a bit: we're currently using an optimistic approach to handling concurrency, which works great for the common cases.

That leads to an O(N) performance degradation though, if many identical classes are created.

That's a very rare edge case as far as I can see.

I'm willing to merge a solution that removes that edge case degradation, as long as:

  • it remains thread-safe
  • it doesn't cause a performance degradation in the common case

Would you be willing to share info why you do what you do, so we can understand your use-case better?

@MrCocoDev
Author

MrCocoDev commented Jun 16, 2021

I originally commented this from my other Github account but for clarity's sake I'm reposting on my original account, sorry!

Would you be willing to share info why you do what you do, so we can understand your use-case better?

I shared a simplified version of what was happening in our codebase above; I'll copy it down here for easier readability:

def my_function():
    @attr.s
    class ConvenientDataShape:
         thing = attr.ib()
    return ConvenientDataShape(thing="thing")

Whether writing code like this is good or bad, I think, is a different debate. I think the real question is: is it reasonable for a consumer of this library to assume that this is an okay thing to do? And I think it is reasonable to assume that.

I have no idea what the motivation behind the unique filename is, or which of its requirements are nice-to-haves vs. must-haves, but it seems to me that thread safety is really the only must-have? Earlier it was mentioned...

The other requirement is that future runs generate the same "filename" This is to support tools that look at the stack trace.

But the current implementation doesn't do that once threads are involved, so it seems like a soft requirement. So perhaps an easy solution is to take @theY4Kman's solution but also drop in threading.current_thread().name (or its equivalent)? So that would look like:

import linecache
import threading

...

_unique_filename_counts = {}

...

def _generate_unique_filename(cls, func_name):
    """
    Create a "filename" suitable for a function being generated.
    """
    thread_name = threading.current_thread().name
    cls_name = getattr(cls, "__qualname__", cls.__name__)
    key = (func_name, cls.__module__, cls_name, thread_name)

    count = _unique_filename_counts.get(key, 0) + 1
    _unique_filename_counts[key] = count

    extra = "-{0}".format(count) if count > 1 else ""
    unique_filename = "<attrs generated {0} {1}.{2}-{3}{4}>".format(*key, extra)

    # Reserve the linecache slot; the generated source lines are
    # filled in by the caller afterwards.
    cache_line = (1, None, ("",), unique_filename)
    linecache.cache[unique_filename] = cache_line

    return unique_filename

I believe this would behave almost identically to the current implementation while being thread-safe and not devouring the CPU.

@Tinche
Member

Tinche commented Jun 17, 2021

Thread names aren't guaranteed to be unique, right?

We could maybe use itertools.count as an atomic counter?

In any case, I guess this issue was motivated by efficiency concerns but creating a new class for every function invocation is very inefficient by itself (you're evaling a couple functions at least [init and repr], not to mention creating a class object), so maybe just stop doing that? 😇
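The itertools.count idea could look roughly like this (a sketch; it relies on CPython specifics, namely that `dict.setdefault` on a built-in dict and `next()` on a count object are each atomic under the GIL):

```python
import itertools

# One monotonically increasing counter per (func_name, module, qualname)
# key.  dict.setdefault stores the first counter created for a key and
# returns the stored one thereafter; next() never repeats a value.
_counters = {}


def next_suffix(key):
    counter = _counters.setdefault(key, itertools.count(1))
    n = next(counter)
    return "" if n == 1 else "-{0}".format(n)
```

The first call for a key yields no suffix, subsequent calls yield "-2", "-3", ..., matching the existing naming scheme in O(1) per class instead of O(N) probing.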

@MrCocoDev
Author

Thread names aren't guaranteed to be unique, right?

We could maybe use itertools.count as an atomic counter?

In any case, I guess this issue was motivated by efficiency concerns but creating a new class for every function invocation is very inefficient by itself (you're evaling a couple functions at least [init and repr], not to mention creating a class object), so maybe just stop doing that? 😇

I definitely agree that it is not the ideal way to write Python code, but is it so egregious that it should cripple the CPU? Once I identified the issue it was trivial to fix, but it required a lot of monitoring tools to figure out that attrs can and will eat the CPU on an innocuous mistake.

Not that it matters a terrible amount but the front page of https://www.attrs.org/en/stable/overview.html does promise "performance" and "no surprises". And as @theY4Kman mentioned neither of these code blocks look like CPU devouring code blocks:

def do_it():
    class Response:
        pass

    return Response()


def do_it_attr():
    @attr.s
    class Response:
        pass

    return Response()

@Tinche
Member

Tinche commented Jun 17, 2021

Right, if we want performance and no surprises we should throw an exception if we see more than 10 classes with the same name. Even if we fix this particular issue, we're still using the linecache cache for generated sources. So you're actually leaking memory. Better to just blow up and make you fix it.

@hynek
Member

hynek commented Jun 18, 2021

Blowing up is extreme, but raising a helpful warning seems appropriate.

@Tinche
Member

Tinche commented Jun 18, 2021

Could we just hash the source code and use that? If you have a thousand identical classes, it would stand to reason they would share the linecache entry (since it'd be identical).
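Hashing the generated source might look like this (a sketch; `filename_from_source` is a hypothetical helper, not attrs code):

```python
import hashlib


def filename_from_source(func_name, module, qualname, src):
    # Identical generated source hashes to the same digest, so a
    # thousand identical classes share one linecache entry instead of
    # each probing for a fresh -N suffix.
    digest = hashlib.sha256(src.encode("utf-8")).hexdigest()[:12]
    return "<attrs generated {0} {1}.{2} {3}>".format(
        func_name, module, qualname, digest
    )


a = filename_from_source("init", "mod", "MyClass", "def __init__(self): ...")
b = filename_from_source("init", "mod", "MyClass", "def __init__(self): ...")
# a == b: identical source maps to one filename, so the cache stays bounded
```

Because the digest depends only on the source text, the filename would also be stable across runs, which would satisfy the stack-trace tooling requirement.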

@MrCocoDev
Author

Sorry for disappearing for a few days... life calls.

Would it be unreasonable to ask what the use cases are for MyClass, MyClass-2, and MyClass-N? It seems to me that any duplication here is essentially a "bad" use of the library, since it will eventually lead to performance problems. What was the driving force behind this code?

@Tinche
Member

Tinche commented Jun 22, 2021

it's entirely reasonable to assume the odd collision in the names of classes. For example, off the top of my head I can tell you my work code base has two completely different attrs classes named User.

That said, I might have an idea of how to improve the situation.

@MrCocoDev
Author

it's entirely reasonable to assume the odd collision in the names of classes. For example, off the top of my head I can tell you my work code base has two completely different attrs classes named User.

That said, I might have an idea of how to improve the situation.

But are they in the same module with the same qualified name? Just having two User classes won't cause a collision because the naming is:

"<attrs generated {0} {1}.{2}-{3}>".format(func_name, module, class, extra)

I am pretty sure the only way to trigger this code path, outside of a local scope, is to do something like:

@attr.s
class User:  # User
    pass

Reference = User

@attr.s
class User:  # User-2
    pass

@hynek
Member

hynek commented Jun 23, 2021

Or make_class

@hynek
Member

hynek commented Jun 30, 2021

Fixed by #828.
