Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Improvement for evolve function #1145

Open
LukasKrocek opened this issue Jun 9, 2023 · 3 comments
Open

Performance Improvement for evolve function #1145

LukasKrocek opened this issue Jun 9, 2023 · 3 comments

Comments

@LukasKrocek
Copy link

LukasKrocek commented Jun 9, 2023

Hello,

I have been recently working with the evolve function in attrs and noticed some performance issues which I believe could be improved.

Here is an example I tried:

@frozen
class A:
    a: int
    b: str

a = A(1, "1")

print(timeit.timeit("evolve(a, a=2)", globals=globals()))  # 0.5601 seconds
print(timeit.timeit("A(2, a.b)", globals=globals()))  # 0.2004 seconds

The evolve function in this case appears to be almost three times slower than creating a new instance of the class manually.

Upon investigation, I discovered that a significant amount of time was spent on creating class Fields, iterating them, and updating unchanged values.

Additionally, I noticed that creating instances using kwargs took approximately 30% more time than using args:

print(timeit.timeit("A(a=2, b=a.b)", globals=globals())) # 0.2635 seconds

Given these findings, I suggest that we could improve the performance of the evolve function by generating per class functions the first time that the class is evolved. These functions would look something like this:

def evolve_A(inst, changes):
    return cls(
            changes.get("a", inst.a),
            changes.get("b", inst.b)
        )

I am open to creating a PR that would implement these changes if you think this is a good idea. I look forward to your thoughts on this.

Thank you.

@LukasKrocek
Copy link
Author

It would look something like this:

class EvolveRegistryFunction(Protocol):
    def __call__(self, cls: type[T], inst: T, changes: dict[str, Any]) -> T:
        ...


_EVOLVE_REGISTRY: dict[type[Any], EvolveRegistryFunction] = {}


def generate_evolve_func(cls: type[Any]):
    attrs: Sequence[Attribute] = fields(cls)
    args = {}
    kwargs = {}
    for a in attrs:
        if not a.init:
            continue
        attr_name = a.name  # To deal with private attributes.
        init_name = a.alias

        if a.kw_only:
            kwargs[init_name] = attr_name
        else:
            args[init_name] = attr_name

    fn_name = f"evolve_{cls.__name__}"
    code = [f"def {fn_name}(cls, inst, changes):"]  # TODO: add module?
    code.append("    return cls(")

    for init_name, attr_name in args.items():
        code.append(f"    changes.get('{init_name}', inst.{attr_name}),")

    for init_name, attr_name in kwargs.items():
        code.append(f"    {init_name}=changes.get('{init_name}', inst.{attr_name}),")

    code.append("    )")
    script = "\n".join(code)
    file_name = _generate_unique_filename(cls, "evolve")
    globs = {}
    eval(compile(script, file_name, "exec"), globs)
    _EVOLVE_REGISTRY[cls] = globs[fn_name]
    return globs[fn_name]


def evolve_2(inst, **changes):
    cls = inst.__class__
    evolve_func = _EVOLVE_REGISTRY.get(cls) or generate_evolve_func(cls)
    return evolve_func(cls, inst, changes)```

@jamesmurphy-mc
Copy link
Contributor

A note on kwargs performance, this is a general fact about Python, not an attrs-specific performance degradation. Calling any Callable using positional args will be faster than passing the same args as kwargs.

As for the evolve registry, another alternative is to just stick another method on the class as an _evolve or __attrs_evolve__ function.

@hynek
Copy link
Member

hynek commented Dec 15, 2024

Sorry for the late answer, this came up while I was afh (away from home ;)) for a prolonged time and I've been reeling ever since.

I think this is interesting since we've just implemented __replace__ for Python 3.13 in #1383.

As Tin writes, I don't think it's worth to do code gen because that would slow down class generation further, but maybe we could do some optimizations for the 3.13+ case? Currently, __replace__ uses evolve internally, but maybe we could flip that for 3.13+ with a smarter per-class approach?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants