Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slowdown in >=0.981 #14867

Closed
JMMarchant opened this issue Mar 10, 2023 · 15 comments
Closed

Slowdown in >=0.981 #14867

JMMarchant opened this issue Mar 10, 2023 · 15 comments
Labels
bug mypy got something wrong performance

Comments

@JMMarchant
Copy link

JMMarchant commented Mar 10, 2023

Bug Report

There has been a sizeable slowdown (33x) in areas of our codebase in recent versions of mypy (since 0.981); using the latest version I was finally able to track down and profile the culprits to create a minimal example which seems to be related to Union typechecking.

To Reproduce

from __future__ import annotations

from typing import Dict, Union, Mapping, Protocol, Any, Optional, List

import numpy as np

class _TensorLike(Protocol):
    """Protocol defining what methods and operations a Generic Tensor can perform."""

    def __mul__(self: _TensorLike, other: Any) -> _TensorLike:
        """Multiplies the self tensor with the other tensor and returns the result."""
        ...

    def __sub__(self: _TensorLike, other: Any) -> _TensorLike:
        """Subtracts the other tensor from the self tensor and returns the result."""
        ...

    def squeeze(self: _TensorLike, axis: Optional[Any] = None) -> _TensorLike:
        """Returns a tensor with all the dimensions of input of size 1 removed."""
        ...

_Weights = Mapping[str, _TensorLike]

class Test:
    _other_worker_shares: List[int]

    def _encode_secret(self, secret: Union[_TensorLike, np.ndarray]) -> np.ndarray:
        return np.ones(5)

    def _reconstruct_secret(self, shares: List[Union[np.ndarray, int]]) -> np.ndarray:
        return np.ones(5)

    def _encode_and_reconstruct_state_dict(
        self, secret_state_dict: Union[_Weights, Mapping[str, np.ndarray]]
    ) -> Dict[str, np.ndarray]:
        """Encodes and reconstructs state dict from own and worker shares.

        Encrypts `secret_state_dict` using `own_shares` then reconstructs it using
        `self._other_worker_shares`.
        """
        encrypted_state_dict = [
            self._encode_secret(v) for _, v in secret_state_dict.items()
        ]
        reconstructed = [
            self._reconstruct_secret([*self._other_worker_shares, value])
            for value in encrypted_state_dict
        ]
        x = list(secret_state_dict)
        z = zip(x, reconstructed)
        d = dict(z)
        return d

Expected Behavior

Timings between versions are substantially different:

  • 0.971 = 3.042s
  • 1.1.1 = 1m44.30s

Per-line timings give:

    7    106.4
    8     10.9
   11      2.0
   12      1.9
   15      1.4
   16      1.0
   18     10.4
   19      4.8
   20      2.1
   22     58.1
   24     46.5
   25      2.0
   28    186.1
   31     83.3
   36      2.0
   41 78880690.2
   44    416.6
   48 8945003.5
   49    322.2
   50    458.1
   51      1.3

Given this (fairly) minimal example in which no typechecking errors are found I would expect there to not be such a slowdown.

Is there a paradigm shift or issue that would explain this and how can we go about making the typechecker happier and faster?

Actual Behavior

Substantial slowdown after 0.971

Your Environment

  • Mypy version used: 0.971/1.1.1
  • Mypy command-line flags: --line-checking-stats for 1.1.1 run
  • Mypy configuration options from mypy.ini (and other config files): None
  • Python version used: 3.8.16
  • numpy: 1.23.5 (though same behaviour in 1.24.2)
@JMMarchant JMMarchant added the bug mypy got something wrong label Mar 10, 2023
@ikonst
Copy link
Contributor

ikonst commented Mar 11, 2023

I tried narrowing it down more:

from __future__ import annotations

from typing import Any
from typing import Protocol

import numpy as np


class _TensorLike(Protocol):
    # Operator must be __add_, __sub__, __mul__.
    # Does not reproduce with binary operators (__add__, __xor__ etc.)
    def __mul__(self, other: Any) -> _TensorLike:
        ...


def f(arg: dict[str, _TensorLike] | dict[str, np.ndarray]) -> None:
    arg.get('foo')  # or most other dict methods -- SLOW LINE HERE

@ikonst
Copy link
Contributor

ikonst commented Mar 11, 2023

See reproduction without numpy.

  • The trigger appears to be the 10 overloads that ndarray.__mul__ has, each with a self type, so I'm replicating this.
  • The fact that B is a protocol (not class) is significant.
from __future__ import annotations

from typing import Any
from typing import Generic
from typing import Protocol
from typing import TypeVar
from typing import overload


class V1:
    pass


class V2:
    pass


class V3:
    pass


class V4:
    pass


class V5:
    pass


class V6:
    pass


class V7:
    pass


class V8:
    pass


class V9:
    pass


class V10:
    pass


T = TypeVar("T")


class A(Generic[T]):
    @overload
    def foo(self: A[V1], other: V1) -> A[V1]:
        return A()

    @overload
    def foo(self: A[V2], other: V2) -> A[V2]:
        return A()

    @overload
    def foo(self: A[V3], other: V3) -> A[V3]:
        return A()

    @overload
    def foo(self: A[V4], other: V4) -> A[V4]:
        return A()

    @overload
    def foo(self: A[V5], other: V5) -> A[V5]:
        return A()

    @overload
    def foo(self: A[V6], other: V6) -> A[V6]:
        return A()

    @overload
    def foo(self: A[V7], other: V7) -> A[V7]:
        return A()

    # @overload
    # def foo(self: A[V8], other: V8) -> A[V8]:
    #     return A()
    #
    # @overload
    # def foo(self: A[V9], other: V9) -> A[V9]:
    #     return A()
    #
    # @overload
    # def foo(self: A[V10], other: V10) -> A[V10]:
    #     return A()

    def foo(self, other: Any) -> Any:
        return A()


class B(Protocol):
    def foo(self, other: Any) -> B:
        ...


x: A | B
x.__repr__()

@AlexWaygood
Copy link
Member

AlexWaygood commented Mar 11, 2023

I used git bisect with @ikonst's benchmark, and d27bff6 seems to be the "first bad commit" here. Mypy @ d27bff6 appears to be significantly slower when checking this snippet than mypy @ d2063d2. Cc. @ilevkivskyi.

(I bisected with non-compiled mypy btw, so my results might not be 100% accurate I suppose.)

@JMMarchant
Copy link
Author

I used git bisect with @ikonst's benchmark, and d27bff6 seems to be the "first bad commit" here. Mypy @ d27bff6 appears to be significantly slower when checking this snippet than mypy @ d2063d2. Cc. @ilevkivskyi.

(I bisected with non-compiled mypy btw, so my results might not be 100% accurate I suppose.)

That matches with the previous commit I bisected to when trying to find the "bad" one; wasn't able to identify what was slower at that point though

@ilevkivskyi
Copy link
Member

Hm, just to understand this better. What is the performance of something like

x: B = A()

(using the code by @ikonst) before and after the bad commit? IIUC this should be very slow even before the bad commit.

@AlexWaygood
Copy link
Member

AlexWaygood commented Mar 11, 2023

Hm, just to understand this better. What is the performance of something like

x: B = A()

(using the code by @ikonst) before and after the bad commit? IIUC this should be very slow even before the bad commit.

Yup, the time taken on my machine checking that is virtually identical between the two commits. (It seems to be much faster checking that code than it is checking x: A | B, apparently).

@ilevkivskyi
Copy link
Member

Hm, interesting. Before that commit is_proper_subtype() (inconsistently) used some old simplistic logic for overloads, while the commit made it use the full/correct subtyping logic that was written in is_subtype(). So I thought this exposed bad performance in a corner case to more scenarios. Can you please also check what is the performance of

y: A | B
x: B = y

before vs after (and vs the original example by @ikonst)?

@AlexWaygood
Copy link
Member

AlexWaygood commented Mar 11, 2023

Can you please also check what is the performance of

y: A | B
x: B = y

before vs after (and vs the original example by @ikonst)?

That might be very slightly slower between the two commits, but still nothing like the dramatic slowdown on @ikonst's benchmark (which appears to get around 3x slower between the two commits).

@ikonst
Copy link
Contributor

ikonst commented Mar 11, 2023

I think it gets exponentially slower with more overrides.

@ilevkivskyi
Copy link
Member

OK, I figured it out, it indeed exposes pre-existing bad performance to more cases. Consider same example as @ikonst but with

class B(Protocol):
    def foo(self, other: object) -> B:  # Note object here
        ...

y: A | B
x: B = y

this is extremely slow even before 0.981, and the bad commit simply exposes the problem to is_proper_subtype() as well. I spent some time thinking about how to avoid the exponential growth here, but don't have any ideas yet.

@ilevkivskyi
Copy link
Member

OK, well, there is a solution: negative subtype cache (tested it locally, works very fast for this case), but IIRC that was dangerous for some reason, @JukkaL what do you think?

@ilevkivskyi
Copy link
Member

Also negative cache doesn't seem to cause any slow down on self-check. Here I opened a PR if someone wants to play with it #14884. cc @JMMarchant

@JMMarchant
Copy link
Author

Tested this on my toy example and it's substantially better than 1.1.1 (12.313s). Still not as fast as 0.971 but much improved.

JukkaL pushed a commit that referenced this issue Apr 11, 2023
A possible solution for #14867 (I
just copy everything from positive caches).
@ilevkivskyi
Copy link
Member

@JMMarchant could you please check the current master? If the perf is acceptable, you can close the issue.

@JMMarchant
Copy link
Author

Current master works well enough; there's an approximately 3x slowdown on a cold start (3min vs 1min), a 10x slowdown on a warm start (20s vs 2s) but this is substantially better than before this fix!

Thank you very much for all your work on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug mypy got something wrong performance
Projects
None yet
Development

No branches or pull requests

5 participants