Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

speedup typechecking of nested if expressions #12700

Merged
merged 1 commit into from
May 20, 2022

Conversation

huguesb
Copy link
Contributor

@huguesb huguesb commented Apr 30, 2022

Deeply nested if/else expressions have a worst-case exponential behavior.

This will for instance manifest when returning literal values which cause
repeated analysis of conditional branches with subtly different type context
for each literal.

This can be optimized by observing that a simple literal context will yield
the same analysis as its fallback type, and likewise, two literals of the
same fallback type will yield the same analysis. In those case we can avoid
the repeated analysis and prevent the worst-case exponential behavior.

Fixes #9591

@github-actions

This comment has been minimized.

Copy link
Collaborator

@hauntsaninja hauntsaninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! I don't 100% understand the change, but it passes tests and primer and fixes the perf issue.

mypy/checkexpr.py Outdated Show resolved Hide resolved
@@ -3910,6 +3916,15 @@ def visit_conditional_expr(self, e: ConditionalExpr, allow_none_return: bool = F
if_type = self.analyze_cond_branch(if_map, e.if_expr, context=else_type,
allow_none_return=allow_none_return)

elif if_type == ctx or (if_lit is not None and (if_lit == ctx or if_lit == ctx_lit)):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if_type == ctx seems clearly fine, but I'm not sure I 100% understand the second half of the check.

I agree that join will turn literals back into fallbacks, but it's not obvious to me that the only effect of using the fallback type as context would be to turn a returned literal type into its fallback.

I'm also worried / optimistic that a change along the lines of #12395 might breaking assumptions here / make this change unnecessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I don't quite understand why this second call to analyze_cond_branch is here at all... It can't think of a case where it is relevant to us the type of the result of the if branch to infer the type of the result of the else branch. But then again I don't fully understand the semantics of the "type context"...

From what I understand though, it seems silly to pass a literal type as type context to analyze_cond_branch and it would probably make more sense to always pass the fallback type instead, which would simplify things back to a simple if_type == ctx (after generalizing if_type and ctx to their respective fallback types if either is a literal).

Maybe someone with a deeper understanding of this logic can chime in? @JukkaL maybe ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with a new approach that feels more theoretically sound and more robust to any change similar to #12395

@github-actions

This comment has been minimized.

@huguesb huguesb force-pushed the pr-fix-slow-nested-if-literal branch from 84d38f5 to 1f0741b Compare May 1, 2022 04:28
@huguesb huguesb requested a review from hauntsaninja May 1, 2022 04:29
@github-actions

This comment has been minimized.

Deeply nested if/else expressions have a worst-case exponential behavior.

This will for instance manifest when returning literal values which cause
repeated analysis of conditional branches with subtly different type context
for each literal.

This can be optimized by observing that a simple literal context will yield
the same analysis as its fallback type, and likewise, two literals of the
same fallback type will yield the same analysis. In those case we can avoid
the repeated analysis and prevent the worst-case exponential behavior.

Fixes python#9591
@huguesb huguesb force-pushed the pr-fix-slow-nested-if-literal branch from 1f0741b to 2c0209e Compare May 16, 2022 05:58
@github-actions
Copy link
Contributor

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

@huguesb huguesb mentioned this pull request May 18, 2022
Copy link
Collaborator

@hauntsaninja hauntsaninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks good to me. I ran some extra tests and couldn't detect any type checker differences from not passing literal context

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the exponential behavior! Looks good.

@JukkaL JukkaL merged commit e93be73 into python:master May 20, 2022
JukkaL pushed a commit that referenced this pull request May 20, 2022
Deeply nested if/else expressions have a worst-case exponential behavior.

This will for instance manifest when returning literal values which cause
repeated analysis of conditional branches with subtly different type context
for each literal.

This can be optimized by observing that a simple literal context will yield
the same analysis as its fallback type, and likewise, two literals of the
same fallback type will yield the same analysis. In those case we can avoid
the repeated analysis and prevent the worst-case exponential behavior.

Fixes #9591
@huguesb huguesb deleted the pr-fix-slow-nested-if-literal branch June 10, 2022 08:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mypy appears to become inefficient for nested ifs (13+ nested ifs)
4 participants