-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[ty] now that inference sees nested bindings, allow unresolved globals #19821
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ty] now that inference sees nested bindings, allow unresolved globals #19821
Conversation
Diagnostic diff on typing conformance testsNo changes detected when running ty on typing conformance tests ✅ |
|
|
|
||
| Python allows `global` statements in function bodies to add new variables to the global scope, but | ||
| we require a matching global binding or declaration. We lint on unresolved `global` statements, and | ||
| we don't include the symbols they might define in `*` imports: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could change this (CPython does include these globals in * imports), but we'd need to make ExportFinder walk function and class bodies, which it currently doesn't. (And doesn't want to?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CPython does include these globals in
*imports
Well, maybe -- it does if the function that sets the global has been executed before the import occurs, otherwise it doesn't. So I guess we could consider it a "possibly bound" symbol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't think we should do this with our current capabilities. For code like this, which works fine at runtime, we'll still emit an unresolved-reference diagnostic, and I think that's confusing without the unresolved-global diagnostic:
def f():
global x
x = 42
f()
print(x)On this branch, we report:
error[unresolved-reference]: Name `x` used when not defined
--> foo.py:6:7
|
5 | f()
6 | print(x)
| ^
|
info: rule `unresolved-reference` is enabled by default
Similarly, as you pointed out, the query in re_exports still won't take account of any global symbols in nested scopes -- it deliberately doesn't recurse into subscopes when attempting to find symbols that might be imported by a * import.
Changing either of those pieces of behaviour is something we could consider doing, but each would be somewhat involved and would have tradeoffs to consider. I don't think we should remove the unresolved-global lint unless we do both of them, however.
|
That makes sense, and I think it's reasonable to close this as a won't-fix-for-now-at-least. But now you've got me thinking about how we currently define the "public type". Right now we have two notions, as I understand them:
What if there was a third notion?
Then maybe when inferring the local type (i.e. calling def f():
global x
x = 42
reveal_type(x) # Unknown (unbound)
f()
reveal_type(x) # Unknown (unbound) <-- false positive
if secrets.randbelow(2):
x = 1
reveal_type(x) # Literal[1] (possibly unbound) <-- incomplete
x = 2
reveal_type(x) # Literal[2]
f()
reveal_type(x) # Literal[2] <-- incorrectBut if we wanted to, I think we could achieve this without any new analysis: def f():
global x
x = 42
reveal_type(x) # Literal[42] (possibly unbound) <-- false negative
f()
reveal_type(x) # Literal[42] (possibly unbound)
if secrets.randbelow(2):
x = 1
reveal_type(x) # Literal[42, 1] (possibly unbound)
x = 2
reveal_type(x) # Literal[2]
f()
reveal_type(x) # Literal[2] <-- incorrectWould that be worth pursuing? My default reaction is actually no (this is making things more complicated, and I'm not sure how common it is to see code that benefits from any of this), but let me know what you think. |
|
I think we probably should fix the incorrect results in the above examples (not necessarily right now, but at some point). I'm not that tempted by the exact change proposed above because it doesn't achieve that. (It's not clear to me why the nonlocal writes should be considered if the name is locally unbound, but not otherwise.) The simplest (but imprecise) way to do that is to union all inferred types of names that have nonlocal bindings with a possible binding of The other approach (which would not be that hard; you were going to do it in the previous PR until I said not to bother) would be to always include nonlocal bindings as part of the possible type for any use of a local name. This is not wrong, but I'm worried about false positives if we just do that much without more precision. We could perhaps improve this by considering the creation of any nested scope with any nonlocal writes to a symbol to be a conditional write to that symbol in the local scope's control flow. That would mean we would only consider those nonlocal writes in places in control flow that are reachable from the definition of the nested scope. I think this is probably the best answer, but it definitely makes it a bit more complicated to implement. I don't think we should prioritize any of this right now; I'd rather have our priorities driven by real-world bug reports. But I will open an issue to track it. |
|
Opened astral-sh/ty#958 |
This is an optional follow-up PR on top of #19820. Now that we track all bindings from nested scopes, it could be practical to allow
globalvariables without an explicit definition in the module scope. For example:Before:
After: