-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[red-knot] Do not assume that x != 0 if x inhabits ~Literal[0]
#17370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
AlexWaygood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
| reveal_type(x == "abc") # revealed: bool | ||
| reveal_type("abc" == x) # revealed: bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, this is a bit unfortunate. If x has type ~Literal[0], then you can't say for sure that it won't compare equal to 0, and you can't even say for sure that it won't compare equal to 0 if it has type int & ~Literal[0]: e.g. the runtime object False inhabits the type int & ~Literal[0], but it compares equal to 0.
The same goes if x has type ~Literal["abc"] or str & ~Literal["abc"]: you can't say anything in particular about whether x will compare equal to "abc" or not at runtime, since x could be an instance of a str subclass and still compare equal to "abc". But if x has type LiteralString & ~Literal["abc"], we can say for sure that it won't compare equal to "abc", since for x to inhabit the type LiteralString we know that it must have exactly str as its __class__ (it can't be an instance of a str subclass).
Ideally we'd find a way of fixing this issue that didn't regress on our ability to narrow from a LiteralString into a string-literal type. I suppose this may involve some special-casing in infer_binary_intersection_type_comparison.
Having said that, it feels like achieving this might be fairly complex, and our current inference is quite incorrect in some obvious cases. I'd be okay with going with what you have now if you added some # TODO comments above these reveal_type calls saying that ideally we'd be able to narrow the LiteralString type to string-literal types using == and !=. The same goes for the assertions on lines 58-59
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Am i right in saying we should be able to determine these (as Literal[False]) too?
reveal_type(x == "something else") # revealed: bool
reveal_type("something else" == x) # revealed: bool
x != 0 if x inhabits ~Literal[0]
|
@AlexWaygood currently |
| let non_empty_intersection_positive = if intersection.positive(self.db()).is_empty() { | ||
| vec![Type::object(self.db())] | ||
| } else { | ||
| intersection.iter_positive(self.db()).collect() | ||
| }; | ||
| for pos in non_empty_intersection_positive { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not the biggest fan of this implementation at the moment, but previously this function could return object, which seems wrong. This would happen when positive was empty. If anyone could advise on a better way to implement this that would be great
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should normalize our representation of negation-only types. This is probably necessary in order to get is_equivalent_to correct, also.
I think we should ensure that intersection.positive(db) (or a method we add that we use instead of accessing that directly) always returns at least object, and never empty; that will give the correct behavior in cases like this one.
Then the remaining question is whether we normalize to that by always ensuring in IntersectionBuilder that we add object to the positive elements of every union with otherwise empty positive elements, or whether we normalize to "empty positive elements", make positive private, and add a public method that returns an iterator over just object if positive elements is empty.
The former is simpler, but perhaps more costly. Not sure if negation types will be common enough that it matters, but we do generate them commonly in narrowing. I would probably go for the former first and see if it causes a detectable regression.
It might make sense to do this as a separate PR (with some tests for equivalence of such intersections), and then rebase this PR after landing that one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we're looking at this method, its implementation seems kind of wasteful, in that we might end up calling infer_binary_type_comparison twice for every positive member of the intersection. Seems like we could collapse the two loops over positive elements into one loop, with early short-circuit if possible? But that probably doesn't belong in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay sounds good, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make positive private, and add a public method that returns an iterator over just object if positive elements is empty.
Lots of pre-existing code calls .iter() or .len() on positive and negative.
#[salsa::interned(debug)]
pub struct IntersectionType<'db> {
/// The intersection type includes only values in all of these types.
#[return_ref]
_positive: FxOrderSet<Type<'db>>,
/// The intersection type does not include any value in any of these types.
///
/// Negation types aren't expressible in annotations, and are most likely to arise from type
/// narrowing along with intersections (e.g. `if not isinstance(...)`), so we represent them
/// directly in intersections rather than as a separate type.
#[return_ref]
_negative: FxOrderSet<Type<'db>>,
}
impl<'db> IntersectionType<'db> {
pub fn positive(&self, db: &'db dyn Db) -> Box<dyn Iterator<Item = Type<'db>> + 'db> {
if self._positive(db).is_empty() {
Box::new(std::iter::once(Type::object(db)))
} else {
Box::new(self._positive(db).iter().copied())
}
}
pub fn negative(&self, db: &'db dyn Db) -> Box<dyn Iterator<Item = Type<'db>> + 'db> {
Box::new(self._negative(db).iter().copied())
}
pub fn positive_len(&self, db: &'db dyn Db) -> usize {
self._positive(db).len()
}
pub fn negative_len(&self, db: &'db dyn Db) -> usize {
self._negative(db).len()
}
}Do you think this is a good change as it will require lots of other updates to the codebase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because
objectis so predictable in many cases (e.g. in subtyping), it's likely that in many/most cases we don't have to actually loadobjectat all in order to provide the right behavior.
yeah, I've wondered in the past if we should have a Type::Object variant to avoid having so many typeshed lookups from loading the object type. object is such a special type that it feels like it would be pretty defensible to do that. But the refactor would probably be tricky to pull off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we expect this test to pass, or should these types be bool?
def static_truthiness(not_one: Not[Literal[1]], not_int: Not[int]) -> None:
...
reveal_type(isinstance(not_int, int)) # revealed: Literal[False]
reveal_type(not isinstance(not_int, int)) # revealed: Literal[True]There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think those tests are correct as you've written them. The type Not[int] excludes every object that is an instance of int.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then i'll have a look into why they're failing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can probably add this to a new PR
carljm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for looking into this! This stuff is so subtle...
| let non_empty_intersection_positive = if intersection.positive(self.db()).is_empty() { | ||
| vec![Type::object(self.db())] | ||
| } else { | ||
| intersection.iter_positive(self.db()).collect() | ||
| }; | ||
| for pos in non_empty_intersection_positive { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should normalize our representation of negation-only types. This is probably necessary in order to get is_equivalent_to correct, also.
I think we should ensure that intersection.positive(db) (or a method we add that we use instead of accessing that directly) always returns at least object, and never empty; that will give the correct behavior in cases like this one.
Then the remaining question is whether we normalize to that by always ensuring in IntersectionBuilder that we add object to the positive elements of every union with otherwise empty positive elements, or whether we normalize to "empty positive elements", make positive private, and add a public method that returns an iterator over just object if positive elements is empty.
The former is simpler, but perhaps more costly. Not sure if negation types will be common enough that it matters, but we do generate them commonly in narrowing. I would probably go for the former first and see if it causes a detectable regression.
It might make sense to do this as a separate PR (with some tests for equivalence of such intersections), and then rebase this PR after landing that one?
crates/red_knot_python_semantic/resources/mdtest/comparison/integers.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/comparison/integers.md
Outdated
Show resolved
Hide resolved
| (ast::CmpOp::Eq, Some(Type::BooleanLiteral(true))) => { | ||
| return Ok(Type::BooleanLiteral(false)); | ||
| } | ||
| (ast::CmpOp::NotEq, Some(Type::BooleanLiteral(false))) => { | ||
| return Ok(Type::BooleanLiteral(true)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think this removal is correct. I guess to address the TODO comments added above (where we intersect with LiteralString, we would need to introduce some Type method which would be true for LiteralString, which would mean "inhabitants of this type can never be equal to any object that doesn't inhabit this type, and all proper subtypes are single-valued." And then here, we could apply these removed cases only if all positive members of the intersection are such a type.
But I think maybe LiteralString is the only such type we need to care about, so a Type method might be overkill, we could also just special-case LiteralString here for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I think maybe
LiteralStringis the only such type we need to care about, so aTypemethod might be overkill, we could also just special-caseLiteralStringhere for now.
No, I don't think so. bool can be safely narrowed to Literal[True] using an == True comparison. And in due course we'll want to be able to narrow <instance of enum class> to <Literal member of that enum> using equality as well (assuming all enum members have unique values, and the the enum class doesn't have a custom __eq__ method -- either case would mean such narrowing would not be sound).
| let non_empty_intersection_positive = if intersection.positive(self.db()).is_empty() { | ||
| vec![Type::object(self.db())] | ||
| } else { | ||
| intersection.iter_positive(self.db()).collect() | ||
| }; | ||
| for pos in non_empty_intersection_positive { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we're looking at this method, its implementation seems kind of wasteful, in that we might end up calling infer_binary_type_comparison twice for every positive member of the intersection. Seems like we could collapse the two loops over positive elements into one loop, with early short-circuit if possible? But that probably doesn't belong in this PR.
| # these are both boolean-literal types, | ||
| # since all possible runtime objects that are created by the literal syntax `1` | ||
| # are members of the type `Literal[1]` | ||
| reveal_type(not_one is not 1) # revealed: bool | ||
| reveal_type(not_one is 1) # revealed: bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment doesn't seem to reflect the code immediately below it anymore. The comment says "these are both boolean-literal types", but the assertion reveals bool, not Literal[True] or Literal[False]?
bool isn't incorrect here, but it would be great if we could fix this comment to something like
| # these are both boolean-literal types, | |
| # since all possible runtime objects that are created by the literal syntax `1` | |
| # are members of the type `Literal[1]` | |
| reveal_type(not_one is not 1) # revealed: bool | |
| reveal_type(not_one is 1) # revealed: bool | |
| # TODO: `bool` is not incorrect, but these would ideally be `Literal[True]` and `Literal[False]` | |
| # respectively, since all possible runtime objects that are created by the literal syntax `1` | |
| # are members of the type `Literal[1]` | |
| reveal_type(not_one is not 1) # revealed: bool | |
| reveal_type(not_one is 1) # revealed: bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| (ast::CmpOp::Eq, Some(Type::BooleanLiteral(true))) => { | ||
| return Ok(Type::BooleanLiteral(false)); | ||
| } | ||
| (ast::CmpOp::NotEq, Some(Type::BooleanLiteral(false))) => { | ||
| return Ok(Type::BooleanLiteral(true)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I think maybe
LiteralStringis the only such type we need to care about, so aTypemethod might be overkill, we could also just special-caseLiteralStringhere for now.
No, I don't think so. bool can be safely narrowed to Literal[True] using an == True comparison. And in due course we'll want to be able to narrow <instance of enum class> to <Literal member of that enum> using equality as well (assuming all enum members have unique values, and the the enum class doesn't have a custom __eq__ method -- either case would mean such narrowing would not be sound).
* main: (44 commits) [`airflow`] Extend `AIR311` rules (#17422) [red-knot] simplify union size limit handling (#17429) [`airflow`] Extract `AIR311` from `AIR301` rules (`AIR301`, `AIR311`) (#17310) [red-knot] set a size limit on unions of literals (#17419) [red-knot] make large-union benchmark slow again (#17418) [red-knot] optimize building large unions of literals (#17403) [red-knot] Fix comments in type_api.md (#17425) [red-knot] Do not assume that `x != 0` if `x` inhabits `~Literal[0]` (#17370) [red-knot] make large-union benchmark more challenging (#17416) [red-knot] Acknowledge that `T & anything` is assignable to `T` (#17413) Update Rust crate clap to v4.5.36 (#17381) Raise syntax error when `\` is at end of file (#17409) [red-knot] Add regression tests for narrowing constraints cycles (#17408) [red-knot] Add some knowledge of `__all__` to `*`-import machinery (#17373) Update taiki-e/install-action digest to be7c31b (#17379) Update Rust crate mimalloc to v0.1.46 (#17382) Update PyO3/maturin-action action to v1.49.1 (#17384) Update Rust crate anyhow to v1.0.98 (#17380) dependencies: switch from `chrono` to `jiff` Update Rust crate bstr to v1.12.0 (#17385) ...
Summary
Fixes incorrect negated type eq and ne assertions in infer_binary_intersection_type_comparison
fixes #17360
Test Plan
Remove and update some now incorrect tests