[red-knot] Do not assume that `x != 0` if `x` inhabits `~Literal[0]` #17370

MatthewMckee4 · 2025-04-12T23:21:24Z

Summary

Fixes incorrect negated type eq and ne assertions in infer_binary_intersection_type_comparison

fixes #17360

Test Plan

Remove and update some now incorrect tests

github-actions · 2025-04-12T23:23:30Z

`mypy_primer` results

No ecosystem changes detected ✅

AlexWaygood

Thank you!

crates/red_knot_python_semantic/resources/mdtest/type_api.md

AlexWaygood · 2025-04-13T14:51:34Z

crates/red_knot_python_semantic/resources/mdtest/comparison/intersections.md

+    reveal_type(x == "abc")  # revealed: bool
+    reveal_type("abc" == x)  # revealed: bool


ah, this is a bit unfortunate. If x has type ~Literal[0], then you can't say for sure that it won't compare equal to 0, and you can't even say for sure that it won't compare equal to 0 if it has type int & ~Literal[0]: e.g. the runtime object False inhabits the type int & ~Literal[0], but it compares equal to 0.

The same goes if x has type ~Literal["abc"] or str & ~Literal["abc"]: you can't say anything in particular about whether x will compare equal to "abc" or not at runtime, since x could be an instance of a str subclass and still compare equal to "abc". But if x has type LiteralString & ~Literal["abc"], we can say for sure that it won't compare equal to "abc", since for x to inhabit the type LiteralString we know that it must have exactly str as its __class__ (it can't be an instance of a str subclass).

Ideally we'd find a way of fixing this issue that didn't regress on our ability to narrow from a LiteralString into a string-literal type. I suppose this may involve some special-casing in infer_binary_intersection_type_comparison.

Having said that, it feels like achieving this might be fairly complex, and our current inference is quite incorrect in some obvious cases. I'd be okay with going with what you have now if you added some # TODO comments above these reveal_type calls saying that ideally we'd be able to narrow the LiteralString type to string-literal types using == and !=. The same goes for the assertions on lines 58-59

Got it. Am i right in saying we should be able to determine these (as Literal[False]) too?

reveal_type(x == "something else") # revealed: bool reveal_type("something else" == x) # revealed: bool

MatthewMckee4 · 2025-04-14T19:23:51Z

@AlexWaygood currently reveal_type(not_one is not 1) revealed bool, with my changes it now returns Literal[True]

MatthewMckee4 · 2025-04-14T19:25:23Z

crates/red_knot_python_semantic/src/types/infer.rs

+        let non_empty_intersection_positive = if intersection.positive(self.db()).is_empty() {
+            vec![Type::object(self.db())]
+        } else {
+            intersection.iter_positive(self.db()).collect()
+        };
+        for pos in non_empty_intersection_positive {


I'm not the biggest fan of this implementation at the moment, but previously this function could return object, which seems wrong. This would happen when positive was empty. If anyone could advise on a better way to implement this that would be great

I think we should normalize our representation of negation-only types. This is probably necessary in order to get is_equivalent_to correct, also.

I think we should ensure that intersection.positive(db) (or a method we add that we use instead of accessing that directly) always returns at least object, and never empty; that will give the correct behavior in cases like this one.

Then the remaining question is whether we normalize to that by always ensuring in IntersectionBuilder that we add object to the positive elements of every union with otherwise empty positive elements, or whether we normalize to "empty positive elements", make positive private, and add a public method that returns an iterator over just object if positive elements is empty.

The former is simpler, but perhaps more costly. Not sure if negation types will be common enough that it matters, but we do generate them commonly in narrowing. I would probably go for the former first and see if it causes a detectable regression.

It might make sense to do this as a separate PR (with some tests for equivalence of such intersections), and then rebase this PR after landing that one?

While we're looking at this method, its implementation seems kind of wasteful, in that we might end up calling infer_binary_type_comparison twice for every positive member of the intersection. Seems like we could collapse the two loops over positive elements into one loop, with early short-circuit if possible? But that probably doesn't belong in this PR.

Okay sounds good, thanks

make positive private, and add a public method that returns an iterator over just object if positive elements is empty.

Lots of pre-existing code calls .iter() or .len() on positive and negative.

#[salsa::interned(debug)] pub struct IntersectionType<'db> { /// The intersection type includes only values in all of these types. #[return_ref] _positive: FxOrderSet<Type<'db>>, /// The intersection type does not include any value in any of these types. /// /// Negation types aren't expressible in annotations, and are most likely to arise from type /// narrowing along with intersections (e.g. `if not isinstance(...)`), so we represent them /// directly in intersections rather than as a separate type. #[return_ref] _negative: FxOrderSet<Type<'db>>, } impl<'db> IntersectionType<'db> { pub fn positive(&self, db: &'db dyn Db) -> Box<dyn Iterator<Item = Type<'db>> + 'db> { if self._positive(db).is_empty() { Box::new(std::iter::once(Type::object(db))) } else { Box::new(self._positive(db).iter().copied()) } } pub fn negative(&self, db: &'db dyn Db) -> Box<dyn Iterator<Item = Type<'db>> + 'db> { Box::new(self._negative(db).iter().copied()) } pub fn positive_len(&self, db: &'db dyn Db) -> usize { self._positive(db).len() } pub fn negative_len(&self, db: &'db dyn Db) -> usize { self._negative(db).len() } }

Do you think this is a good change as it will require lots of other updates to the codebase

Because object is so predictable in many cases (e.g. in subtyping), it's likely that in many/most cases we don't have to actually load object at all in order to provide the right behavior.

yeah, I've wondered in the past if we should have a Type::Object variant to avoid having so many typeshed lookups from loading the object type. object is such a special type that it feels like it would be pretty defensible to do that. But the refactor would probably be tricky to pull off.

Would we expect this test to pass, or should these types be bool?

def static_truthiness(not_one: Not[Literal[1]], not_int: Not[int]) -> None: ... reveal_type(isinstance(not_int, int)) # revealed: Literal[False] reveal_type(not isinstance(not_int, int)) # revealed: Literal[True]

I think those tests are correct as you've written them. The type Not[int] excludes every object that is an instance of int.

Then i'll have a look into why they're failing

I can probably add this to a new PR

carljm

Thank you for looking into this! This stuff is so subtle...

carljm · 2025-04-14T20:25:32Z

crates/red_knot_python_semantic/src/types/infer.rs

+        let non_empty_intersection_positive = if intersection.positive(self.db()).is_empty() {
+            vec![Type::object(self.db())]
+        } else {
+            intersection.iter_positive(self.db()).collect()
+        };
+        for pos in non_empty_intersection_positive {


I think we should normalize our representation of negation-only types. This is probably necessary in order to get is_equivalent_to correct, also.

I think we should ensure that intersection.positive(db) (or a method we add that we use instead of accessing that directly) always returns at least object, and never empty; that will give the correct behavior in cases like this one.

Then the remaining question is whether we normalize to that by always ensuring in IntersectionBuilder that we add object to the positive elements of every union with otherwise empty positive elements, or whether we normalize to "empty positive elements", make positive private, and add a public method that returns an iterator over just object if positive elements is empty.

The former is simpler, but perhaps more costly. Not sure if negation types will be common enough that it matters, but we do generate them commonly in narrowing. I would probably go for the former first and see if it causes a detectable regression.

It might make sense to do this as a separate PR (with some tests for equivalence of such intersections), and then rebase this PR after landing that one?

crates/red_knot_python_semantic/resources/mdtest/comparison/integers.md

crates/red_knot_python_semantic/src/types/infer.rs

crates/red_knot_python_semantic/resources/mdtest/comparison/integers.md

carljm · 2025-04-14T21:11:18Z

crates/red_knot_python_semantic/src/types/infer.rs

-                (ast::CmpOp::Eq, Some(Type::BooleanLiteral(true))) => {
-                    return Ok(Type::BooleanLiteral(false));
-                }
-                (ast::CmpOp::NotEq, Some(Type::BooleanLiteral(false))) => {
-                    return Ok(Type::BooleanLiteral(true));
-                }


Yeah, I think this removal is correct. I guess to address the TODO comments added above (where we intersect with LiteralString, we would need to introduce some Type method which would be true for LiteralString, which would mean "inhabitants of this type can never be equal to any object that doesn't inhabit this type, and all proper subtypes are single-valued." And then here, we could apply these removed cases only if all positive members of the intersection are such a type.

But I think maybe LiteralString is the only such type we need to care about, so a Type method might be overkill, we could also just special-case LiteralString here for now.

But I think maybe LiteralString is the only such type we need to care about, so a Type method might be overkill, we could also just special-case LiteralString here for now.

No, I don't think so. bool can be safely narrowed to Literal[True] using an == True comparison. And in due course we'll want to be able to narrow <instance of enum class> to <Literal member of that enum> using equality as well (assuming all enum members have unique values, and the the enum class doesn't have a custom __eq__ method -- either case would mean such narrowing would not be sound).

carljm · 2025-04-14T21:13:05Z

crates/red_knot_python_semantic/src/types/infer.rs

+        let non_empty_intersection_positive = if intersection.positive(self.db()).is_empty() {
+            vec![Type::object(self.db())]
+        } else {
+            intersection.iter_positive(self.db()).collect()
+        };
+        for pos in non_empty_intersection_positive {


While we're looking at this method, its implementation seems kind of wasteful, in that we might end up calling infer_binary_type_comparison twice for every positive member of the intersection. Seems like we could collapse the two loops over positive elements into one loop, with early short-circuit if possible? But that probably doesn't belong in this PR.

…infer

AlexWaygood · 2025-04-16T10:32:53Z

crates/red_knot_python_semantic/resources/mdtest/type_api.md

+    # these are both boolean-literal types,
+    # since all possible runtime objects that are created by the literal syntax `1`
+    # are members of the type `Literal[1]`
+    reveal_type(not_one is not 1)  # revealed: bool
+    reveal_type(not_one is 1)  # revealed: bool


this comment doesn't seem to reflect the code immediately below it anymore. The comment says "these are both boolean-literal types", but the assertion reveals bool, not Literal[True] or Literal[False]?

bool isn't incorrect here, but it would be great if we could fix this comment to something like

Suggested change

# these are both boolean-literal types,

# since all possible runtime objects that are created by the literal syntax `1`

# are members of the type `Literal[1]`

reveal_type(not_one is not 1) # revealed: bool

reveal_type(not_one is 1) # revealed: bool

# TODO: `bool` is not incorrect, but these would ideally be `Literal[True]` and `Literal[False]`

# respectively, since all possible runtime objects that are created by the literal syntax `1`

# are members of the type `Literal[1]`

reveal_type(not_one is not 1) # revealed: bool

reveal_type(not_one is 1) # revealed: bool

AlexWaygood · 2025-04-16T10:38:13Z

crates/red_knot_python_semantic/src/types/infer.rs

-                (ast::CmpOp::Eq, Some(Type::BooleanLiteral(true))) => {
-                    return Ok(Type::BooleanLiteral(false));
-                }
-                (ast::CmpOp::NotEq, Some(Type::BooleanLiteral(false))) => {
-                    return Ok(Type::BooleanLiteral(true));
-                }


But I think maybe LiteralString is the only such type we need to care about, so a Type method might be overkill, we could also just special-case LiteralString here for now.

No, I don't think so. bool can be safely narrowed to Literal[True] using an == True comparison. And in due course we'll want to be able to narrow <instance of enum class> to <Literal member of that enum> using equality as well (assuming all enum members have unique values, and the the enum class doesn't have a custom __eq__ method -- either case would mean such narrowing would not be sound).

…17370) ## Summary Fixes incorrect negated type eq and ne assertions in infer_binary_intersection_type_comparison fixes #17360 ## Test Plan Remove and update some now incorrect tests

* main: (44 commits) [`airflow`] Extend `AIR311` rules (#17422) [red-knot] simplify union size limit handling (#17429) [`airflow`] Extract `AIR311` from `AIR301` rules (`AIR301`, `AIR311`) (#17310) [red-knot] set a size limit on unions of literals (#17419) [red-knot] make large-union benchmark slow again (#17418) [red-knot] optimize building large unions of literals (#17403) [red-knot] Fix comments in type_api.md (#17425) [red-knot] Do not assume that `x != 0` if `x` inhabits `~Literal[0]` (#17370) [red-knot] make large-union benchmark more challenging (#17416) [red-knot] Acknowledge that `T & anything` is assignable to `T` (#17413) Update Rust crate clap to v4.5.36 (#17381) Raise syntax error when `\` is at end of file (#17409) [red-knot] Add regression tests for narrowing constraints cycles (#17408) [red-knot] Add some knowledge of `__all__` to `*`-import machinery (#17373) Update taiki-e/install-action digest to be7c31b (#17379) Update Rust crate mimalloc to v0.1.46 (#17382) Update PyO3/maturin-action action to v1.49.1 (#17384) Update Rust crate anyhow to v1.0.98 (#17380) dependencies: switch from `chrono` to `jiff` Update Rust crate bstr to v1.12.0 (#17385) ...

Fix eq static assert

5711a33

MatthewMckee4 marked this pull request as ready for review April 12, 2025 23:21

MatthewMckee4 requested review from AlexWaygood, carljm, dcreager and sharkdp as code owners April 12, 2025 23:21

Update more tests

0d06017

MatthewMckee4 changed the title ~~Fix eq static assert~~ [red-knot] Fix eq static assert Apr 13, 2025

AlexWaygood added the ty Multi-file analysis & type inference label Apr 13, 2025

AlexWaygood reviewed Apr 13, 2025

View reviewed changes

AlexWaygood changed the title ~~[red-knot] Fix eq static assert~~ [red-knot] Do not assume that x != 0 if x inhabits ~Literal[0] Apr 13, 2025

MatthewMckee4 added 3 commits April 13, 2025 22:59

Add back test and add TODO for reveal type on a bool expression

19c2d1d

Fix todo message

0b96c20

Ensure infer_binary_intersection_type_comparison always returns bool

eee6d34

MatthewMckee4 commented Apr 14, 2025

View reviewed changes

Merge branch 'main' into fix-eq-static-assert

ee1b632

carljm reviewed Apr 14, 2025

View reviewed changes

MatthewMckee4 added 4 commits April 14, 2025 22:19

Fix incorrect change

221eb2e

Add comment to explain int is

2d390df

Add boolean literal type as default for intersection bool expression …

5575f30

…infer

Revert test

f3d4eeb

MatthewMckee4 requested a review from carljm April 15, 2025 23:30

carljm approved these changes Apr 16, 2025

View reviewed changes

carljm merged commit a2a7b1e into astral-sh:main Apr 16, 2025
23 checks passed

AlexWaygood reviewed Apr 16, 2025

View reviewed changes

MatthewMckee4 mentioned this pull request Apr 16, 2025

[red-knot] Fix comments in type_api.md #17425

Merged

MatthewMckee4 deleted the fix-eq-static-assert branch April 19, 2025 13:59

		reveal_type(x == "abc") # revealed: bool
		reveal_type("abc" == x) # revealed: bool

[red-knot] Do not assume that x != 0 if x inhabits ~Literal[0] #17370

[red-knot] Do not assume that x != 0 if x inhabits ~Literal[0] #17370

Uh oh!

Conversation

MatthewMckee4 commented Apr 12, 2025

Summary

Test Plan

Uh oh!

github-actions bot commented Apr 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

mypy_primer results

Uh oh!

AlexWaygood left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MatthewMckee4 commented Apr 14, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

carljm left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

AlexWaygood Apr 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

AlexWaygood Apr 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

AlexWaygood Apr 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[red-knot] Do not assume that `x != 0` if `x` inhabits `~Literal[0]` #17370

[red-knot] Do not assume that `x != 0` if `x` inhabits `~Literal[0]` #17370

github-actions bot commented Apr 12, 2025 •

edited

Loading

`mypy_primer` results

AlexWaygood Apr 16, 2025 •

edited

Loading

AlexWaygood Apr 16, 2025 •

edited

Loading

AlexWaygood Apr 16, 2025 •

edited

Loading