
[red-knot] Consider all definitions after terminal statements unreachable #15676

Merged
dcreager merged 37 commits into main from dcreager/terminal-visibility
Jan 29, 2025

Conversation

dcreager
Member

@dcreager dcreager commented Jan 22, 2025

FlowSnapshot now tracks a reachable bool, which indicates whether we have encountered a terminal statement on that control flow path. When merging flow states together, we skip any that have been marked unreachable. This ensures that bindings that can only be reached through unreachable paths are not considered visible.
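As a rough illustration of the description above, here is a toy Python model of a flow state with a `reachable` flag. It is only a sketch: the real implementation is Rust, and the `bind` and `copy` helpers and the dict-of-sets representation are assumptions made for this example, not the PR's actual data structures.

```python
from dataclasses import dataclass, field


@dataclass
class FlowSnapshot:
    """Toy stand-in for a flow state: symbol name -> set of visible bindings."""

    bindings: dict[str, set[str]] = field(default_factory=dict)
    reachable: bool = True

    def bind(self, name: str, value: str) -> None:
        # A new binding shadows every earlier binding on this path.
        self.bindings[name] = {value}

    def mark_unreachable(self) -> None:
        # Called when a terminal statement (return, raise, break, continue) is seen.
        self.reachable = False

    def copy(self) -> "FlowSnapshot":
        return FlowSnapshot({k: set(v) for k, v in self.bindings.items()}, self.reachable)

    def merge(self, other: "FlowSnapshot") -> None:
        # An unreachable path contributes nothing to the merged state.
        if not other.reachable:
            return
        # If only the current path is unreachable, the other path replaces it.
        if not self.reachable:
            self.bindings = {k: set(v) for k, v in other.bindings.items()}
            self.reachable = True
            return
        # Both reachable: a symbol may hold any binding from either path.
        for name in self.bindings.keys() | other.bindings.keys():
            self.bindings[name] = self.bindings.get(name, set()) | other.bindings.get(name, set())


# Models:  x = "a"; if cond: x = "b"; return   (x is then read after the `if`)
before_if = FlowSnapshot()
before_if.bind("x", "a")

if_branch = before_if.copy()
if_branch.bind("x", "b")
if_branch.mark_unreachable()  # the `return`

after_if = before_if.copy()  # the implicit "else" path
after_if.merge(if_branch)
visible_x = after_if.bindings["x"]
```

In this model `visible_x` contains only `"a"`, because the `"b"` binding can only be reached through a path that ended in a terminal statement.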

Test Plan

The new mdtests failed (with incorrect reveal_type results, and spurious possibly-unresolved-reference errors) before adding the new visibility constraints.

@dcreager dcreager added the red-knot Multi-file analysis & type inference label Jan 22, 2025
Contributor

github-actions bot commented Jan 22, 2025

ruff-ecosystem results

Linter (stable)

ℹ️ ecosystem check detected linter changes. (+5 -5 violations, +0 -0 fixes in 2 projects; 53 projects unchanged)

bokeh/bokeh (+4 -4 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --no-preview --select ALL

+ examples/server/api/flask_embed.py:26:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block
- examples/server/api/flask_embed.py:26:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block
+ examples/server/api/flask_gunicorn_embed.py:41:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block
- examples/server/api/flask_gunicorn_embed.py:41:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block
+ examples/server/api/standalone_embed.py:18:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block
- examples/server/api/standalone_embed.py:18:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block
+ examples/server/api/tornado_embed.py:29:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block
- examples/server/api/tornado_embed.py:29:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block

zulip/zulip (+1 -1 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --no-preview --select ALL

+ scripts/lib/sharding.py:65:21: SIM108 Use ternary operator `host = shard if "." in shard else f"{shard}.{external_host}"` instead of `if`-`else`-block
- scripts/lib/sharding.py:65:21: SIM108 Use ternary operator `host = shard if "." in shard else f'{shard}.{external_host}'` instead of `if`-`else`-block

Changes by rule (1 rules affected)

| code | total | + violation | - violation | + fix | - fix |
| --- | --- | --- | --- | --- | --- |
| SIM108 | 10 | 5 | 5 | 0 | 0 |

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+5 -5 violations, +0 -0 fixes in 2 projects; 53 projects unchanged)

bokeh/bokeh (+4 -4 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL

+ examples/server/api/flask_embed.py:26:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block
- examples/server/api/flask_embed.py:26:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block
+ examples/server/api/flask_gunicorn_embed.py:41:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block
- examples/server/api/flask_gunicorn_embed.py:41:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block
+ examples/server/api/standalone_embed.py:18:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block
- examples/server/api/standalone_embed.py:18:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block
+ examples/server/api/tornado_embed.py:29:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block
- examples/server/api/tornado_embed.py:29:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block

zulip/zulip (+1 -1 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL

+ scripts/lib/sharding.py:65:21: SIM108 Use ternary operator `host = shard if "." in shard else f"{shard}.{external_host}"` instead of `if`-`else`-block
- scripts/lib/sharding.py:65:21: SIM108 Use ternary operator `host = shard if "." in shard else f'{shard}.{external_host}'` instead of `if`-`else`-block

Changes by rule (1 rules affected)

| code | total | + violation | - violation | + fix | - fix |
| --- | --- | --- | --- | --- | --- |
| SIM108 | 10 | 5 | 5 | 0 | 0 |

@dcreager
Member Author

There are a couple of new diagnostics in the benchmark that don't look correct to me. I need to see if I can minimize that into an mdtest to diagnose.

@dcreager dcreager marked this pull request as draft January 22, 2025 22:07
Contributor

@carljm carljm left a comment

Fantastic!! Love to see a feature that is easier than anticipated :)

* main:
  [red-knot] MDTests: Do not depend on precise public-symbol type inference (#15691)
  [red-knot] Make `infer.rs` unit tests independent of public symbol inference (#15690)
  Tidy knot CLI tests (#15685)
  [red-knot] Port comprehension tests to Markdown (#15688)
  Create Unknown rule diagnostics with a source range (#15648)
  [red-knot] Port 'deferred annotations' unit tests to Markdown (#15686)
  [red-knot] Support custom typeshed Markdown tests (#15683)
  Don't run the linter ecosystem check on PRs that only touch red-knot crates (#15687)
  Add `rules` table to configuration (#15645)
  [red-knot] Make `Diagnostic::file` optional (#15640)
  [red-knot] Add test for nested attribute access (#15684)
  [red-knot] Anchor relative paths in configurations (#15634)
  [`pyupgrade`] Handle multiple base classes for PEP 695 generics (`UP046`) (#15659)
  [`pyflakes`] Treat arguments passed to the `default=` parameter of `TypeVar` as type expressions (`F821`) (#15679)
  Upgrade zizmor to the latest version in CI (#15649)
  [`pyupgrade`] Add rules to use PEP 695 generics in classes and functions (`UP046`, `UP047`) (#15565)
  [red-knot] Ensure a gradual type can always be assigned to itself (#15675)
@dylwil3
Collaborator

dylwil3 commented Jan 23, 2025

I don't think you need to change anything for this PR, but just so it's on your radar: try/finally knows how to ruin any clean story. For example, the following test fails on this branch:

def f():
    x = 1
    while True:
        try:
            break
        finally:
            x = 2
    reveal_type(x)  # revealed: Literal[2] 

(it gives a revealed type of Literal[1] instead)
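For reference, the runtime behavior that the expected `Literal[2]` reflects can be checked directly. This is a small sketch that replaces `reveal_type(x)` with a `return` so the value can be observed; the name `result` is introduced only for this example.

```python
def f():
    x = 1
    while True:
        try:
            break
        finally:
            # The finally block runs on the way out of the loop,
            # even though `break` is a terminal statement.
            x = 2
    return x


result = f()
```

At runtime `result` is `2`, so the binding made in `finally` is the one visible after the loop.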

@carljm
Contributor

carljm commented Jan 23, 2025

Yeah, we can handle finally in this PR or as a separate follow up PR, but it probably will need some special handling.

dcreager and others added 9 commits January 23, 2025 21:14
Co-authored-by: Carl Meyer <carl@astral.sh>
* main:
  Add `check` command (#15692)
  [red-knot] Use itertools to clean up `SymbolState::merge` (#15702)
  [red-knot] Add `--ignore`, `--warn`, and `--error` CLI arguments (#15689)
  Use `uv init --lib` in tutorial (#15718)
  [red-knot] Use `Unknown | T_inferred` for undeclared public symbols (#15674)
  [`ruff`] Parenthesize fix when argument spans multiple lines for `unnecessary-round` (`RUF057`) (#15703)
  [red-knot] Rename `TestDbBuilder::typeshed` to `.custom_typeshed` (#15712)
  Honor banned top level imports by TID253 in PLC0415.  (#15628)
  Apply `AIR302`-context check only in `@task` function (#15711)
  [`airflow`] Update `AIR302` to check for deprecated context keys (#15144)
  Remove test rules from JSON schema (#15627)
  Add two missing commits to changelog (#15701)
  Fix grep for version number in docker build (#15699)
  Bump version to 0.9.3 (#15698)
  Preserve raw string prefix and escapes (#15694)
  [`flake8-pytest-style`] Rewrite references to `.exception` (`PT027`) (#15680)
Member Author

@dcreager dcreager left a comment

> For example, the following test fails on this branch:

Thanks @dylwil3! I added that as a failing test case. I'm going to poke at it briefly to see if it's easy to add for this PR.

* main:
  Run `cargo update` (#15769)
  [red-knot] Document public symbol type inferece (#15766)
  Update dawidd6/action-download-artifact action to v8 (#15760)
  Update NPM Development dependencies (#15758)
  Update pre-commit dependencies (#15756)
  Update dependency ruff to v0.9.3 (#15755)
  Update dependency mdformat-mkdocs to v4.1.2 (#15754)
  Update Rust crate uuid to v1.12.1 (#15753)
  Update Rust crate unicode-ident to v1.0.15 (#15752)
  Fix docstring in ruff_annotate_snippets (#15748)
  Update Rust crate insta to v1.42.1 (#15751)
  Update Rust crate clap to v4.5.27 (#15750)
  Add references to `trio.run_process` and `anyio.run_process` (#15761)
  [`ruff`] Do not emit diagnostic when all arguments to `zip()` are variadic (`RUF058`) (#15744)
  [red-knot] Ensure differently ordered unions are considered equivalent when they appear inside tuples inside top-level intersections (#15743)
  [red-knot] Ensure differently ordered unions and intersections are understood as equivalent even inside arbitrarily nested tuples (#15740)
  [red-knot] Promote the `all_type_pairs_are_assignable_to_their_union` property test to stable (#15739)
  [`pylint`] Do not trigger `PLR6201` on empty collections (#15732)
  Improve the file watching failure error message (#15728)
  Speed symbol state merging back up (#15731)
@dcreager dcreager marked this pull request as ready for review January 27, 2025 21:25
@carljm
Contributor

carljm commented Jan 27, 2025

We can use the new statically known branches feature to address

Nit: update the PR description to match how the PR currently works.

Comment on lines 210 to 216
## Early returns and list comprehensions

```py
def f(x: str) -> int:
    y = [x for i in range(len(x))]
    return 4
```
Contributor

A return at the end of the function isn't really an "early return", contrary to the test title.

Is this test testing some code that was added in this PR? It doesn't clearly seem to test anything about terminality of return.

This test intersects with two known-incorrect areas (closed-over vars in scopes with return statements, modeling of eagerly-executing nested scopes), has no reveal_type (so asserts nothing more than "no diagnostics here") and doesn't demonstrate any TODOs. This makes me question its value as a test.

Member Author

This is a minimal reproduction of an error I was getting in the tomllib benchmark test. I thought to put it here to catch it earlier in the CI process, but since it's redundant with the tomllib test I'll remove it. (Maybe a better way to handle this is to move the assertions out of the benchmark and into a new test case that also analyzes tomllib? That way the benchmark is only concerned with performance, and the test with correctness.)

Contributor

I think it's fine (even good) to take cases that we find from tomllib (or any other testing on real code), distill them down to their simplest form that illustrates a potential regression, and include them as mdtests. So that's not an issue. I think my question here really is trying to understand what the regression was (what did we do wrong in this example in some earlier version of this PR?) and clarify what behavior the test is trying to demonstrate (maybe just with some prose, maybe by adding a reveal_type, maybe both).

Member Author

When I was using visibility constraints, this was a regression because:

  • the return statement would mark the x parameter as non-visible for the remainder of the flow;
  • list comprehensions would resolve free variables as of the end of the containing scope,
  • which is technically after the return statement, and so the body of the list comprehension wouldn't see the x formal parameter as a visible definition.

It's the lack of an unresolved-reference error that shows that the regression isn't there anymore.

Talking through it in detail like this, I think this is superfluous with the "Early returns and nested functions" tests, because it was the "closed-over vars in scopes with return statements" part that was relevant, and the "modeling of eagerly-executing nested scopes" was a red herring.
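To make the two scope kinds concrete, here is a small runnable sketch of the runtime behavior. The functions `g` and `inner` and the variables `count` and `late` are hypothetical names added for illustration; only the shape of `f` comes from the test under discussion.

```python
def f(x: str) -> int:
    # The comprehension body executes right here, eagerly, while `x` is bound
    # and before the `return` statement is reached.
    y = [x for i in range(len(x))]
    return len(y)


def g(x: str):
    # A nested function is a lazily executed scope: it can run after `g` has
    # already returned, and it still sees the parameter `x`.
    def inner() -> str:
        return x

    return inner


count = f("abc")
late = g("abc")()
```

In both cases the closed-over parameter is resolvable, which is why a `return` in the enclosing scope should not hide it from the nested scope.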

Comment on lines +698 to +707
// Unreachable snapshots should not be merged: If the current snapshot is unreachable, it
// should be completely overwritten by the snapshot we're merging in. If the other snapshot
// is unreachable, we should return without merging.
if !snapshot.reachable {
    return;
}
if !self.reachable {
    self.restore(snapshot);
    return;
}
Contributor

There is also the possibility that neither self nor snapshot is reachable. This code will correctly result in self still being marked unreachable in that case, but it seems a little odd that we keep the visible definitions state from self in that case. The logical extension of the idea that an unreachable state takes no part in a merge should be that in case neither are reachable, we reset self to a state with reachable: false and no visible definitions, right?

Not sure if it practically makes a difference, though; since the new state is still unreachable its visible definitions shouldn't "go" anywhere anyway, even if self is later merged into another state. I guess it will make a difference to the types we reveal in the following unreachable code?

Member Author

> but it seems a little odd that we keep the visible definitions state from self in that case

If we want this, I think it would be best to add an invariant that marking a flow as unreachable clears out all of its definitions, and update mark_unreachable and restore to maintain that invariant. Then this code in merge would do the right thing as you describe.

> I guess it will make a difference to the types we reveal in the following unreachable code?

Yes, that's exactly right. (In the sense that that's what the code does, not necessarily that that's what we want it to do 😅) For now, I was punting on this, because this PR isn't currently addressing what we want to do for unreachable code. (Note that in the mdtests I've tried to not put in any reveal_types in unreachable positions.)

I think there are a couple of issues at play here. One is that, not even considering the merge, what do we want to report in the unreachable code within the same block after a terminal statement?

x = 2
return
reveal_type(x)  # ???

Should it be an unresolved-reference error? Or should it act as if the terminal statement weren't there, and show what x would be if control flow could somehow make it to that point? Or should we silence all diagnostics completely in unreachable code?

Whatever we choose, we should have the same result for

if cond:
    x = 2
    return
else:
    x = 3
    return
reveal_type(x)  # ???

If we decide that we want the first case to reveal Literal[2], then we'd want this case to reveal Literal[2, 3] — which means that we actually do want to merge all of the flows, even if they're unreachable, and it's just the visibility of the relevant symbols that needs to be tracked/adjusted somehow.
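The difference between the two options can be sketched with a toy model in which a flow state is a pair of (bindings for `x`, reachable). Both function names are hypothetical, and which branch plays the role of `self` in the real merge is an assumption here.

```python
def merge_as_in_pr(current, incoming):
    """Mirrors the merge logic quoted in this thread."""
    cur_bindings, cur_reachable = current
    inc_bindings, inc_reachable = incoming
    if not inc_reachable:
        # The incoming state is skipped, even if the current one is unreachable too.
        return current
    if not cur_reachable:
        # The current state is overwritten by the incoming one.
        return incoming
    return (cur_bindings | inc_bindings, True)


def merge_everything(current, incoming):
    """Alternative: always union the bindings; reachable if either path is."""
    return (current[0] | incoming[0], current[1] or incoming[1])


# if cond: x = 2; return   /   else: x = 3; return
if_branch = (frozenset({2}), False)
else_branch = (frozenset({3}), False)

pr_result = merge_as_in_pr(if_branch, else_branch)
alt_result = merge_everything(if_branch, else_branch)
```

Under the first policy the unreachable code after the `if` sees only the bindings of whichever state was current (`{2}` in this sketch), while the second policy yields `{2, 3}` and leaves the state marked unreachable.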

Contributor

Yes, that all makes sense, thanks for clarifying!

I think we should tackle this separately as a later problem; maybe file an issue for it? I don't think it's urgent, and I'm happy with doing the "least work" for now, even if it's less consistent (that is, neither eagerly clearing definitions when a branch becomes unreachable, nor merging definitions from two unreachable branches just so we can have fully consistent types in unreachable code).

Whatever we do for unreachable code should look consistent whether the origin of that unreachability is in terminals or in statically-known branches (that is, code under an if False should behave similarly to code after a return), which may place some additional constraints on how we handle it.

Contributor

Top-of-head thoughts on what behaviors we do/don't want (not for action now, just for consideration in writing up the issue):

  • I definitely don't think it would be useful to issue undefined-reference diagnostics for names used in unreachable code that would have been defined were the branch reachable.
  • In some sense I think Never is the "right" type for all expressions in unreachable code?
  • But I suspect that the most useful behavior is to check the unreachable code (and still raise diagnostics in it as normal), as if it were reachable.
  • We should look into mypy and pyright behavior here.

Member

Mypy has terrible UX if it sees a reveal_type in a block of code it considers unreachable: it just doesn't emit any diagnostic for the reveal_type call at all. This has been the source of many bug reports at mypy over the years, because the rule that tells you off for having unreachable code in the first place is disabled by default, even if you opt into mypy's --strict flag. We shouldn't do what mypy does!

@sharkdp
Contributor

sharkdp commented Jan 28, 2025

I haven't followed the full conversation on this topic, so maybe I have missed this being discussed somewhere. I'm also not sure if this is out-of-scope for this PR or out-of-scope in general, but I was curious how the interplay between statically-known branches and terminal statements worked, and it looks like this is something that this approach does not handle (yet)?

def _(cond: bool):
    x = "a"
    if cond:
        x = "b"
        if True:
            return

    reveal_type(x)  # revealed: "a", "b"; should be "a"

I understand that this is probably difficult to handle with our "delayed" handling of statically-known branches, but it seems worth mentioning as a limitation, because pyright can handle situations like this.
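The claim that only `"a"` can reach the `reveal_type` call can be checked at runtime with a small sketch. The function name `observe` and the set `values` are introduced for this example, and the final `return x` stands in for `reveal_type(x)`.

```python
def observe(cond: bool):
    x = "a"
    if cond:
        x = "b"
        if True:
            # Every path through `if cond:` ends here.
            return None

    # Stands in for reveal_type(x): record what x is at this point.
    return x


# Collect the values of x that actually reach the end of the function.
values = {observe(c) for c in (False, True)} - {None}
```

Only `"a"` survives, since the `"b"` binding is always followed by the `return` inside the statically true branch.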

Member Author

@dcreager dcreager left a comment

> but I was curious how the interplay between statically-known branches and terminal statements worked

That's a good example @sharkdp! Before I was also adding a ~AlwaysTrue visibility constraint when we encountered a terminal statement, which (edit: I think) would handle your example. I removed it because it seemed to be interacting incorrectly with continue and break. (The new visibility constraint should apply to the rest of the current flow, but should not apply when we jump back to the beginning of the loop.) But @carljm and I convinced each other in Discord that the issue with continue is that we haven't implemented the jump back to top-of-loop yet (pending fixpoint support in salsa) — and I think that would solve the visibility constraint issue too...
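The point about `continue` can be illustrated with a runnable sketch of the runtime behavior. The function `observe` and the string values are hypothetical and chosen only to show which bindings reach the top of the loop.

```python
def observe():
    seen = []
    x = "before"
    for _ in range(2):
        # On the second iteration this sees the binding made just before
        # `continue`, because control jumped back to the top of the loop.
        seen.append(x)
        x = "loop"
        continue
        x = "unreachable"  # never executed: it follows a terminal statement
    return seen, x


seen, final_x = observe()
```

The binding `x = "loop"` is invisible to the rest of the loop body after `continue`, but it must stay visible at the loop header and after the loop, which is the jump that the comment says is not modeled yet.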

Contributor

@carljm carljm left a comment

Some comments on the new tests, but the behavior looks good for this PR!

reveal_type(x) # revealed: Literal["else"]
reveal_type(x) # revealed: Literal["else"]
except ValueError:
# TODO: Literal["raise"]
Contributor

I don't think this TODO is accurate, since reveal_type is a call, and I don't think we'd special-case it to assume it can't raise? So at the very least "else" is a possible value here.

I think it's accurate to say that "before" is not possible here, but only if we understand that boolean-testing a value of type bool is a special case that doesn't execute a __bool__ method.

Member Author

> I don't think this TODO is accurate, since reveal_type is a call, and I don't think we'd special-case it to assume it can't raise? So at the very least "else" is a possible value here.

Ah, I was actually thinking we would try to do that somehow! But per above, that deserves discussion, so I'll back out the assumption that we'd try to do that.

> I think it's accurate to say that "before" is not possible here, but only if we understand that boolean-testing a value of type bool is a special case that doesn't execute a __bool__ method.

I removed the TODO entirely, leaving "before" as a potential possibility here too, so that we're not making any assumptions about how we might make exception tracking less approximate.

Comment on lines 381 to 382
TODO: We are not currently implementing the "jump" behavior correctly for `raise` statements. The
false positives in this section are because of that, and not our terminal statement support.
Contributor

AFAICT from the TODOs below, it looks like the only problem you're referring to here is our over-approximation of the possible locations where an exception could be raised. Let's describe this a bit more clearly, to save future us from wondering what we meant here. (Also re-wording to avoid making it specific to raise statements, since it's really about too many jumps from places that aren't raise statements at all, and to avoid describing it as incorrect, since technically (given KeyboardInterrupt) the current behavior is correct, just likely not preferable.)

Suggested change
- TODO: We are not currently implementing the "jump" behavior correctly for `raise` statements. The
- false positives in this section are because of that, and not our terminal statement support.
+ Currently we assume that an exception could be raised anywhere within a `try` block; the TODOs below reflect
+ cases where we could implement a more precise understanding of where exceptions (barring `KeyboardInterrupt`
+ and `MemoryError`) can and cannot actually be raised.

Member Author

A couple of points, though the tl;dr is that I like your edit:

  • The part about the jump behavior was less about "exceptions can come from anywhere" and more about "a raise definitely doesn't execute the else clause". The latter should be something we can model regardless of how approximate our exception tracking is. But we're actually giving the correct result below in the else reveal_type, so you're right that this isn't accurately a TODO! That said, I think it's coincidence that we're giving the correct result in the else clause — "raise" isn't included because we're treating raise the same as return, not because we know that raise skips the else clause. (And "raise" is included in the except clauses not because we know the raise statement jumps there — with this PR we think the raise skips everything since it's terminal! — but because our approximation thinks an unrelated exception might occur just after the assignment.)

  • I had written this (and the TODOs below) describing the goal of a less approximate exception tracking strategy. But that deserves discussion about what we'd want that to look like, so I like your suggestion to describe this in terms of what we're currently doing instead.

Contributor

@carljm carljm Jan 29, 2025

Ah, the first bullet point here is something I hadn't fully understood! It sort of seems like the current "right behavior by accident" might suffice until/unless we implement tighter understandings of where exceptions can be raised, at which point we might also need better understanding of what raise actually does. Certainly wouldn't object to adding some text to record this context for future.

Contributor

So I tried to write up an edit describing this, and I kind of ended up concluding that we may never need to implement any special understanding of where raise can jump to? Even if we tighten up our understanding of where exceptions can be raised, it seems like the only thing we'll need to do is maintain two things: 1) understanding raise as terminal, as we do already in this PR, and 2) still understanding raise as "a point where an exception can be raised", as we do in this PR.

(2) seems unlikely to be something we'd miss in that future where we're adding more understanding of exception points, so I'm thinking maybe we don't need to document this any more than it is already.

Member Author

> Even if we tighten up our understanding of where exceptions can be raised, it seems like the only thing we'll need to do is maintain two things: 1) understanding raise as terminal, as we do already in this PR, and 2) still understanding raise as "a point where an exception can be raised", as we do in this PR.

Ah yes, that sounds right!

> so I'm thinking maybe we don't need to document this any more than it is already.

👍

Member Author

> still understanding raise as "a point where an exception can be raised", as we do in this PR.

The only place where this might fall over is that a raise can only raise an exception, whereas a call can but doesn't have to raise one. So calls could jump to except or else, while raise could only jump to except.

No, wait! Calls can jump to except or flow through to the next statement, and the end of the try block flows to else. So yes, you're right, raise being terminal within the try block would correctly encode that it can't "jump" to else. (Nothing actually "jumps" there, in fact.)
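The control flow described here can be observed at runtime with a short sketch. The function `run`, its `action` parameter, and the trace strings are hypothetical names used only for this illustration.

```python
def run(action: str):
    trace = []
    try:
        trace.append("try")
        if action == "raise":
            # Terminal within the try block: nothing after it in the block runs.
            raise ValueError("boom")
        # A call may raise (control goes to `except`) or fall through
        # to the next statement.
        int(action)
        trace.append("end of try")
    except ValueError:
        trace.append("except")
    else:
        # Only entered when control flows off the end of the try block.
        trace.append("else")
    return trace


raised = run("raise")
call_raised = run("not a number")
completed = run("42")
```

Neither the explicit `raise` nor the failing call reaches `else`; only the run that flows off the end of the `try` block does, which matches treating `raise` as terminal inside the block.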

dcreager and others added 4 commits January 29, 2025 09:32
Co-authored-by: Carl Meyer <carl@astral.sh>
* main:
  [red-knot] Extend instance-attribute tests (#15808)
  Fix formatter warning message for `flake8-quotes` option (#15788)
  [`flake8-bugbear`] Exempt `NewType` calls where the original type is immutable (`B008`) (#15765)
  Add missing config docstrings (#15803)
  [`refurb`] Do not emit diagnostic when loop variables are used outside loop body (`FURB122`) (#15757)
  [`ruff`] Check for shadowed `map` before suggesting fix (`RUF058`) (#15790)
  [red-knot] Do not use explicit `knot_extensions.Unknown` declaration (#15787)
  Preserve quotes in generated byte strings (#15778)
  [minor] Simplify some `ExprStringLiteral` creation logic (#15775)
  Preserve quote style in generated code (#15726)
  Rename internal helper functions (#15771)
  [`airflow`] Extend airflow context parameter check for `BaseOperator.execute` (`AIR302`) (#15713)
  Implement tab autocomplete for `ruff config` (#15603)
Contributor

@carljm carljm left a comment

One minor edit, but looks land-ready to me!

…ements.md

Co-authored-by: Carl Meyer <carl@astral.sh>
@dcreager dcreager merged commit 15d886a into main Jan 29, 2025
21 checks passed
@dcreager dcreager deleted the dcreager/terminal-visibility branch January 29, 2025 19:06
6 participants