Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Consider all definitions after terminal statements unreachable #15676

Merged
merged 37 commits into from
Jan 29, 2025
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
Show all changes
37 commits
Select commit Hold shift + click to select a range
e0d2130
Add failing test cases for terminal control flow
dcreager Jan 22, 2025
f45af24
Record unreachable visibility after terminal statements
dcreager Jan 22, 2025
70b681b
More thorough reveal_types
dcreager Jan 22, 2025
81a3164
Actually we do support those now!
dcreager Jan 22, 2025
93b149c
Merge branch 'main' into dcreager/terminal-visibility
dcreager Jan 23, 2025
912ed62
Use happy constant
dcreager Jan 23, 2025
4102c89
Record reachability in semantic index builder
dcreager Jan 24, 2025
60eccc9
Apply suggestions from code review
dcreager Jan 24, 2025
4190b5b
Add (failing) eager subscope example
dcreager Jan 24, 2025
641ccb8
Also skip merging when lhs is unreachable
dcreager Jan 24, 2025
efdf6c3
We support this one too
dcreager Jan 24, 2025
57ea7ca
Add failing try/finally example
dcreager Jan 24, 2025
db98926
Use for loops
dcreager Jan 24, 2025
a434762
Add Carl's suggested return error
dcreager Jan 24, 2025
b9f91fb
Merge branch 'main' into dcreager/terminal-visibility
dcreager Jan 24, 2025
1388eb0
Merge branch 'main' into dcreager/terminal-visibility
dcreager Jan 27, 2025
79de3eb
Add failing early return example
dcreager Jan 27, 2025
c4d1599
Add TODOs for function return example
dcreager Jan 27, 2025
82e6e67
Add TODO link for continue false positives
dcreager Jan 27, 2025
bfe76da
Prefer the faster option first
dcreager Jan 27, 2025
0b7da66
Apply suggestions from code review
dcreager Jan 28, 2025
e4771bf
Better descriptions of continue and break; add nested tests
dcreager Jan 28, 2025
9431fb9
Remove tomllib benchmark error reproduction
dcreager Jan 28, 2025
08b0522
Add "both nested branches" tests
dcreager Jan 28, 2025
ee3ac28
Add separate then/else branch tests
dcreager Jan 28, 2025
add8a57
Fix comment
dcreager Jan 28, 2025
a0e6980
Remove duplicate test
dcreager Jan 28, 2025
9fb8d2a
Fix header
dcreager Jan 28, 2025
3684739
Add raise tests
dcreager Jan 28, 2025
7fbec5f
Add return-in-try tests
dcreager Jan 28, 2025
5c45e88
Fix comment
dcreager Jan 28, 2025
d2b21fc
Add statically-known terminal test case
dcreager Jan 28, 2025
2f62eb0
Apply suggestions from code review
dcreager Jan 29, 2025
32e8131
Update TODOs to not assume how we'll change exception tracking
dcreager Jan 29, 2025
2932bb4
Merge branch 'main' into dcreager/terminal-visibility
dcreager Jan 29, 2025
1cf65f5
Linter
dcreager Jan 29, 2025
681fae2
Update crates/red_knot_python_semantic/resources/mdtest/terminal_stat…
dcreager Jan 29, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,158 @@
# Terminal statements

## Introduction

Terminal statements complicate a naive control-flow analysis.

As a simple example:

```py
def f(cond: bool) -> str:
if cond:
x = "test"
else:
raise ValueError
return x

def g(cond: bool):
if cond:
x = "test"
reveal_type(x) # revealed: Literal["test"]
else:
x = "unreachable"
reveal_type(x) # revealed: Literal["unreachable"]
carljm marked this conversation as resolved.
Show resolved Hide resolved
raise ValueError
reveal_type(x) # revealed: Literal["test"]
```

In `f`, we should be able to determine that the `else` branch ends in a terminal statement, and that
the `return` statement can only be executed when the condition is true. We should therefore consider
the reference always bound, even though `x` is only bound in the true branch.

Similarly, in `g`, we should see that the assignment of the value `"unreachable"` can never be seen
by the final `reveal_type`.

## `return` is terminal

```py
def f(cond: bool) -> str:
if cond:
x = "test"
else:
return "early"
return x # no possibly-unresolved-reference diagnostic!

def g(cond: bool):
if cond:
x = "test"
reveal_type(x) # revealed: Literal["test"]
else:
x = "unreachable"
reveal_type(x) # revealed: Literal["unreachable"]
carljm marked this conversation as resolved.
Show resolved Hide resolved
return
reveal_type(x) # revealed: Literal["test"]
```

## `continue` is terminal within its loop scope
carljm marked this conversation as resolved.
Show resolved Hide resolved

```py
def f(cond: bool) -> str:
while True:
if cond:
x = "test"
else:
continue
return x

def g(cond: bool, i: int):
x = "before"
for _ in range(i):
if cond:
x = "loop"
reveal_type(x) # revealed: Literal["loop"]
else:
x = "continue"
reveal_type(x) # revealed: Literal["continue"]
continue
reveal_type(x) # revealed: Literal["loop"]
reveal_type(x) # revealed: Literal["before", "loop"]
dcreager marked this conversation as resolved.
Show resolved Hide resolved
carljm marked this conversation as resolved.
Show resolved Hide resolved
```

## `break` is terminal within its loop scope
carljm marked this conversation as resolved.
Show resolved Hide resolved

```py
def f(cond: bool) -> str:
while True:
if cond:
x = "test"
else:
break
return x
return x # error: [unresolved-reference]

def g(cond: bool, i: int):
x = "before"
for _ in range(i):
if cond:
x = "loop"
reveal_type(x) # revealed: Literal["loop"]
else:
x = "break"
reveal_type(x) # revealed: Literal["break"]
break
reveal_type(x) # revealed: Literal["loop"]
reveal_type(x) # revealed: Literal["before", "loop", "break"]
```

## `return` is terminal in nested conditionals
carljm marked this conversation as resolved.
Show resolved Hide resolved

```py
def f(cond1: bool, cond2: bool) -> str:
if cond1:
if cond2:
x = "test1"
else:
return "early"
else:
x = "test2"
return x

def g(cond1: bool, cond2: bool):
if cond1:
if cond2:
x = "test1"
reveal_type(x) # revealed: Literal["test1"]
else:
x = "unreachable"
reveal_type(x) # revealed: Literal["unreachable"]
return
reveal_type(x) # revealed: Literal["test1"]
else:
x = "test2"
reveal_type(x) # revealed: Literal["test2"]
reveal_type(x) # revealed: Literal["test1", "test2"]
```

## Terminal in a `finally` block
carljm marked this conversation as resolved.
Show resolved Hide resolved
dcreager marked this conversation as resolved.
Show resolved Hide resolved

Control-flow through finally isn't working right yet:
dcreager marked this conversation as resolved.
Show resolved Hide resolved

```py
def f():
x = 1
while True:
try:
break
finally:
x = 2
# TODO: should be Literal[2]
reveal_type(x) # revealed: Literal[1]
```

## Terminal statement after a list comprehension

```py
def f(x: str) -> int:
y = [x for i in range(len(x))]
return 4
```
26 changes: 21 additions & 5 deletions crates/red_knot_python_semantic/src/semantic_index/builder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -368,6 +368,12 @@ impl<'db> SemanticIndexBuilder<'db> {
.record_visibility_constraint(VisibilityConstraint::VisibleIf(constraint))
}

/// Records that all remaining statements in the current block are unreachable, and therefore
/// not visible.
fn mark_unreachable(&mut self) {
self.current_use_def_map_mut().mark_unreachable();
}

/// Records a [`VisibilityConstraint::Ambiguous`] constraint.
fn record_ambiguous_visibility(&mut self) -> ScopedVisibilityConstraintId {
self.current_use_def_map_mut()
Expand Down Expand Up @@ -1019,11 +1025,6 @@ where
}
self.visit_body(body);
}
ast::Stmt::Break(_) => {
if self.loop_state().is_inside() {
self.loop_break_states.push(self.flow_snapshot());
}
}

ast::Stmt::For(
for_stmt @ ast::StmtFor {
Expand Down Expand Up @@ -1270,6 +1271,21 @@ where
// - https://github.com/astral-sh/ruff/pull/13633#discussion_r1788626702
self.visit_body(finalbody);
}

ast::Stmt::Raise(_) | ast::Stmt::Return(_) | ast::Stmt::Continue(_) => {
walk_stmt(self, stmt);
// Everything in the current block after a terminal statement is unreachable.
self.mark_unreachable();
}

ast::Stmt::Break(_) => {
if self.loop_state().is_inside() {
self.loop_break_states.push(self.flow_snapshot());
}
// Everything in the current block after a terminal statement is unreachable.
self.mark_unreachable();
}

_ => {
walk_stmt(self, stmt);
}
Expand Down
25 changes: 25 additions & 0 deletions crates/red_knot_python_semantic/src/semantic_index/use_def.rs
Original file line number Diff line number Diff line change
Expand Up @@ -476,6 +476,7 @@ impl std::iter::FusedIterator for DeclarationsIterator<'_, '_> {}
pub(super) struct FlowSnapshot {
symbol_states: IndexVec<ScopedSymbolId, SymbolState>,
scope_start_visibility: ScopedVisibilityConstraintId,
reachable: bool,
}

#[derive(Debug)]
Expand Down Expand Up @@ -503,6 +504,8 @@ pub(super) struct UseDefMapBuilder<'db> {

/// Currently live bindings and declarations for each symbol.
symbol_states: IndexVec<ScopedSymbolId, SymbolState>,

reachable: bool,
}

impl Default for UseDefMapBuilder<'_> {
Expand All @@ -515,11 +518,16 @@ impl Default for UseDefMapBuilder<'_> {
bindings_by_use: IndexVec::new(),
definitions_by_definition: FxHashMap::default(),
symbol_states: IndexVec::new(),
reachable: true,
}
}
}

impl<'db> UseDefMapBuilder<'db> {
pub(super) fn mark_unreachable(&mut self) {
self.reachable = false;
}

pub(super) fn add_symbol(&mut self, symbol: ScopedSymbolId) {
let new_symbol = self
.symbol_states
Expand Down Expand Up @@ -656,6 +664,7 @@ impl<'db> UseDefMapBuilder<'db> {
FlowSnapshot {
symbol_states: self.symbol_states.clone(),
scope_start_visibility: self.scope_start_visibility,
reachable: self.reachable,
}
}

Expand All @@ -678,12 +687,25 @@ impl<'db> UseDefMapBuilder<'db> {
num_symbols,
SymbolState::undefined(self.scope_start_visibility),
);

self.reachable = snapshot.reachable;
}

/// Merge the given snapshot into the current state, reflecting that we might have taken either
/// path to get here. The new state for each symbol should include definitions from both the
/// prior state and the snapshot.
pub(super) fn merge(&mut self, snapshot: FlowSnapshot) {
// Unreachable snapshots should not be merged: If the current snapshot is unreachable, it
// should be completely overwritten by the snapshot we're merging in. If the other snapshot
// is unreachable, we should return without merging.
if !self.reachable {
self.restore(snapshot);
return;
}
if !snapshot.reachable {
return;
}

// We never remove symbols from `symbol_states` (it's an IndexVec, and the symbol
// IDs must line up), so the current number of known symbols must always be equal to or
// greater than the number of known symbols in a previously-taken snapshot.
Expand All @@ -705,6 +727,9 @@ impl<'db> UseDefMapBuilder<'db> {
self.scope_start_visibility = self
.visibility_constraints
.add_or_constraint(self.scope_start_visibility, snapshot.scope_start_visibility);

// At least one of the two snapshots was reachable, so the merged result is too.
carljm marked this conversation as resolved.
Show resolved Hide resolved
self.reachable = true;
}

pub(super) fn finish(mut self) -> UseDefMap<'db> {
Expand Down
17 changes: 0 additions & 17 deletions crates/ruff_benchmark/benches/red_knot.rs
Original file line number Diff line number Diff line change
Expand Up @@ -26,25 +26,8 @@ const TOMLLIB_312_URL: &str = "https://raw.githubusercontent.com/python/cpython/
static EXPECTED_DIAGNOSTICS: &[&str] = &[
// We don't support `*` imports yet:
"error[lint:unresolved-import] /src/tomllib/_parser.py:7:29 Module `collections.abc` has no member `Iterable`",
// We don't support terminal statements in control flow yet:
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:66:18 Name `s` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:98:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:101:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:104:14 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:115:14 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:126:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:348:20 Name `nest` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:353:5 Name `nest` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:453:24 Name `nest` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:455:9 Name `nest` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:482:16 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:566:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:573:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:579:12 Name `char` used when possibly not defined",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:580:63 Name `char` used when possibly not defined",
carljm marked this conversation as resolved.
Show resolved Hide resolved
// We don't handle intersections in `is_assignable_to` yet
"error[lint:invalid-argument-type] /src/tomllib/_parser.py:626:46 Object of type `Unknown & ~AlwaysFalsy | @Todo & ~AlwaysFalsy` cannot be assigned to parameter 1 (`match`) of function `match_to_datetime`; expected type `Match`",
"warning[lint:possibly-unresolved-reference] /src/tomllib/_parser.py:629:38 Name `datetime_obj` used when possibly not defined",
"error[lint:invalid-argument-type] /src/tomllib/_parser.py:632:58 Object of type `Unknown & ~AlwaysFalsy | @Todo & ~AlwaysFalsy` cannot be assigned to parameter 1 (`match`) of function `match_to_localtime`; expected type `Match`",
"error[lint:invalid-argument-type] /src/tomllib/_parser.py:639:52 Object of type `Unknown & ~AlwaysFalsy | @Todo & ~AlwaysFalsy` cannot be assigned to parameter 1 (`match`) of function `match_to_number`; expected type `Match`",
"warning[lint:unused-ignore-comment] /src/tomllib/_parser.py:682:31 Unused blanket `type: ignore` directive",
Expand Down
Loading