Commit ad1a8da

[red-knot] Check for invalid overload usages (#17609)
## Summary

Part of #15383, this PR adds the core infrastructure to check for invalid overloads and adds a diagnostic that is raised if there are fewer than two overloads for a given definition.

### Design notes

The requirements to check the overloads are:

* Requires `FunctionType`, which has the `to_overloaded` method
* The `FunctionType` **should** be for the function that is either the implementation or the last overload if the implementation doesn't exist
* Avoid checking any `FunctionType` that is part of an overload chain
* Consider visibility constraints

This required a couple of iterations to make sure all of the above requirements are fulfilled.

#### 1. Use a set to deduplicate

The logic would first collect all the `FunctionType`s that are part of the overload chain except for the implementation, or the last overload if the implementation doesn't exist. Then, when iterating over all the function declarations within the scope, we'd avoid checking these functions. But this approach would fail to consider visibility constraints, as certain overloads _can_ be behind a version check. Those aren't part of the overload chain, but they aren't a separate overload chain either.

<details><summary>Implementation:</summary>
<p>

```rs
fn check_overloaded_functions(&mut self) {
    let function_definitions = || {
        self.types
            .declarations
            .iter()
            .filter_map(|(definition, ty)| {
                // Filter out function literals that result from anything other than a function
                // definition e.g., imports.
                if let DefinitionKind::Function(function) = definition.kind(self.db()) {
                    ty.inner_type()
                        .into_function_literal()
                        .map(|ty| (ty, definition.symbol(self.db()), function.node()))
                } else {
                    None
                }
            })
    };

    // A set of all the functions that are part of an overloaded function definition except for
    // the implementation function and the last overload in case the implementation doesn't
    // exist. This allows us to collect all the function definitions that need to be skipped
    // when checking for invalid overload usages.
    let mut overloads: HashSet<FunctionType<'db>> = HashSet::default();

    // The closure yields `(function, symbol, node)` triples.
    for (function, _, _) in function_definitions() {
        let Some(overloaded) = function.to_overloaded(self.db()) else {
            continue;
        };
        if overloaded.implementation.is_some() {
            overloads.extend(overloaded.overloads.iter().copied());
        } else if let Some((_, previous_overloads)) = overloaded.overloads.split_last() {
            overloads.extend(previous_overloads.iter().copied());
        }
    }

    for (function, _, function_node) in function_definitions() {
        let Some(overloaded) = function.to_overloaded(self.db()) else {
            continue;
        };
        if overloads.contains(&function) {
            continue;
        }

        // At this point, the `function` variable is either the implementation function or the
        // last overloaded function if the implementation doesn't exist.

        if overloaded.overloads.len() < 2 {
            if let Some(builder) = self
                .context
                .report_lint(&INVALID_OVERLOAD, &function_node.name)
            {
                let mut diagnostic = builder.into_diagnostic(format_args!(
                    "Function `{}` requires at least two overloads",
                    &function_node.name
                ));
                if let Some(first_overload) = overloaded.overloads.first() {
                    diagnostic.annotate(
                        self.context
                            .secondary(first_overload.focus_range(self.db()))
                            .message(format_args!("Only one overload defined here")),
                    );
                }
            }
        }
    }
}
```

</p>
</details>

#### 2. Define a `predecessor` query

The `predecessor` query would return the previous `FunctionType` for the given `FunctionType`, i.e., the current logic would be extracted into a query. This could then be used to make sure that we check the entire overload chain exactly once. The way this would've been implemented is to have a `to_overloaded` implementation that takes the root of the overload chain instead of the leaf. But this would require updates to the use-def map so that it could somehow return the _following_ functions for a given definition.

#### 3. Create a successor link

This is what Pyrefly uses: we'd create a forward link between two functions that are involved in an overload chain. This means that for a given function, we can get the successor function. This could be used to find the _leaf_ of the overload chain, which can then be used with the `to_overloaded` method to get the entire overload chain. But this would also require updating the use-def map to be able to "see" the _following_ function.

### Implementation

This leads us to the final implementation in this PR, which finds the overloaded functions to check as follows:

* Collect all the **function symbols** that are defined **and** called within the same file. These could potentially be overloaded functions
* Use the public bindings to get the leaf of the overload chain and use that to get the entire overload chain via `to_overloaded` and perform the check

This has a limitation: if a function redefines an overload, then that overload will not be checked. For example:

```py
from typing import overload

@overload
def f() -> None: ...
@overload
def f(x: int) -> int: ...

# The above overload will not be checked as the below function with the same name
# shadows it
def f(*args: int) -> int: ...
```

## Test Plan

Update existing mdtest and add snapshot diagnostics.
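
To make the "at least two overloads" rule from the summary concrete, here is a minimal Python sketch that applies it to module-level functions using the standard `ast` module. It is not how red-knot implements the check: it walks definitions in source order instead of resolving public bindings, ignores nested scopes and `async def`, and the helper names `is_overload` and `single_overload_names` are made up for this illustration.

```python
import ast


def is_overload(func: ast.FunctionDef) -> bool:
    """Return True if `func` is decorated with `@overload` or `@<module>.overload`."""
    for decorator in func.decorator_list:
        if isinstance(decorator, ast.Name) and decorator.id == "overload":
            return True
        if isinstance(decorator, ast.Attribute) and decorator.attr == "overload":
            return True
    return False


def single_overload_names(source: str) -> list:
    """Names of module-level functions whose overload chain has exactly one overload."""
    pending = {}  # function name -> number of @overload definitions seen so far
    flagged = []
    for stmt in ast.parse(source).body:
        if not isinstance(stmt, ast.FunctionDef):
            continue
        if is_overload(stmt):
            pending[stmt.name] = pending.get(stmt.name, 0) + 1
        elif pending.pop(stmt.name, 0) == 1:
            # A non-overload definition ends the chain (it is the implementation).
            flagged.append(stmt.name)
    # Chains without an implementation (e.g. in stub files) end at the last overload.
    flagged.extend(name for name, count in pending.items() if count == 1)
    return flagged


PY_EXAMPLE = """
from typing import overload

@overload
def func(x: int) -> int: ...

def func(x: int | str) -> int | str:
    return x
"""

STUB_EXAMPLE = """
from typing import overload

@overload
def func(x: int) -> int: ...
"""

VALID_EXAMPLE = """
from typing import overload

@overload
def func(x: int) -> int: ...
@overload
def func(x: str) -> str: ...
def func(x):
    return x
"""

print(single_overload_names(PY_EXAMPLE))  # ['func']
```

The two flagged inputs mirror the `.py` and `.pyi` snippets that the mdtest in this commit marks with `# error: [invalid-overload]`.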
1 parent 0861ecf commit ad1a8da

6 files changed

+283
-7
lines changed

crates/red_knot_python_semantic/resources/mdtest/overloads.md

Lines changed: 12 additions & 1 deletion
@@ -309,18 +309,29 @@ reveal_type(func("")) # revealed: Literal[""]
 
 ### At least two overloads
 
+<!-- snapshot-diagnostics -->
+
 At least two `@overload`-decorated definitions must be present.
 
 ```py
 from typing import overload
 
-# TODO: error
 @overload
 def func(x: int) -> int: ...
+
+# error: [invalid-overload]
 def func(x: int | str) -> int | str:
     return x
 ```
 
+```pyi
+from typing import overload
+
+@overload
+# error: [invalid-overload]
+def func(x: int) -> int: ...
+```
+
 ### Overload without an implementation
 
 #### Regular modules
@@ -0,0 +1,65 @@
+---
+source: crates/red_knot_test/src/lib.rs
+expression: snapshot
+---
+---
+mdtest name: overloads.md - Overloads - Invalid - At least two overloads
+mdtest path: crates/red_knot_python_semantic/resources/mdtest/overloads.md
+---
+
+# Python source files
+
+## mdtest_snippet.py
+
+```
+1 | from typing import overload
+2 |
+3 | @overload
+4 | def func(x: int) -> int: ...
+5 |
+6 | # error: [invalid-overload]
+7 | def func(x: int | str) -> int | str:
+8 |     return x
+```
+
+## mdtest_snippet.pyi
+
+```
+1 | from typing import overload
+2 |
+3 | @overload
+4 | # error: [invalid-overload]
+5 | def func(x: int) -> int: ...
+```
+
+# Diagnostics
+
+```
+error: lint:invalid-overload: Overloaded function `func` requires at least two overloads
+ --> src/mdtest_snippet.py:4:5
+  |
+3 | @overload
+4 | def func(x: int) -> int: ...
+  |     ---- Only one overload defined here
+5 |
+6 | # error: [invalid-overload]
+7 | def func(x: int | str) -> int | str:
+  |     ^^^^
+8 |     return x
+  |
+
+```
+
+```
+error: lint:invalid-overload: Overloaded function `func` requires at least two overloads
+ --> src/mdtest_snippet.pyi:5:5
+  |
+3 | @overload
+4 | # error: [invalid-overload]
+5 | def func(x: int) -> int: ...
+  |     ----
+  |     |
+  |     Only one overload defined here
+  |
+
+```

crates/red_knot_python_semantic/src/types.rs

Lines changed: 29 additions & 2 deletions
@@ -6525,6 +6525,13 @@ pub struct FunctionType<'db> {
 
 #[salsa::tracked]
 impl<'db> FunctionType<'db> {
+    /// Returns the [`File`] in which this function is defined.
+    pub(crate) fn file(self, db: &'db dyn Db) -> File {
+        // NOTE: Do not use `self.definition(db).file(db)` here, as that could create a
+        // cross-module dependency on the full AST.
+        self.body_scope(db).file(db)
+    }
+
     pub(crate) fn has_known_decorator(self, db: &dyn Db, decorator: FunctionDecorators) -> bool {
         self.decorators(db).contains(decorator)
     }
@@ -6546,21 +6553,41 @@ impl<'db> FunctionType<'db> {
         Type::BoundMethod(BoundMethodType::new(db, self, self_instance))
     }
 
+    /// Returns the AST node for this function.
+    pub(crate) fn node(self, db: &'db dyn Db, file: File) -> &'db ast::StmtFunctionDef {
+        debug_assert_eq!(
+            file,
+            self.file(db),
+            "FunctionType::node() must be called with the same file as the one where \
+            the function is defined."
+        );
+
+        self.body_scope(db).node(db).expect_function()
+    }
+
     /// Returns the [`FileRange`] of the function's name.
     pub fn focus_range(self, db: &dyn Db) -> FileRange {
         FileRange::new(
-            self.body_scope(db).file(db),
+            self.file(db),
             self.body_scope(db).node(db).expect_function().name.range,
         )
     }
 
     pub fn full_range(self, db: &dyn Db) -> FileRange {
         FileRange::new(
-            self.body_scope(db).file(db),
+            self.file(db),
             self.body_scope(db).node(db).expect_function().range,
         )
     }
 
+    /// Returns the [`Definition`] of this function.
+    ///
+    /// ## Warning
+    ///
+    /// This uses the semantic index to find the definition of the function. This means that if the
+    /// calling query is not in the same file as this function is defined in, then this will create
+    /// a cross-module dependency directly on the full AST which will lead to cache
+    /// over-invalidation.
     pub(crate) fn definition(self, db: &'db dyn Db) -> Definition<'db> {
         let body_scope = self.body_scope(db);
         let index = semantic_index(db, body_scope.file(db));

crates/red_knot_python_semantic/src/types/diagnostic.rs

Lines changed: 44 additions & 0 deletions
@@ -37,6 +37,7 @@ pub(crate) fn register_lints(registry: &mut LintRegistryBuilder) {
     registry.register_lint(&INVALID_EXCEPTION_CAUGHT);
     registry.register_lint(&INVALID_LEGACY_TYPE_VARIABLE);
     registry.register_lint(&INVALID_METACLASS);
+    registry.register_lint(&INVALID_OVERLOAD);
     registry.register_lint(&INVALID_PARAMETER_DEFAULT);
     registry.register_lint(&INVALID_PROTOCOL);
     registry.register_lint(&INVALID_RAISE);
@@ -447,6 +448,49 @@ declare_lint! {
     }
 }
 
+declare_lint! {
+    /// ## What it does
+    /// Checks for various invalid `@overload` usages.
+    ///
+    /// ## Why is this bad?
+    /// The `@overload` decorator is used to define functions and methods that accepts different
+    /// combinations of arguments and return different types based on the arguments passed. This is
+    /// mainly beneficial for type checkers. But, if the `@overload` usage is invalid, the type
+    /// checker may not be able to provide correct type information.
+    ///
+    /// ## Example
+    ///
+    /// Defining only one overload:
+    ///
+    /// ```py
+    /// from typing import overload
+    ///
+    /// @overload
+    /// def foo(x: int) -> int: ...
+    /// def foo(x: int | None) -> int | None:
+    ///     return x
+    /// ```
+    ///
+    /// Or, not providing an implementation for the overloaded definition:
+    ///
+    /// ```py
+    /// from typing import overload
+    ///
+    /// @overload
+    /// def foo() -> None: ...
+    /// @overload
+    /// def foo(x: int) -> int: ...
+    /// ```
+    ///
+    /// ## References
+    /// - [Python documentation: `@overload`](https://docs.python.org/3/library/typing.html#typing.overload)
+    pub(crate) static INVALID_OVERLOAD = {
+        summary: "detects invalid `@overload` usages",
+        status: LintStatus::preview("1.0.0"),
+        default_level: Level::Error,
+    }
+}
+
 declare_lint! {
     /// ## What it does
     /// Checks for default values that can't be assigned to the parameter's annotated type.
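
For contrast with the two invalid examples in the `INVALID_OVERLOAD` documentation, the following Python sketch shows a definition that should satisfy both conditions (two overloads plus an implementation), next to one that lacks an implementation. The names `foo`, `bar`, and `call_bar_without_implementation` are illustrative only; the runtime behaviour shown is that of the standard `typing.overload` decorator, not of red-knot.

```python
from typing import Optional, overload


@overload
def foo(x: int) -> int: ...
@overload
def foo(x: None) -> None: ...
def foo(x: Optional[int]) -> Optional[int]:
    # The undecorated definition is the implementation that runs at runtime.
    return x


@overload
def bar() -> None: ...
@overload
def bar(x: int) -> int: ...
# No implementation follows, so `bar` stays bound to the placeholder that
# `typing.overload` returns, which raises when called.


def call_bar_without_implementation() -> str:
    """Call `bar` and report whether the overload placeholder raised."""
    try:
        bar()
    except NotImplementedError:
        return "NotImplementedError"
    return "no error"


print(foo(1), foo(None), call_bar_without_implementation())
```

In a regular module, the missing implementation is therefore a runtime problem as well as a typing one, which is one reason to report it as an error rather than a warning.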

crates/red_knot_python_semantic/src/types/infer.rs

Lines changed: 123 additions & 4 deletions
@@ -101,8 +101,8 @@ use super::diagnostic::{
     report_invalid_exception_raised, report_invalid_type_checking_constant,
     report_non_subscriptable, report_possibly_unresolved_reference,
     report_runtime_check_against_non_runtime_checkable_protocol, report_slice_step_size_zero,
-    report_unresolved_reference, INVALID_METACLASS, INVALID_PROTOCOL, REDUNDANT_CAST,
-    STATIC_ASSERT_ERROR, SUBCLASS_OF_FINAL_CLASS, TYPE_ASSERTION_FAILURE,
+    report_unresolved_reference, INVALID_METACLASS, INVALID_OVERLOAD, INVALID_PROTOCOL,
+    REDUNDANT_CAST, STATIC_ASSERT_ERROR, SUBCLASS_OF_FINAL_CLASS, TYPE_ASSERTION_FAILURE,
 };
 use super::slots::check_class_slots;
 use super::string_annotation::{
@@ -418,7 +418,7 @@ impl<'db> TypeInference<'db> {
             .copied()
             .or(self.cycle_fallback_type)
             .expect(
-                "definition should belong to this TypeInference region and
+                "definition should belong to this TypeInference region and \
                 TypeInferenceBuilder should have inferred a type for it",
             )
     }
@@ -430,7 +430,7 @@ impl<'db> TypeInference<'db> {
             .copied()
             .or(self.cycle_fallback_type.map(Into::into))
             .expect(
-                "definition should belong to this TypeInference region and
+                "definition should belong to this TypeInference region and \
                 TypeInferenceBuilder should have inferred a type for it",
             )
     }
@@ -524,6 +524,31 @@ pub(super) struct TypeInferenceBuilder<'db> {
     /// The returned types and their corresponding ranges of the region, if it is a function body.
     return_types_and_ranges: Vec<TypeAndRange<'db>>,
 
+    /// A set of functions that have been defined **and** called in this region.
+    ///
+    /// This is a set because the same function could be called multiple times in the same region.
+    /// This is mainly used in [`check_overloaded_functions`] to check an overloaded function that
+    /// is shadowed by a function with the same name in this scope but has been called before. For
+    /// example:
+    ///
+    /// ```py
+    /// from typing import overload
+    ///
+    /// @overload
+    /// def foo() -> None: ...
+    /// @overload
+    /// def foo(x: int) -> int: ...
+    /// def foo(x: int | None) -> int | None: return x
+    ///
+    /// foo() # An overloaded function that was defined in this scope have been called
+    ///
+    /// def foo(x: int) -> int:
+    ///     return x
+    /// ```
+    ///
+    /// [`check_overloaded_functions`]: TypeInferenceBuilder::check_overloaded_functions
+    called_functions: FxHashSet<FunctionType<'db>>,
+
     /// The deferred state of inferring types of certain expressions within the region.
     ///
     /// This is different from [`InferenceRegion::Deferred`] which works on the entire definition
@@ -556,6 +581,7 @@ impl<'db> TypeInferenceBuilder<'db> {
             index,
             region,
             return_types_and_ranges: vec![],
+            called_functions: FxHashSet::default(),
             deferred_state: DeferredExpressionState::None,
             types: TypeInference::empty(scope),
         }
@@ -718,6 +744,7 @@ impl<'db> TypeInferenceBuilder<'db> {
 
         // TODO: Only call this function when diagnostics are enabled.
         self.check_class_definitions();
+        self.check_overloaded_functions();
     }
 
     /// Iterate over all class definitions to check that the definition will not cause an exception
@@ -952,6 +979,86 @@ impl<'db> TypeInferenceBuilder<'db> {
         }
     }
 
+    /// Check the overloaded functions in this scope.
+    ///
+    /// This only checks the overloaded functions that are:
+    /// 1. Visible publicly at the end of this scope
+    /// 2. Or, defined and called in this scope
+    ///
+    /// For (1), this has the consequence of not checking an overloaded function that is being
+    /// shadowed by another function with the same name in this scope.
+    fn check_overloaded_functions(&mut self) {
+        // Collect all the unique overloaded function symbols in this scope. This requires a set
+        // because an overloaded function uses the same symbol for each of the overloads and the
+        // implementation.
+        let overloaded_function_symbols: FxHashSet<_> = self
+            .types
+            .declarations
+            .iter()
+            .filter_map(|(definition, ty)| {
+                // Filter out function literals that result from anything other than a function
+                // definition e.g., imports which would create a cross-module AST dependency.
+                if !matches!(definition.kind(self.db()), DefinitionKind::Function(_)) {
+                    return None;
+                }
+                let function = ty.inner_type().into_function_literal()?;
+                if function.has_known_decorator(self.db(), FunctionDecorators::OVERLOAD) {
+                    Some(definition.symbol(self.db()))
+                } else {
+                    None
+                }
+            })
+            .collect();
+
+        let use_def = self
+            .index
+            .use_def_map(self.scope().file_scope_id(self.db()));
+
+        let mut public_functions = FxHashSet::default();
+
+        for symbol in overloaded_function_symbols {
+            if let Symbol::Type(Type::FunctionLiteral(function), Boundness::Bound) =
+                symbol_from_bindings(self.db(), use_def.public_bindings(symbol))
+            {
+                if function.file(self.db()) != self.file() {
+                    // If the function is not in this file, we don't need to check it.
+                    // https://github.com/astral-sh/ruff/pull/17609#issuecomment-2839445740
+                    continue;
+                }
+
+                // Extend the functions that we need to check with the publicly visible overloaded
+                // function. This is always going to be either the implementation or the last
+                // overload if the implementation doesn't exists.
+                public_functions.insert(function);
+            }
+        }
+
+        for function in self.called_functions.union(&public_functions) {
+            let Some(overloaded) = function.to_overloaded(self.db()) else {
+                continue;
+            };
+
+            // Check that the overloaded function has at least two overloads
+            if let [single_overload] = overloaded.overloads.as_slice() {
+                let function_node = function.node(self.db(), self.file());
+                if let Some(builder) = self
+                    .context
+                    .report_lint(&INVALID_OVERLOAD, &function_node.name)
+                {
+                    let mut diagnostic = builder.into_diagnostic(format_args!(
+                        "Overloaded function `{}` requires at least two overloads",
+                        &function_node.name
+                    ));
+                    diagnostic.annotate(
+                        self.context
+                            .secondary(single_overload.focus_range(self.db()))
+                            .message(format_args!("Only one overload defined here")),
+                    );
+                }
+            }
+        }
+    }
+
     fn infer_region_definition(&mut self, definition: Definition<'db>) {
         match definition.kind(self.db()) {
             DefinitionKind::Function(function) => {
@@ -4299,6 +4406,18 @@ impl<'db> TypeInferenceBuilder<'db> {
         let mut call_arguments = Self::parse_arguments(arguments);
         let callable_type = self.infer_expression(func);
 
+        if let Type::FunctionLiteral(function) = callable_type {
+            // Make sure that the `function.definition` is only called when the function is defined
+            // in the same file as the one we're currently inferring the types for. This is because
+            // the `definition` method accesses the semantic index, which could create a
+            // cross-module AST dependency.
+            if function.file(self.db()) == self.file()
+                && function.definition(self.db()).scope(self.db()) == self.scope()
+            {
+                self.called_functions.insert(function);
+            }
+        }
+
         // It might look odd here that we emit an error for class-literals but not `type[]` types.
         // But it's deliberate! The typing spec explicitly mandates that `type[]` types can be called
         // even though class-literals cannot. This is because even though a protocol class `SomeProtocol`
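
The role of `called_functions` is easier to see in a toy model. The Python sketch below represents a scope as a flat list of definition and call events and checks the union of "called" and "publicly visible" chain leaves, loosely following `check_overloaded_functions`. Everything here (`Def`, `Call`, `overloads_of`, `leaves_to_check`, `single_overload_leaves`) is a hypothetical stand-in; the real code works on `FunctionType`, the use-def map, and `to_overloaded`, and also handles visibility constraints that this model ignores.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Def:
    name: str
    line: int
    is_overload: bool = False


@dataclass(frozen=True)
class Call:
    name: str


Event = Union[Def, Call]


def overloads_of(events: list[Event], leaf: Def) -> Optional[list[Def]]:
    """Walk backwards from `leaf` and collect the overloads of its chain, if any."""
    defs = [e for e in events if isinstance(e, Def) and e.name == leaf.name]
    chain = [leaf] if leaf.is_overload else []
    for previous in reversed(defs[: defs.index(leaf)]):
        if not previous.is_overload:
            break  # an earlier implementation or plain function ends the chain
        chain.append(previous)
    chain.reverse()
    return chain or None


def leaves_to_check(events: list[Event]) -> set[Def]:
    """Union of functions that were called and functions visible at the end."""
    bound: dict[str, Def] = {}
    called: set[Def] = set()
    for event in events:
        if isinstance(event, Def):
            bound[event.name] = event
        elif event.name in bound:
            called.add(bound[event.name])
    return called | set(bound.values())


def single_overload_leaves(events: list[Event]) -> list[tuple[str, int]]:
    """Return `(name, line)` for every checked leaf whose chain has one overload."""
    flagged = []
    for leaf in leaves_to_check(events):
        chain = overloads_of(events, leaf)
        if chain is not None and len(chain) == 1:
            flagged.append((leaf.name, leaf.line))
    return sorted(flagged)


# One overload plus an implementation, a call, then a shadowing plain function.
EVENTS = [Def("g", 1, is_overload=True), Def("g", 2), Call("g"), Def("g", 4)]
print(single_overload_leaves(EVENTS))  # [('g', 2)]
```

Dropping the `Call("g")` event from `EVENTS` makes the result empty, because only the shadowing definition on line 4 is visible at the end of the scope. That corresponds to the limitation described in the PR: a shadowed overload chain is only checked if it was called before being shadowed.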
