Skip to content

Conversation

@dhruvmanila
Copy link
Member

@dhruvmanila dhruvmanila commented Apr 24, 2025

Summary

Part of #15383, this PR adds the core infrastructure to check for invalid overloads and adds a diagnostic to raise if there are < 2 overloads for a given definition.

Design notes

The requirements to check the overloads are:

  • Requires FunctionType which has the to_overloaded method
  • The FunctionType should be for the function that is either the implementation or the last overload if the implementation doesn't exists
  • Avoid checking any FunctionType that are part of an overload chain
  • Consider visibility constraints

This required a couple of iteration to make sure all of the above requirements are fulfilled.

1. Use a set to deduplicate

The logic would first collect all the FunctionType that are part of the overload chain except for the implementation or the last overload if the implementation doesn't exists. Then, when iterating over all the function declarations within the scope, we'd avoid checking these functions. But, this approach would fail to consider visibility constraints as certain overloads can be behind a version check. Those aren't part of the overload chain but those aren't a separate overload chain either.

Implementation:

fn check_overloaded_functions(&mut self) {
    let function_definitions = || {
        self.types
            .declarations
            .iter()
            .filter_map(|(definition, ty)| {
                // Filter out function literals that result from anything other than a function
                // definition e.g., imports.
                if let DefinitionKind::Function(function) = definition.kind(self.db()) {
                    ty.inner_type()
                        .into_function_literal()
                        .map(|ty| (ty, definition.symbol(self.db()), function.node()))
                } else {
                    None
                }
            })
    };

    // A set of all the functions that are part of an overloaded function definition except for
    // the implementation function and the last overload in case the implementation doesn't
    // exists. This allows us to collect all the function definitions that needs to be skipped
    // when checking for invalid overload usages.
    let mut overloads: HashSet<FunctionType<'db>> = HashSet::default();

    for (function, _) in function_definitions() {
        let Some(overloaded) = function.to_overloaded(self.db()) else {
            continue;
        };
        if overloaded.implementation.is_some() {
            overloads.extend(overloaded.overloads.iter().copied());
        } else if let Some((_, previous_overloads)) = overloaded.overloads.split_last() {
            overloads.extend(previous_overloads.iter().copied());
        }
    }

    for (function, function_node) in function_definitions() {
        let Some(overloaded) = function.to_overloaded(self.db()) else {
            continue;
        };
        if overloads.contains(&function) {
            continue;
        }

        // At this point, the `function` variable is either the implementation function or the
        // last overloaded function if the implementation doesn't exists.

        if overloaded.overloads.len() < 2 {
            if let Some(builder) = self
                .context
                .report_lint(&INVALID_OVERLOAD, &function_node.name)
            {
                let mut diagnostic = builder.into_diagnostic(format_args!(
                    "Function `{}` requires at least two overloads",
                    &function_node.name
                ));
                if let Some(first_overload) = overloaded.overloads.first() {
                    diagnostic.annotate(
                        self.context
                            .secondary(first_overload.focus_range(self.db()))
                            .message(format_args!("Only one overload defined here")),
                    );
                }
            }
        }
    }
 }

2. Define a predecessor query

The predecessor query would return the previous FunctionType for the given FunctionType i.e., the current logic would be extracted to be a query instead. This could then be used to make sure that we're checking the entire overload chain once. The way this would've been implemented is to have a to_overloaded implementation which would take the root of the overload chain instead of the leaf. But, this would require updates to the use-def map to somehow be able to return the following functions for a given definition.

3. Create a successor link

This is what Pyrefly uses, we'd create a forward link between two functions that are involved in an overload chain. This means that for a given function, we can get the successor function. This could be used to find the leaf of the overload chain which can then be used with the to_overloaded method to get the entire overload chain. But, this would also require updating the use-def map to be able to "see" the following function.

Implementation

This leads us to the final implementation that this PR implements which is to consider the overloaded functions using:

  • Collect all the function symbols that are defined and called within the same file. This could potentially be an overloaded function
  • Use the public bindings to get the leaf of the overload chain and use that to get the entire overload chain via to_overloaded and perform the check

This has a limitation that in case a function redefines an overload, then that overload will not be checked. For example:

from typing import overload

@overload
def f() -> None: ...
@overload
def f(x: int) -> int: ...

# The above overload will not be checked as the below function with the same name
# shadows it

def f(*args: int) -> int: ...

Test Plan

Update existing mdtest and add snapshot diagnostics.

@dhruvmanila dhruvmanila added the ty Multi-file analysis & type inference label Apr 24, 2025
@codspeed-hq
Copy link

codspeed-hq bot commented Apr 24, 2025

CodSpeed Performance Report

Merging #17609 will not alter performance

Comparing dhruv/check-invalid-overloads (382b15b) with main (7d46579)

Summary

✅ 33 untouched benchmarks

@dhruvmanila dhruvmanila force-pushed the dhruv/final-staticmethod-override branch 3 times, most recently from 832935a to 0cdf087 Compare April 24, 2025 21:10
@dhruvmanila dhruvmanila force-pushed the dhruv/check-invalid-overloads branch from 251e321 to 44601fc Compare April 24, 2025 21:31
@github-actions
Copy link
Contributor

github-actions bot commented Apr 24, 2025

mypy_primer results

No ecosystem changes detected ✅

Base automatically changed from dhruv/final-staticmethod-override to main April 24, 2025 21:45
@dhruvmanila dhruvmanila force-pushed the dhruv/check-invalid-overloads branch from 44601fc to ea4e39e Compare April 24, 2025 21:45
@dhruvmanila dhruvmanila reopened this Apr 25, 2025
@dhruvmanila dhruvmanila force-pushed the dhruv/check-invalid-overloads branch 2 times, most recently from f5ef315 to 1e31eea Compare April 25, 2025 14:01
@dhruvmanila dhruvmanila marked this pull request as ready for review April 25, 2025 14:18
@dhruvmanila
Copy link
Member Author

This PR only adds a single check, I'm thinking of creating a "help wanted" issue that lists down the other checks as the core infrastructure is now in place.

@dhruvmanila dhruvmanila requested a review from carljm April 25, 2025 14:18
@AlexWaygood
Copy link
Member

mypy_primer results

Changes were detected when running on open source projects

pytest-robotframework (https://github.com/detachhead/pytest-robotframework)
+ error[lint:invalid-overload] /tmp/mypy_primer/projects/pytest-robotframework/pytest_robotframework/__init__.py:327:9: Function `__call__` requires at least two overloads
+ error[lint:invalid-overload] /tmp/mypy_primer/projects/pytest-robotframework/pytest_robotframework/__init__.py:470:5: Function `keyword` requires at least two overloads
- Found 357 diagnostics
+ Found 359 diagnostics

pydantic (https://github.com/pydantic/pydantic)
+ error[lint:invalid-overload] /tmp/mypy_primer/projects/pydantic/pydantic/json_schema.py:2539:9: Function `__init__` requires at least two overloads
- Found 923 diagnostics
+ Found 924 diagnostics

setuptools (https://github.com/pypa/setuptools)
+ error[lint:invalid-overload] /tmp/mypy_primer/projects/setuptools/setuptools/_distutils/sysconfig.py:578:5: Function `get_config_var` requires at least two overloads
- Found 1084 diagnostics
+ Found 1085 diagnostics

django-stubs (https://github.com/typeddjango/django-stubs)
+ error[lint:invalid-overload] /tmp/mypy_primer/projects/django-stubs/django-stubs/db/models/constraints.pyi:61:9: Function `__init__` requires at least two overloads
+ error[lint:invalid-overload] /tmp/mypy_primer/projects/django-stubs/django-stubs/utils/html.pyi:27:5: Function `format_html` requires at least two overloads
+ error[lint:invalid-overload] /tmp/mypy_primer/projects/django-stubs/django-stubs/contrib/admin/options.pyi:136:9: Function `lookup_allowed` requires at least two overloads
- Found 1415 diagnostics
+ Found 1418 diagnostics

These all look like false positives where one of the overloads is additionally decorated with @deprecated. I guess that's because deprecated is actually a class (@deprecated created a callable instance of typing_extensions.deprecated), and deprecated.__call__ is a generic function, and we don't understand generic functions that use old-style TypeVars yet.

@dhruvmanila
Copy link
Member Author

These all look like false positives where one of the overloads is additionally decorated with @deprecated. I guess that's because deprecated is actually a class (@deprecated created a callable instance of typing_extensions.deprecated), and deprecated.__call__ is a generic function, and we don't understand generic functions that use old-style TypeVars yet.

Yes, sorry I meant to add that to the PR description as I looked at them but somehow missed to do so. Thank you for looking at it.

We don't support @deprecated decorator yet. In general, I think this issue will pop-up if there are any other non-special cased decorator that are present in addition to @overload because the inferred type would change to something other than FunctionType which is then not included in the overload chain.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@dhruvmanila dhruvmanila force-pushed the dhruv/check-invalid-overloads branch from 3fbeae4 to 8a219b5 Compare April 29, 2025 13:45
@dhruvmanila
Copy link
Member Author

I looked at the mypy-primer failure in #17681 which is on porcupine project, it started happening after #16538 was merged. It seems that for the following code:

from typing import Any, Callable, TypeVar

_T = TypeVar("_T")


# https://github.com/python/typing/issues/769
def copy_type(f: _T) -> Callable[[Any], _T]:
    return lambda x: x


@copy_type(open)
def backup_open():
    pass

The inference builder binds the backup_open definition to the builtin open function which looks correct. This means that the overload check for backup_open which is defined in the current scope is actually being done on open which is defined elsewhere.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@dhruvmanila dhruvmanila merged commit ad1a8da into main Apr 30, 2025
34 checks passed
@dhruvmanila dhruvmanila deleted the dhruv/check-invalid-overloads branch April 30, 2025 14:07
@dhruvmanila dhruvmanila added the diagnostics Related to reporting of diagnostics. label Apr 30, 2025
dcreager added a commit that referenced this pull request May 1, 2025
* main:
  [red-knot] Preliminary `NamedTuple` support (#17738)
  [red-knot] Add tests for classes that have incompatible `__new__` and `__init__` methods (#17747)
  Update dependency vite to v6.2.7 (#17746)
  [red-knot] Update call binding to return all matching overloads (#17618)
  [`airflow`] apply Replacement::AutoImport to `AIR312` (#17570)
  [`ruff`] Add fix safety section (`RUF028`) (#17722)
  [syntax-errors] Detect single starred expression assignment `x = *y` (#17624)
  py-fuzzer: fix minimization logic when `--only-new-bugs` is passed (#17739)
  Fix example syntax for pydocstyle ignore_var_parameters option (#17740)
  [red-knot] Update salsa to prevent panic in custom panic-handler (#17742)
  [red-knot] Ban direct instantiation of generic protocols as well as non-generic ones (#17741)
  [red-knot] Lookup of `__new__` (#17733)
  [red-knot] Check decorator consistency on overloads (#17684)
  [`flake8-use-pathlib`] Avoid suggesting `Path.iterdir()` for `os.listdir` with file descriptor (`PTH208`) (#17715)
  [red-knot] Check overloads without an implementation (#17681)
  Expand Semantic Syntax Coverage (#17725)
  [red-knot] Check for invalid overload usages (#17609)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

diagnostics Related to reporting of diagnostics. ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants