Conversation

@sharkdp
Contributor

@sharkdp sharkdp commented Aug 5, 2025

Summary

This PR adds type inference for key-based access on TypedDicts and a new diagnostic for invalid subscript accesses:

from typing import TypedDict

class Person(TypedDict):
    name: str
    age: int | None

alice = Person(name="Alice", age=25)

reveal_type(alice["name"])  # revealed: str
reveal_type(alice["age"])  # revealed: int | None

And when you try to access a non-existing key:

[image: screenshot of the diagnostic emitted for the invalid key access]
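For comparison, here is a minimal sketch of how the two cases behave at runtime. A TypedDict instance is a plain dict when the program runs, so a misspelled key only fails once that line executes. The typo `"nme"` is made up for illustration, and `Optional[int]` stands in for `int | None` so the sketch also runs on older Python versions.

```python
from typing import Optional, TypedDict


class Person(TypedDict):
    name: str
    age: Optional[int]


alice = Person(name="Alice", age=25)

# Valid keys: these are the accesses the PR infers `str` and `int | None` for.
print(alice["name"], alice["age"])

# Hypothetical typo: at runtime this raises KeyError. A checker that knows
# the declared keys can report the problem before the code runs.
try:
    alice["nme"]
except KeyError as err:
    missing_key = err.args[0]
    print(f"KeyError for key {missing_key!r}")
```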

part of astral-sh/ty#154

Test Plan

Updated Markdown tests

@sharkdp sharkdp added the "ty" (Multi-file analysis & type inference) and "ecosystem-analyzer" labels Aug 5, 2025
@github-actions
Contributor

github-actions bot commented Aug 5, 2025

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

@github-actions
Contributor

github-actions bot commented Aug 5, 2025

mypy_primer results

Changes were detected when running on open source projects
discord.py (https://github.com/Rapptz/discord.py)
- discord/soundboard.py:217:70: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- discord/state.py:1800:34: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- discord/state.py:1806:93: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- discord/webhook/async_.py:764:34: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- discord/webhook/async_.py:766:100: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- Found 531 diagnostics
+ Found 526 diagnostics

pydantic (https://github.com/pydantic/pydantic)
+ pydantic/_internal/_config.py:160:44: error[invalid-key] TypedDict `ConfigDict` cannot be indexed with a key of type `str`
+ pydantic/_internal/_generate_schema.py:2762:21: error[unresolved-attribute] Type `DefinitionReferenceSchema & ~None` has no attribute `clear`
+ pydantic/_internal/_generate_schema.py:2769:21: error[unresolved-attribute] Type `DefinitionReferenceSchema & ~None` has no attribute `clear`
+ pydantic/deprecated/config.py:22:43: error[invalid-key] TypedDict `ConfigDict` cannot be indexed with a key of type `str`
+ pydantic/json_schema.py:2054:47: error[invalid-key] Invalid key access on TypedDict `IncExSeqSerSchema`: Unknown key "schema"
+ pydantic/json_schema.py:2054:47: error[invalid-key] Invalid key access on TypedDict `IncExDictSerSchema`: Unknown key "schema"
- Found 766 diagnostics
+ Found 772 diagnostics

cloud-init (https://github.com/canonical/cloud-init)
- cloudinit/netinfo.py:598:25: warning[possibly-unbound-attribute] Attribute `get` on type `str | @Todo(Support for `TypedDict`)` is possibly unbound
+ cloudinit/netinfo.py:598:25: warning[possibly-unbound-attribute] Attribute `get` on type `str | Unknown` is possibly unbound
- cloudinit/netinfo.py:611:25: warning[possibly-unbound-attribute] Attribute `get` on type `str | @Todo(Support for `TypedDict`)` is possibly unbound
+ cloudinit/netinfo.py:611:25: warning[possibly-unbound-attribute] Attribute `get` on type `str | Unknown` is possibly unbound
+ cloudinit/sources/DataSourceVMware.py:1000:48: error[invalid-key] TypedDict `Interface` cannot be indexed with a key of type `Literal[0]`
+ cloudinit/sources/DataSourceVMware.py:1001:26: error[invalid-key] TypedDict `Interface` cannot be indexed with a key of type `Literal[0]`
+ cloudinit/sources/DataSourceVMware.py:1008:48: error[invalid-key] TypedDict `Interface` cannot be indexed with a key of type `Literal[0]`
+ cloudinit/sources/DataSourceVMware.py:1009:26: error[invalid-key] TypedDict `Interface` cannot be indexed with a key of type `Literal[0]`
+ cloudinit/sources/DataSourceVMware.py:1018:53: error[invalid-key] TypedDict `Interface` cannot be indexed with a key of type `Literal[0]`
+ cloudinit/sources/DataSourceVMware.py:1028:53: error[invalid-key] TypedDict `Interface` cannot be indexed with a key of type `Literal[0]`
- Found 607 diagnostics
+ Found 613 diagnostics

freqtrade (https://github.com/freqtrade/freqtrade)
+ freqtrade/optimize/optimize_reports/optimize_reports.py:503:9: error[invalid-argument-type] Argument to function `generate_pair_metrics` is incorrect: Expected `DataFrame`, found `Series[Any] | Unknown`
+ freqtrade/optimize/optimize_reports/optimize_reports.py:514:91: error[invalid-argument-type] Argument to function `generate_all_periodic_breakdown_stats` is incorrect: Expected `list[Unknown]`, found `DataFrame`
+ freqtrade/optimize/optimize_reports/optimize_reports.py:552:48: error[invalid-argument-type] Argument to function `calculate_trade_volume` is incorrect: Expected `list[dict[str, Any]]`, found `list[dict[Hashable, Any]]`
- Found 375 diagnostics
+ Found 378 diagnostics

paasta (https://github.com/yelp/paasta)
+ paasta_tools/setup_prometheus_adapter_config.py:734:8: error[unsupported-operator] Operator `in` is not supported for types `str` and `None`, in comparing `Literal["seriesQuery"]` with `dict[Unknown, Unknown] | None`
+ paasta_tools/setup_prometheus_adapter_config.py:737:24: error[non-subscriptable] Cannot subscript object of type `None` with no `__getitem__` method
+ paasta_tools/setup_prometheus_adapter_config.py:738:25: error[non-subscriptable] Cannot subscript object of type `None` with no `__getitem__` method
+ paasta_tools/setup_prometheus_adapter_config.py:753:22: error[non-subscriptable] Cannot subscript object of type `None` with no `__getitem__` method
+ paasta_tools/setup_prometheus_adapter_config.py:772:22: warning[possibly-unbound-attribute] Attribute `get` on type `dict[Unknown, Unknown] | None` is possibly unbound
- Found 883 diagnostics
+ Found 888 diagnostics

altair (https://github.com/vega/altair)
+ tests/vegalite/v6/test_api.py:558:17: error[invalid-key] Invalid key access on TypedDict `_Value`: Unknown key "condition"
+ tests/vegalite/v6/test_api.py:604:32: error[invalid-key] Invalid key access on TypedDict `_Value`: Unknown key "condition"
- Found 1304 diagnostics
+ Found 1306 diagnostics

rclip (https://github.com/yurijmikhalevich/rclip)
+ rclip/main.py:38:13: error[invalid-key] TypedDict `ImageMeta` cannot be indexed with a key of type `str`
+ rclip/main.py:38:27: error[invalid-key] TypedDict `Image` cannot be indexed with a key of type `str`
- Found 12 diagnostics
+ Found 14 diagnostics

meson (https://github.com/mesonbuild/meson)
+ mesonbuild/dependencies/dev.py:589:75: error[invalid-argument-type] Argument to function `version_compare_many` is incorrect: Expected `Iterable[str]`, found `str | None`
+ mesonbuild/interpreter/interpreter.py:846:65: error[invalid-argument-type] Argument to bound method `__init__` is incorrect: Expected `bool`, found `bool | None`
+ mesonbuild/interpreter/interpreter.py:1779:98: error[invalid-key] Invalid key access on TypedDict `FindProgram`: Unknown key "version_argument"
+ mesonbuild/interpreter/interpreter.py:1820:29: error[invalid-key] Invalid key access on TypedDict `FuncDependency`: Unknown key "include_type"
+ mesonbuild/interpreter/interpreter.py:2286:29: error[invalid-key] Invalid key access on TypedDict `BaseTest`: Unknown key "args"
+ mesonbuild/interpreter/interpreter.py:2291:29: error[invalid-key] Invalid key access on TypedDict `BaseTest`: Unknown key "protocol"
+ mesonbuild/interpreter/interpreter.py:2293:29: error[invalid-key] Invalid key access on TypedDict `BaseTest`: Unknown key "verbose"
+ mesonbuild/interpreter/interpreter.py:2336:19: error[invalid-key] Invalid key access on TypedDict `FuncInstallHeaders`: Unknown key "preserve_path"
+ mesonbuild/interpreter/interpreter.py:2521:23: error[invalid-key] Invalid key access on TypedDict `FuncInstallData`: Unknown key "preserve_path"
+ mesonbuild/interpreter/interpreter.py:2527:90: error[invalid-key] Invalid key access on TypedDict `FuncInstallData`: Unknown key "install_tag" - did you mean "install_dir"?
+ mesonbuild/interpreter/interpreter.py:2528:60: error[invalid-key] Invalid key access on TypedDict `FuncInstallData`: Unknown key "preserve_path"
+ mesonbuild/interpreter/interpreter.py:2595:32: error[invalid-key] Invalid key access on TypedDict `FuncInstallSubdir`: Unknown key "install_tag" - did you mean "install_dir"?
- mesonbuild/interpreter/interpreter.py:2704:47: error[invalid-argument-type] Argument to function `bold` is incorrect: Expected `str`, found `@Todo(Support for `TypedDict`) | str | ExternalProgram`
+ mesonbuild/interpreter/interpreter.py:2704:47: error[invalid-argument-type] Argument to function `bold` is incorrect: Expected `str`, found `str | ExternalProgram`
+ mesonbuild/interpreter/interpreter.py:2711:69: error[invalid-argument-type] Argument to function `do_conf_file` is incorrect: Expected `ConfigurationData`, found `(dict[str, str | int] & ~dict[Unknown, Unknown]) | ConfigurationData`
+ mesonbuild/interpreter/interpreter.py:2728:54: error[invalid-argument-type] Argument to function `dump_conf_header` is incorrect: Expected `ConfigurationData`, found `(dict[str, str | int] & ~dict[Unknown, Unknown]) | ConfigurationData`
+ mesonbuild/interpreter/interpreter.py:2729:13: error[invalid-assignment] Object of type `Literal[True]` is not assignable to attribute `used` on type `(dict[str, str | int] & ~dict[Unknown, Unknown]) | ConfigurationData`
+ mesonbuild/interpreter/interpreter.py:2741:47: error[invalid-argument-type] Argument to function `substitute_values` is incorrect: Expected `list[str | ExternalProgram]`, found `list[Executable | ExternalProgram | Compiler | File | str]`
- mesonbuild/interpreter/interpreter.py:2742:47: error[invalid-argument-type] Argument to function `bold` is incorrect: Expected `str`, found `@Todo(Support for `TypedDict`) | str | ExternalProgram`
+ mesonbuild/interpreter/interpreter.py:2742:47: error[invalid-argument-type] Argument to function `bold` is incorrect: Expected `str`, found `str | ExternalProgram`
- mesonbuild/interpreter/interpreter.py:2756:56: error[invalid-argument-type] Argument to function `bold` is incorrect: Expected `str`, found `(@Todo(Support for `TypedDict`) & ~AlwaysTruthy & ~AlwaysFalsy) | (str & ~AlwaysFalsy) | (ExternalProgram & ~AlwaysTruthy & ~AlwaysFalsy)`
+ mesonbuild/interpreter/interpreter.py:2756:56: error[invalid-argument-type] Argument to function `bold` is incorrect: Expected `str`, found `(str & ~AlwaysFalsy) | (ExternalProgram & ~AlwaysTruthy & ~AlwaysFalsy)`
+ mesonbuild/interpreter/interpreter.py:2763:21: error[invalid-key] Invalid key access on TypedDict `ConfigureFile`: Unknown key "copy"
+ mesonbuild/interpreter/interpreter.py:3418:24: error[invalid-key] Invalid key access on TypedDict `Executable`: Unknown key "language_args"
+ mesonbuild/interpreter/interpreter.py:3418:24: error[invalid-key] Invalid key access on TypedDict `StaticLibrary`: Unknown key "language_args"
+ mesonbuild/interpreter/interpreter.py:3418:24: error[invalid-key] Invalid key access on TypedDict `SharedLibrary`: Unknown key "language_args"
+ mesonbuild/interpreter/interpreter.py:3418:24: error[invalid-key] Invalid key access on TypedDict `SharedModule`: Unknown key "language_args"
+ mesonbuild/interpreter/interpreter.py:3418:24: error[invalid-key] Invalid key access on TypedDict `Jar`: Unknown key "language_args"
+ mesonbuild/interpreter/interpreter.py:3423:24: error[invalid-key] Invalid key access on TypedDict `Executable`: Unknown key "language_args"
+ mesonbuild/interpreter/interpreter.py:3423:24: error[invalid-key] Invalid key access on TypedDict `StaticLibrary`: Unknown key "language_args"
+ mesonbuild/interpreter/interpreter.py:3423:24: error[invalid-key] Invalid key access on TypedDict `SharedLibrary`: Unknown key "language_args"
+ mesonbuild/interpreter/interpreter.py:3423:24: error[invalid-key] Invalid key access on TypedDict `SharedModule`: Unknown key "language_args"
+ mesonbuild/interpreter/interpreter.py:3423:24: error[invalid-key] Invalid key access on TypedDict `Jar`: Unknown key "language_args"
+ mesonbuild/modules/cmake.py:331:109: error[invalid-argument-type] Argument is incorrect: Expected `str`, found `str | None`
+ mesonbuild/modules/cmake.py:409:9: error[invalid-assignment] Object of type `Literal[True]` is not assignable to attribute `used` on type `ConfigurationData | (dict[Unknown, Unknown] & ~dict[Unknown, Unknown])`
- mesonbuild/modules/gnome.py:1554:55: warning[possibly-unbound-attribute] Attribute `get_target_dir` on type `Unknown | Backend | None` is possibly unbound
- mesonbuild/modules/gnome.py:1600:24: warning[possibly-unbound-attribute] Attribute `get_executable_serialisation` on type `Unknown | Backend | None` is possibly unbound
- mesonbuild/modules/pkgconfig.py:733:103: error[invalid-argument-type] Argument is incorrect: Expected `str`, found `(@Todo(Support for `TypedDict`) & ~AlwaysFalsy) | None | Unknown | str`
+ mesonbuild/modules/pkgconfig.py:733:103: error[invalid-argument-type] Argument is incorrect: Expected `str`, found `str | None | Unknown`
+ mesonbuild/modules/python.py:291:34: error[invalid-key] Invalid key access on TypedDict `PyInstallKw`: Unknown key "preserve_path"
- Found 776 diagnostics
+ Found 804 diagnostics

openlibrary (https://github.com/internetarchive/openlibrary)
+ openlibrary/plugins/worksearch/code.py:406:37: error[invalid-argument-type] Argument to function `urlsafe` is incorrect: Expected `str`, found `str | None`
+ openlibrary/plugins/worksearch/code.py:410:13: warning[possibly-unbound-attribute] Attribute `split` on type `str | None` is possibly unbound
+ openlibrary/plugins/worksearch/code.py:454:33: error[invalid-key] TypedDict `SolrDocument` cannot be indexed with a key of type `str`
- openlibrary/solr/updater/edition.py:346:37: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- openlibrary/solr/updater/edition.py:348:57: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- Found 697 diagnostics
+ Found 698 diagnostics

zulip (https://github.com/zulip/zulip)
+ corporate/views/support.py:582:35: error[invalid-argument-type] Argument to bound method `__init__` is incorrect: Expected `Realm | None`, found `Self@Model`
+ corporate/views/support.py:894:52: error[invalid-argument-type] Argument to bound method `__init__` is incorrect: Expected `RemoteRealm`, found `Self@Model`
+ corporate/views/support.py:894:65: warning[possibly-unresolved-reference] Name `remote_realm` used when possibly not defined
+ corporate/views/support.py:902:52: error[invalid-argument-type] Argument to bound method `__init__` is incorrect: Expected `RemoteZulipServer`, found `Self@Model`
+ corporate/views/support.py:902:66: warning[possibly-unresolved-reference] Name `remote_server` used when possibly not defined
- Found 7420 diagnostics
+ Found 7425 diagnostics
Memory usage changes were detected when running on open source projects
prefect (https://github.com/PrefectHQ/prefect)
-     memo metadata = ~76MB
+     memo metadata = ~80MB

@github-actions
Contributor

github-actions bot commented Aug 5, 2025

ecosystem-analyzer results

| Lint rule | Added | Removed | Changed |
|---|---|---|---|
| invalid-key | 37 | 0 | 0 |
| invalid-argument-type | 13 | 0 | 4 |
| unused-ignore-comment | 0 | 7 | 0 |
| possibly-unbound-attribute | 2 | 2 | 2 |
| non-subscriptable | 3 | 0 | 0 |
| invalid-assignment | 2 | 0 | 0 |
| possibly-unresolved-reference | 2 | 0 | 0 |
| unresolved-attribute | 2 | 0 | 0 |
| unsupported-operator | 1 | 0 | 0 |
| Total | 62 | 9 | 6 |

Full report with detailed diff

memchr = { workspace = true }
strum = { workspace = true }
strum_macros = { workspace = true }
strsim = "0.11.1"
Contributor Author

I know that we had a custom Levenshtein implementation in #18705, but that was removed again. And strsim is already a transitive dependency for the CLI version of ty at least — via clap.

Happy to replace that with something else though.

Contributor

I have no opinions here -- I'm in favor of whatever works and is implemented :)

Member

@AlexWaygood AlexWaygood Aug 6, 2025

The advantage of going with the custom implementation in #18705 (which I think it would be fairly easy to bring back -- it's quite isolated as a module) is that Brent and I based it directly on the CPython implementation of this feature, and we brought in most of CPython's tests for this feature. CPython's implementation of this feature is very battle-tested at this point: it's been present in several stable releases of Python and initially received a large number of bug reports (which have since been fixed) regarding bad "Did you mean?" suggestions. So at this point I think we can be pretty confident that CPython's implementation is very well tuned for giving good suggestions for typos in Python code specifically.

Having said that, it's obviously nice for us to have to maintain less code, and exactly which Levenshtein implementation we go with probably isn't the most important issue for us right now :-)

}

/// Suggest a name from `existing_names` that is similar to `wrong_name`.
pub(super) fn did_you_mean<S: AsRef<str>, T: AsRef<str>>(
Contributor Author

I copied this function more or less 1:1 from another project of mine. It's not like I have a huge amount of experience with the heuristic here, but I did some iterations on it and it usually gives decent results.
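To make the kind of heuristic under discussion concrete, here is a rough Python sketch of a "did you mean" lookup. The real function is written in Rust on top of `strsim`, and its metric and threshold are not shown in this thread, so the plain Levenshtein distance and the cutoff of one edit per three characters below are assumptions chosen purely for illustration.

```python
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def did_you_mean(wrong_name: str, existing_names: Iterable[str]) -> Optional[str]:
    """Return the closest existing name, or None if nothing is close enough."""
    # Assumed cutoff: accept at most one edit per three characters.
    max_distance = max(1, len(wrong_name) // 3)
    best = min(
        existing_names,
        key=lambda name: levenshtein(wrong_name, name),
        default=None,
    )
    if best is not None and levenshtein(wrong_name, best) <= max_distance:
        return best
    return None
```

With this cutoff, `did_you_mean("install_tag", ["install_dir", "sources"])` returns `"install_dir"`, while a name with no close match returns `None`, so no suggestion would be attached to the diagnostic.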

@github-actions

This comment was marked as resolved.

@sharkdp sharkdp force-pushed the david/typeddict-getitem branch from ff702fb to 574bff2 Compare August 5, 2025 19:51
@sharkdp sharkdp marked this pull request as ready for review August 5, 2025 19:52
reveal_type(alice["age"]) # revealed: Unknown

# TODO: this should reveal `Unknown`, and it should emit an error
reveal_type(alice["non_existing"]) # revealed: Unknown
Contributor Author

So, all of these still don't work, because the type of alice here is simply dict[Unknown, Unknown]. See below for the real tests for this feature.

Contributor

@carljm carljm left a comment

Looks great, thank you!!

Comment on lines +2044 to +2056
let overloads = fields.iter().map(|(name, field)| {
    let key_type = Type::StringLiteral(StringLiteralType::new(db, name.as_str()));

    Signature::new(
        Parameters::new([
            Parameter::positional_only(Some(Name::new_static("self")))
                .with_annotated_type(instance_ty),
            Parameter::positional_only(Some(Name::new_static("key")))
                .with_annotated_type(key_type),
        ]),
        Some(field.declared_ty),
    )
});
Contributor

I like the synthesized-overloads approach a lot. I seem to remember @erictraut mentioning before that neither pyright nor mypy implement TypedDict __getitem__ or __setitem__ by synthesizing overloads. I'm curious if you've looked at the more advanced TypedDict features we have yet to implement, and considered whether this approach will be able to handle those features also?
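For readers less familiar with the Rust side, the synthesized signatures for the Person example from the PR description would look roughly like the hand-written overloads below. This is only a sketch of the shape: `PersonView` is a hypothetical stand-in class, and a real TypedDict needs no such wrapper at runtime.

```python
from typing import Literal, Optional, Union, overload


class PersonView:
    """Hypothetical stand-in that spells out one overload per TypedDict key."""

    def __init__(self, name: str, age: Optional[int]) -> None:
        self._data = {"name": name, "age": age}

    # One overload per declared key: the literal key type selects the value type.
    @overload
    def __getitem__(self, key: Literal["name"], /) -> str: ...
    @overload
    def __getitem__(self, key: Literal["age"], /) -> Optional[int]: ...

    def __getitem__(self, key: str, /) -> Union[str, int, None]:
        # Runtime implementation; the overloads above only guide type checkers.
        return self._data[key]


alice = PersonView("Alice", 25)
print(alice["name"], alice["age"])
```

A key that matches none of the literal overloads has no applicable signature, which is where an invalid-key style diagnostic can be attached.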

Contributor Author

@sharkdp sharkdp Aug 6, 2025

I think that the other methods here (get, update, etc.) can be handled in a similar way to __getitem__.

For validating writes to keys, we should be able to synthesize a similar overload set for __setitem__.

I do not yet have a good understanding of all TypedDict features to come (I merely read the docs/spec, and wrote a list of all tasks). So there might very well be things that cannot be handled this way. But I currently don't see why that would be a reason not to solve those basic features with synthesized overloads.

My biggest worry is that this approach here does not scale well to large TypedDicts. Synthesizing N overloads is probably not that bad, but resolving calls to these overloads probably scales superlinearly in N?

Member

> My biggest worry is that this approach here does not scale well to large TypedDicts. Synthesizing N overloads is probably not that bad, but resolving calls to these overloads probably scales superlinearly in N?

I think we just need to see how this goes. We can always change the approach later if it proves a problem in practice.

For tuple.__getitem__, I attempted to mitigate this problem by synthesizing the minimum number of overloads. This is done by combining them where possible. E.g. for tuple[int, str, int], we "only" synthesize these overloads -- the overloads for the first element and the third element are combined:

@overload
def __getitem__(self, index: Literal[0, 2, -1, -3], /) -> int: ...
@overload
def __getitem__(self, index: Literal[1, -2], /) -> str: ...
@overload
def __getitem__(self, index: SupportsIndex, /) -> int | str: ...
@overload
def __getitem__(self, index: slice, /) -> tuple[int | str, ...]: ...
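The grouping behind the first two overloads can be sketched in a few lines of Python. This illustrates the idea only and is not the Rust implementation; `group_tuple_indices` is a hypothetical helper name.

```python
from typing import Dict, Sequence, Set


def group_tuple_indices(element_types: Sequence[type]) -> Dict[type, Set[int]]:
    """Group the valid literal indices of a fixed-length tuple by element type."""
    length = len(element_types)
    groups: Dict[type, Set[int]] = {}
    for index, element_type in enumerate(element_types):
        # Each element is reachable through a non-negative and a negative index.
        groups.setdefault(element_type, set()).update({index, index - length})
    return groups


# tuple[int, str, int]: the first and third element share one group.
groups = group_tuple_indices([int, str, int])
print(groups)
```

Each resulting group corresponds to one `Literal[...]` overload, so the number of literal overloads equals the number of distinct element types rather than the tuple length.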

You could do a similar thing for TypedDict __getitem__ methods: for Foo in the following example, it looks like we synthesize two overloads, but really only one is required (they can be combined, since the value type is the same for both keys):

from typing import TypedDict

class Foo(TypedDict):
    x: str
    y: str
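A minimal Python sketch of that grouping step, assuming the keys are simply bucketed by their declared value type (`group_keys_by_value_type` is a hypothetical helper, not part of ty):

```python
from collections import defaultdict
from typing import Dict, List, TypedDict, get_type_hints


class Foo(TypedDict):
    x: str
    y: str


def group_keys_by_value_type(typed_dict: type) -> Dict[object, List[str]]:
    """Bucket the declared keys of a TypedDict class by their value type."""
    groups: Dict[object, List[str]] = defaultdict(list)
    for key, value_type in get_type_hints(typed_dict).items():
        groups[value_type].append(key)
    return dict(groups)


groups = group_keys_by_value_type(Foo)
print(groups)
```

For Foo this yields a single group, which would correspond to one combined overload along the lines of `def __getitem__(self, key: Literal["x", "y"], /) -> str: ...`.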

Contributor Author

@sharkdp sharkdp Aug 6, 2025

Yeah, that's a great optimization idea. And potentially more impactful than for tuples (as I would assume TypedDicts to have more items, on average). There's probably also a lot of TypedDicts out there where the majority of value types are just int or str.

What's not immediately clear to me is if this is an optimization at all? We'll spend more time when synthesizing the overloads, and we have to allocate an additional hashmap to group the items. And it's not obvious to me that this will even lead to faster overload resolution? It might be that creating / iterating over these union types is more expensive than a larger number of overloads with simple Literal["key"] argument types? I guess we'll need a micro-benchmark with a large TypedDict.
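One way to get input for such a micro-benchmark is to generate it. The sketch below produces a module with one large TypedDict and one subscript access per key, which could then be fed to the checker under a timer. The names `Big`, `field_N`, and `read_all` are made up, and the generator expects at least one field.

```python
def make_large_typeddict_source(num_fields: int) -> str:
    """Generate a module with a TypedDict of `num_fields` keys and one read per key."""
    lines = ["from typing import TypedDict", "", "", "class Big(TypedDict):"]
    lines += [f"    field_{i}: int" for i in range(num_fields)]
    lines += ["", "", "def read_all(d: Big) -> None:"]
    lines += [f'    d["field_{i}"]' for i in range(num_fields)]
    return "\n".join(lines) + "\n"


source = make_large_typeddict_source(1000)
compile(source, "<generated>", "exec")  # sanity check: the module parses
```

Giving every field the same value type, as here, is the case where combining overloads would help most; alternating a few value types would exercise the grouping cost as well.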

I wrote down two tasks in astral-sh/ty#154

@sharkdp
Contributor Author

sharkdp commented Aug 6, 2025

Looked again at almost all ecosystem changes and they're basically all true positives (a few known limitations unrelated to TypedDicts)!
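Going by the messages in the mypy_primer output, one recurring pattern is indexing a TypedDict with a key whose static type is plain `str` rather than a literal. A small sketch of that shape follows; the `Image` fields here are hypothetical and are not taken from rclip.

```python
from typing import TypedDict


class Image(TypedDict):
    path: str
    hash: str


def read_field(image: Image, field: str) -> object:
    # `field` is an arbitrary `str`, not one of the literal keys, so a
    # checker cannot tell which item (if any) is being read. The primer
    # output above shows this shape reported as invalid-key.
    return image[field]


print(read_field(Image(path="a.png", hash="abc"), "path"))
```

This works at runtime whenever the key happens to exist, which is why such code tends to go unnoticed until a checker understands TypedDict subscripts.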

@sharkdp sharkdp merged commit 4887bdf into main Aug 6, 2025
38 checks passed
@sharkdp sharkdp deleted the david/typeddict-getitem branch August 6, 2025 07:36
Member

@AlexWaygood AlexWaygood left a comment

Nice!!
