Cache types.GenericAlias in getitems #101859

Open
eltoder opened this issue Feb 13, 2023 · 9 comments
Labels
performance, stdlib, topic-typing, type-feature

Comments

@eltoder
Contributor

eltoder commented Feb 13, 2023

The "old-style" generic classes from the typing module (List, Dict, etc.) cache the typing._GenericAlias instances returned from __getitem__:

>>> from typing import List
>>> List[int] is List[int]
True

But when this functionality was ported to built-in types, caching was not implemented:

>>> list[int] is list[int]
False

Because the same types are very commonly subscripted with the same arguments, caching these aliases would avoid creating duplicate objects and reduce the memory footprint of heavily type-annotated modules.
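To illustrate the idea, here is a minimal Python-level sketch, assuming a hypothetical helper named `cached_alias` that memoizes `types.GenericAlias` construction with `functools.lru_cache`. This is not how CPython would implement it (the real change would live in C); it only shows the effect a cache would have on identity.

```python
import functools
import types


# Hypothetical helper, not part of CPython: memoize alias creation so that
# the same (origin, args) pair always yields the same object.
@functools.lru_cache(maxsize=128)
def cached_alias(origin, args):
    return types.GenericAlias(origin, args)


a = cached_alias(list, (int,))
b = cached_alias(list, (int,))

print(a is b)          # same object thanks to the cache
print(a == list[int])  # still equal to the alias created by list.__getitem__
```

With a cache like this, repeated annotations such as `list[int]` across a module would share one alias object instead of allocating a new one each time.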

@eltoder added the type-feature label Feb 13, 2023
@arhadthedev added the stdlib and topic-typing labels Feb 13, 2023
@sobolevn added the performance label Feb 13, 2023
@Fidget-Spinner
Member

Oh yeah. I think @gvanrossum envisioned that at some point https://github.com/python/cpython/blob/main/Objects/genericaliasobject.c#L917

We should definitely implement some type of cache. Even if it doesn't speed things up much, it should save space.

@gvanrossum
Member

Agreed.

@sobolevn
Member

sobolevn commented Apr 9, 2023

@Fidget-Spinner I am working on this right now, but I have several technical questions.

Usually, we place caches in module state. But since genericaliasobject is special, it does not have any module-level state (and should not have any).

Another option is to have a static PyObject *TYPE_CACHE global variable, but:

  1. We try to get rid of those
  2. I am not sure how to clean it up properly

Any ideas? :)

@Fidget-Spinner
Member

Fidget-Spinner commented Apr 9, 2023

Builtin object caches tend to go into interpreter state. See, for example, the slice cache:

PySliceObject *slice_cache;

The dict, tuple, and other caches are also in there.

You would need to set up and clear the state in the interpreter state initializer and destructor (sorry, I forgot where those are).
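As a rough Python analogy of that lifecycle (the real code would be C inside the interpreter), the sketch below uses a hypothetical `InterpState` class with an `alias_cache` dict that is created at initialization and emptied at teardown. All names here are illustrative assumptions, not CPython APIs.

```python
import types


class InterpState:
    """Hypothetical stand-in for per-interpreter state holding a cache."""

    def __init__(self):
        # Set up the cache when the "interpreter" is initialized.
        self.alias_cache = {}

    def get_alias(self, origin, args):
        # Return the cached alias for (origin, args), creating it on a miss.
        key = (origin, args)
        alias = self.alias_cache.get(key)
        if alias is None:
            alias = types.GenericAlias(origin, args)
            self.alias_cache[key] = alias
        return alias

    def clear(self):
        # Drop all cached aliases when the "interpreter" is finalized.
        self.alias_cache.clear()


interp = InterpState()
x = interp.get_alias(list, (int,))
y = interp.get_alias(list, (int,))
print(x is y)  # the second lookup reuses the cached object

interp.clear()
print(len(interp.alias_cache))  # nothing is left after teardown
```

Keeping the cache on a per-interpreter object rather than in a global mirrors the concern raised above about static variables and cleanup.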

@sobolevn
Member

sobolevn commented Apr 9, 2023

Thanks, I was not aware of that! 👍

@AlexWaygood
Member

Cc. @samuelcolvin, who just mentioned at the typing summit that the current caching for typing.List etc. causes problems for pydantic.

@eltoder
Contributor Author

eltoder commented Jul 9, 2023

@AlexWaygood Are there any details on what kind of issues?

@samuelcolvin
Contributor

The issue is that the order of union members can be cached after the first union is created, which causes problems for runtime type checking, where order can matter.
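A small sketch of how this can show up with the existing typing.List cache. The exact results of the identity check and the printed argument order depend on the Python version and on the state of the internal cache, so they are left unasserted here; what is stable is that unions compare equal regardless of member order.

```python
from typing import List, Union

first = List[Union[int, str]]
second = List[Union[str, int]]

# Unions compare equal regardless of member order, so both subscriptions
# can map to the same cache entry.
print(Union[int, str] == Union[str, int])

# When the cached alias is reused, `second` may be the very same object as
# `first`, and its arguments may show the member order of the first union.
print(first is second)
print(second.__args__)
```

A library that tries union members in declaration order would then see the order from whichever subscription happened first, not the order written at the second site.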
