DataFrame with generic type #295
Comments
It is definitely worth thinking about which aspects of a DataFrame should be generic: each column could have its own type, the index could be generic, and the columns could be generic (they are not always strings!), but making everything generic will create quite some work :). I expect in most cases these arguments will be tricky to handle (or, even worse, require type-checker-specific plugins). For example |
We have made For Is there any reason that you couldn't just create your own type |
Thank you for responding! @twoertwin
True that. Also, users may want to cherry pick what to type hint (e.g. the columns, but not the index). Some users may choose to not use the generics at all. For that reason, I think the upcoming variadic generics are interesting.
Agreed. In fact, I think it's impossible for static type checkers (MyPy, Pylance) to fully cover all aspects due to the dynamic nature of pandas types (with methods such as
I can imagine that you are quite occupied already. Do you think it will make it to the backlog eventually? And would you welcome any suggestion from my part (in the form of a pull request or otherwise)?
Unfortunately, there is. My goal is to remain static type checker compliant and by overriding the |
Yes, aside from a few people that have done one or two PRs, there are 3 of us who have been doing most of the work.
We can keep this issue open as something to consider in the future.
We are absolutely open to PRs !! |
One probably ugly workaround could be to create a script on your side to convert the DataFrame stubs into a Protocol. You could then make this Protocol generic. |
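A minimal sketch of the generic-Protocol idea, assuming a tiny hand-written protocol rather than one generated from the real stubs. The names `FrameLike`, `ToyFrame`, `PersonSchema` and `count_rows` are hypothetical and exist only for illustration; a real conversion would have to cover far more of the DataFrame surface.

```python
from typing import Any, List, Protocol, Tuple, TypeVar

# Hypothetical schema parameter; not part of pandas or pandas-stubs.
SchemaT = TypeVar("SchemaT", covariant=True)


class FrameLike(Protocol[SchemaT]):
    """Hand-written stand-in for a protocol generated from the DataFrame stubs."""

    @property
    def shape(self) -> Tuple[int, int]: ...

    def head(self, n: int = 5) -> "FrameLike[SchemaT]": ...


class ToyFrame:
    """Toy row container that satisfies FrameLike structurally (no inheritance)."""

    def __init__(self, rows: List[List[Any]]) -> None:
        self._rows = rows

    @property
    def shape(self) -> Tuple[int, int]:
        n_cols = len(self._rows[0]) if self._rows else 0
        return (len(self._rows), n_cols)

    def head(self, n: int = 5) -> "ToyFrame":
        return ToyFrame(self._rows[:n])


class PersonSchema:
    """Hypothetical marker type describing the frame's contents."""


def count_rows(frame: FrameLike[PersonSchema]) -> int:
    # Only the members declared on the protocol are used here.
    return frame.shape[0]
```

Because the match is structural, a type checker has to compare every declared member whenever the protocol is used, so keeping the protocol small is one way to limit the checking cost.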
One reason for not adding a mostly unused Generic parameter in the stubs is that pyright's |
Makes sense.
Great! I might dive into this when I can find the time for it. As I said, I think the approach should be towards variadic generics, but it will require some investigation on my part.
That's a clever workaround (with an acceptable smell to it indeed 😃 ). One problem with Protocols that I already found, is that at least MyPy will take forever to type check once the Protocol becomes larger. It makes sense: it needs to check every method and attribute separately every time it encounters the Protocol. I tried basically what you proposed for Another workaround I'm now thinking about is to fork this repo and add the generics. I will need to figure out some synchronization process to keep up to date with the work on this repo.
Hmm I see. Maybe that will be resolved when using variadic generics, but I'm not sure. |
The following might be faster but works only for mypy:

from typing import (
    Protocol,
    TypeVar,
)
import pandas as pd

GenericType = TypeVar("GenericType", covariant=True)

class DataFrame(Protocol[GenericType], pd.DataFrame):
    ...

def foo(x: DataFrame) -> None:
    ...

foo(pd.DataFrame())

Pyright errors with: |
I would venture a guess here that |
Smart trick. However they seem to have fixed this with MyPy==0.971 (latest). It reports similarly to
I agree that this indeed is expected behavior. |
Hi there!
I'm the developer/maintainer of nptyping, a little library for type hinting around numpy (and pandas DataFrame soon).
What I'm working on, is to allow a type hint for pandas.DataFrame with a "structure" describing the contents:
(ignore the underscore for now)
I'd like to make use of ("extend") the stubs of this repo. For that, I would need the DataFrame to take a generic type.
I've done this same thing with numpy's ndarray. This is how numpy hint their ndarray with generics:
For DataFrame, I was hoping to find something similar, like:
This would allow combining these stubs with my type, while keeping the type checkers (MyPy, Pylance, ...) happy. I can imagine though that you may want to wait for Python3.11 with this and use variadic generics. Have you considered / would you consider adding generics to the pandas types?
TLDR
Are there any plans to add generics to the stub for DataFrame (and the other pandas types)?
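To make the request concrete, here is a rough sketch of a frame-like class that accepts one type parameter. It uses a stand-in class, not pandas or the actual stubs; `StubbedFrame`, `StructureT`, `NameAndAge` and `column_names` are hypothetical names chosen for illustration only.

```python
from typing import Any, Dict, Generic, List, TypeVar

# Hypothetical type parameter describing the frame's contents.
StructureT = TypeVar("StructureT")


class StubbedFrame(Generic[StructureT]):
    """Stand-in for a DataFrame class whose stub takes a generic parameter."""

    def __init__(self, data: Dict[str, List[Any]]) -> None:
        self._data = data

    @property
    def columns(self) -> List[str]:
        # Column names in insertion order.
        return list(self._data)


class NameAndAge:
    """Hypothetical marker for a frame with 'name' and 'age' columns."""


def column_names(frame: StubbedFrame[NameAndAge]) -> List[str]:
    # The annotation carries the structure; runtime behaviour is unchanged.
    return frame.columns


frame: StubbedFrame[NameAndAge] = StubbedFrame({"name": ["Ann"], "age": [31]})
```

With a parameter like this on the stubbed class, a third-party type could be passed in as the structure while the type checker still sees the regular methods of the class.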