DataFrame does not play well with classes extending it #2859
Comments
Changed all occurrences of `return DataFrame` to `return self._constructor` and changed the latter to return `self.__class__` rather than `DataFrame` by default. Also made class methods like `DataFrame.from_dict` work on the `cls` passed to them rather than on the hard-coded `DataFrame`. This is a minimal quick win for the cases described in issue pandas-dev#2859. A formal approach to making classes friendly to subclassing might still be needed; see the discussion in the issue.
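As a rough illustration of the pattern this change describes, here is a minimal sketch that does not use pandas at all. `Frame`, `TaggedFrame`, and `head` are hypothetical stand-ins invented for this example; only the `_constructor` / `cls` idea comes from the description above.

```python
class Frame:
    """Toy stand-in for DataFrame (hypothetical, not pandas code)."""

    def __init__(self, data):
        # data: dict mapping a column name to a list of values
        self.data = {name: list(values) for name, values in data.items()}

    @property
    def _constructor(self):
        # Return the runtime class so subclasses get instances of themselves.
        return self.__class__

    @classmethod
    def from_dict(cls, data):
        # Build through `cls` rather than a hard-coded `Frame`.
        return cls(data)

    def head(self, n):
        # Derived objects go through `_constructor` instead of `Frame(...)`.
        return self._constructor(
            {name: values[:n] for name, values in self.data.items()}
        )


class TaggedFrame(Frame):
    """Subclass that overrides nothing."""


frame = TaggedFrame.from_dict({"a": [1, 2, 3]})
print(type(frame).__name__)          # TaggedFrame
print(type(frame.head(2)).__name__)  # TaggedFrame
```

One caveat of returning `self.__class__`: it only works if the subclass keeps a constructor signature compatible with the base class, which may be part of why a more formal approach is still under discussion.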
In the cases described here it looks trivial, though I agree that a major refactoring might be needed to do this properly.
This 'works' but will not change your ability to subclass. Is there a specific case where subclassing would solve your problem better than composition?
Well, I do not think there is a case where subclassing could not be replaced by composition. In my case I subclass I could easily change the function to be external to the class and take DataFrame as a parameter. I could also use something like:
def foo(data_frame):
    if isinstance(data_frame, DataFrame):
        return data_frame.mean()
    elif isinstance(data_frame, SomethingElse):
        return data_frame.foo()
    else:
        raise ValueError("foo() expects its input to be either DataFrame or SomethingElse. {0} given".format(type(data_frame)))
That is one example where subclassing might be necessary. I think it all boils down to whether the subclass is trying to be a But I do not think that this discussion changes the fact that the code should not be hardcoding the explicit constructor, and using things like I have changed all the occurrences of
I encourage you to read the discussions I pointed to above. It is quite straightforward to add a function to do 'x' in a way that makes it look native. In Python there are no 'protected' methods; you can call whatever you want.
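One way to read this suggestion is to attach a plain function to the existing class so that it can be called like a method. The sketch below assumes that reading; `Frame` and `value_range` are hypothetical names, with `Frame` standing in for `DataFrame`.

```python
class Frame:
    """Toy stand-in for DataFrame (hypothetical)."""

    def __init__(self, values):
        self.values = list(values)


def value_range(frame):
    """Hypothetical helper: spread between the largest and smallest value."""
    return max(frame.values) - min(frame.values)


# Attach the function to the existing class; instances now expose it as a method.
Frame.value_range = value_range

frame = Frame([3, 10, 4])
print(frame.value_range())  # 7
```

This avoids subclassing entirely, at the cost of modifying a class that other code shares.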
Duplicate of those mentioned above.
@lukauskas would you consider this as a possible solution to your question? http://stackoverflow.com/a/42580274/3027854
Hi,
I am running on the git cloned version of `pandas`, and there seem to be quite a few issues with user-defined classes extending `DataFrame`.
It seems that the `DataFrame` class constructor is hardcoded in a lot of places where `self.__class__` or `cls` constructors should be used instead. This causes some weird behaviour.
Allow me to illustrate. Let's import pandas and define some class that would extend `DataFrame`.
Note that `ClassExtendingDataFrame` does not override anything and is essentially the same as `DataFrame`, just renamed.
Now one would expect a new instance of `ClassExtendingDataFrame` to be created by the following code:
Unfortunately:
This is due to `DataFrame` being hardcoded in `from_dict`: https://github.com/pydata/pandas/blob/master/pandas/core/frame.py#L905 . The `cls` variable should be used here.
Note that if `ClassExtendingDataFrame` is initialised using the constructor, rather than the `from_dict` method, the correct object is created:
However, operations as simple as slicing break this:
These are just the two examples I have noticed myself, but I am sure there could be more.
A thorough review of hardcoded `DataFrame` constructors is needed to check if they could be replaced by `self.__class__` or `cls` instead.
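The original snippets from this report are not reproduced here. The failure mode it describes can be imitated without pandas, though. The sketch below is such an imitation under stated assumptions: `MiniFrame`, `from_rows`, and `ClassExtendingMiniFrame` are hypothetical names, and the hard-coded constructor calls are written in deliberately to mirror the behaviour reported.

```python
class MiniFrame:
    """Toy stand-in for DataFrame with hard-coded constructors (hypothetical)."""

    def __init__(self, rows):
        self.rows = list(rows)

    @classmethod
    def from_rows(cls, rows):
        # Hard-coded base class: `cls` is ignored, as in the reported behaviour.
        return MiniFrame(rows)

    def __getitem__(self, key):
        # `key` is expected to be a slice; the result is again hard-coded.
        return MiniFrame(self.rows[key])


class ClassExtendingMiniFrame(MiniFrame):
    pass


via_classmethod = ClassExtendingMiniFrame.from_rows([1, 2, 3])
via_constructor = ClassExtendingMiniFrame([1, 2, 3])
sliced = via_constructor[:2]

print(type(via_classmethod).__name__)  # MiniFrame
print(type(via_constructor).__name__)  # ClassExtendingMiniFrame
print(type(sliced).__name__)           # MiniFrame
```

Only the direct constructor call keeps the subclass; both the class method and the slice silently fall back to the base class.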