SettingWithCopy dependence on reference count #14150
Comments
pls show an explicit example
Examples (more context in my SO question):
Edit to my original post: I think it would also be useful to have a method, say |
you can just set: is_copy = None. Object dtypes always copy and are never views. This whole warning business is a mess because we don't have copy-on-write. I don't think this is likely to be addressed in current pandas; it will have to wait for pandas 2.0. That said, if you want to whip up some tests that mimic current behavior, we can put them in a special area so as to note behavior that should change.
Of course, I understand that CopyOnWrite is the only proper resolution. But pandas 2.0 is probably 3-6 years away. I'm not sure about the amount of effort to make the warning more comprehensive (e.g., as I suggested or otherwise); if it's not that big, maybe it's worth it? After all, in the meantime this is the only protection from these bugs.
you are misunderstanding |
you are certainly welcome to PR a fix for this, but it requires a non-trivial effort
Oh sorry, I didn't realize pandas 2.0 is not that far. I naively assumed 1.0-2.0 is going to take comparable time to 0-1.0. In that case, fixing SettingWithCopyWarning is not a high priority.
yeah 1.x will be long term supported with very few |
@pkch FYI, I made some progress on setting up a PR for keeping track of children of DataFrames (#12036). It looks like you may have seen it already, but it's worth flagging. I set it aside since there were aspirations of CopyOnWrite getting implemented in 1.0 (#11970), though it now appears that it's been pushed to 2.0 (wesm/pandas2#10). You may find some of the components useful if you decide to try and get something in before 2.0.
As a follow-up to my question, I believe the current approach is that pandas assumes that the programmer intends assignment to a child DF/S to be propagated to the parent DF/S as long as the parent has a non-zero reference count.
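To make the reference-count dependence concrete, here is a toy model in plain Python. It is only a sketch of the idea described above, not pandas code: the `Frame` class, its `take` and `set` methods, and the `count_warnings` helper are all hypothetical names made up for illustration. The child keeps a weak reference to its parent and warns on assignment only while the parent is still alive.

```python
import gc
import warnings
import weakref


class Frame:
    """Toy stand-in for a DataFrame (hypothetical, not the pandas implementation)."""

    def __init__(self, data):
        self.data = list(data)
        self._parent_ref = None  # weak reference to the frame this one was sliced from

    def take(self, start, stop):
        # The child remembers its parent only weakly, so it does not keep it alive.
        child = Frame(self.data[start:stop])
        child._parent_ref = weakref.ref(self)
        return child

    def set(self, i, value):
        # Warn only if the parent object still exists somewhere.
        if self._parent_ref is not None and self._parent_ref() is not None:
            warnings.warn("setting a value on a slice of a live parent", UserWarning)
        self.data[i] = value


def count_warnings(action):
    """Run action() and return how many warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        action()
    return len(caught)


parent = Frame([1, 2, 3, 4])
child = parent.take(0, 2)
warned_while_parent_alive = count_warnings(lambda: child.set(0, 99))

del parent    # drop the last strong reference to the parent
gc.collect()  # not needed on CPython, added for other interpreters
warned_after_parent_gone = count_warnings(lambda: child.set(0, 100))

print(warned_while_parent_alive, warned_after_parent_gone)
```

The same assignment on the same child warns or stays silent depending only on whether something else still holds the parent, which is the behavior this issue is about.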
The problems like the ones I described would be avoided if a child DataFrame always warned on assignment regardless of the reference count of the parent object, UNLESS a copy() method or DataFrame constructor was explicitly applied to the slice. (If desired, query and/or filter can be documented as copy methods as well.)
(Incrementing the ref count when the child is a view is not enough, because code that works normally might one day stop working, with a SettingWithCopyWarning that might be ignored if it appears on the client site, simply due to a change in the input data.)
If this approach is followed, the documentation can also be clarified as follows: