Copy on write for views #10
Comments
As nice as copy-on-write would be, it's not strictly necessary in pandas 2.0 because we can choose our own consistent rules for copying once we divorce our storage from NumPy. For example, we could say:
Given that we plan to ditch the I'm sure there are a few use cases for view-based slicing of DataFrame rows, but these are quite niche compared to selecting columns, and in my opinion the unpredictability it introduces into the data model is not worth the trouble. Copy-on-write for column views (and eventually, maybe, row slicing) would still be nice for making pandas more intuitive, but it could possibly wait until a later 2.1 or 3.0 release (supposing we're doing semantic versioning).
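To make the unpredictability concrete, here is a minimal sketch using plain NumPy only (no pandas internals assumed): basic slicing hands back a view on the same buffer, while integer-list indexing hands back a copy, so whether a later write propagates depends on how the selection was made.

```python
import numpy as np

arr = np.arange(6).reshape(3, 2)

row_view = arr[0:2]      # basic slicing: a view onto arr's buffer
row_copy = arr[[0, 1]]   # integer-list (fancy) indexing: an independent copy

row_view[0, 0] = 100     # writes through to arr
row_copy[0, 1] = -1      # leaves arr untouched

shares_view = np.shares_memory(arr, row_view)
shares_copy = np.shares_memory(arr, row_copy)
```

Both selections pick the same two rows, yet only one of them aliases the original data.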
I agree COW isn't a strict necessity for the 1 -> 2 transition. I think it's worth keeping in mind during the development process, as there are a number of things we can do that would make adding it later easier or more difficult. Step 1 is keeping track of parent-child relationships in a lightweight way, and we can permit mutation to start with, in accordance with current behavior.
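As a rough illustration of what lightweight parent-child tracking could look like, here is a toy sketch. `CowArray`, `_owner`, `_views`, and `_detach` are hypothetical names invented for this example and are not pandas API. It assumes slice-only indexing, and it registers every view with the buffer's root owner through a `weakref.WeakSet`, so tracking does not keep views alive.

```python
import weakref

import numpy as np


class CowArray:
    """Toy copy-on-write wrapper (hypothetical, slice indexing only)."""

    def __init__(self, data, owner=None):
        self._data = data
        self._owner = owner              # None: this object owns its buffer
        self._views = weakref.WeakSet()  # live views onto our buffer
        if owner is not None:
            owner._views.add(self)

    def __getitem__(self, key):
        # Views of views register with the root owner, keeping the tree flat.
        root = self if self._owner is None else self._owner
        return CowArray(self._data[key], owner=root)

    def _detach(self):
        # Take a private copy and stop sharing with the former owner.
        self._data = self._data.copy()
        if self._owner is not None:
            self._owner._views.discard(self)
            self._owner = None

    def __setitem__(self, key, value):
        if self._owner is not None:
            self._detach()               # a view copies before its first write
        else:
            for view in list(self._views):
                view._detach()           # live views keep the old values
        self._data[key] = value

    def to_list(self):
        return self._data.tolist()


base = CowArray(np.arange(5))
child = base[1:4]
child[0] = 99        # child copies first; base is not modified

sibling = base[0:2]
base[0] = -1         # sibling is detached first, so it keeps the old values
```

The copy is deferred until the first write on either side, so read-only views stay free. A real implementation would also have to cover non-slice indexing and thread safety, which this sketch ignores.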
See discussion in pandas-dev/pandas#11500
I've expressed my views on COW in pretty extensive detail elsewhere (#10954), so I'll save everyone the trouble of repeating them all here, but in short: any behavior that's consistent and easy to understand is fine by me! Have we abandoned trying to get this in before v1.0?
It's probably not too likely, since it would be an API change, and it would take a little time to fully understand the impact. If anyone has other thoughts on this (separate from the behavior of COW), please chime in.
A notable benefit of copy-on-write is that operations like
I will work on a full document for this to get the conversation started, but in the meantime this issue can serve as the placeholder for our discussion about COW.