ENH/PERF: copy=True keyword in methods where copy=False can improve perf #48141
Comments
On the PR I said:
On second thought, this is not fully true. We could deprecate the
ok i agree that this could be a can of worms and we have one chance to do this right so will propose
I've been assuming that, since we have existing methods with copy=True keywords, that we will end up making a decorator to deprecate those if/when CoW makes them redundant. If so, the marginal cost (on the dev side) to deprecating extra methods would be low.
That's fair. My thought going in was a) we can offer improved performance now, while such a deprecation is both future and not-assured, b) all else equal, an explicit copy is a better API than an implicit one (CoW). If we do add the keyword now(ish), we could also document that there is a decent chance of an upcoming change.
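The decorator idea mentioned in this comment could look roughly like the sketch below. It is only an illustration, not pandas' actual deprecation machinery: the names `deprecate_copy_keyword` and `relabel` are hypothetical, and the choice of `FutureWarning` is an assumption.

```python
import functools
import warnings


def deprecate_copy_keyword(func):
    """Hypothetical decorator: warn when a caller passes `copy` by keyword."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Only keyword usage is detected; a positional `copy` would slip through.
        if "copy" in kwargs:
            warnings.warn(
                f"The 'copy' keyword of {func.__name__} is deprecated.",
                FutureWarning,
                stacklevel=2,
            )
        return func(*args, **kwargs)

    return wrapper


@deprecate_copy_keyword
def relabel(values, copy=True):
    # Toy method: return the values, duplicated unless copy=False.
    return list(values) if copy else values
```

Because the wrapper is generic, applying it to one more method is a single extra line, which is the low marginal cost on the dev side described above.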
@jorisvandenbossche what if we did the following if/when CoW becomes the default: keep a copy keyword but have the default be lib.no_default, so
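For readers unfamiliar with the sentinel-default pattern this comment refers to, here is a generic sketch. It shows only how a sentinel lets a function tell "copy not passed" apart from an explicit True or False. `NO_DEFAULT` and `describe_copy_request` are hypothetical names standing in for pandas' `lib.no_default` and a real method, and the sketch does not reproduce what the comment went on to propose.

```python
import enum


class _NoDefault(enum.Enum):
    # Hypothetical sentinel, standing in for pandas' lib.no_default.
    no_default = "NO_DEFAULT"


NO_DEFAULT = _NoDefault.no_default


def describe_copy_request(copy=NO_DEFAULT):
    """Report how `copy` was supplied, which a plain True/False default cannot do."""
    if copy is NO_DEFAULT:
        return "not specified"
    return "explicit copy" if copy else "explicit no-copy"
```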
Yeah, based on my current understanding, that would break the CoW logic. Or at least it would introduce hard to predict behaviour. For example, you can set the We might want to add an advanced feature to actually do something like this, at some point if there is demand for that. But if we do so, I think it should be 1) new API (so you don't accidentally get this because you already did In general for the
On the dev side yes, but on the user side we would still be advising people to use
Yes, but that's for
@jbrockmendel would you have time to revert #48043, #47934, #47932? I was attempting to revert but got lost in all the merge conflicts as I think
I'll give it a go this afternoon
Correct, it's to give users the option to get the more performant behavior (including internally, see a couple usages in #48117). In particular I have in mind use cases with method chaining where you might do something like
Yah this is a tough one. If/when CoW is enabled by default, do you expect there will still be an option to disable it via pd.options? If so, I expect we'd want to keep the copy kwd available for this mode.
i think we should wait to avoid confusion with CoW then make a single change for 2.0 (or even wait until cow is default)
Sorry for the slow follow-up here.
I would personally also prefer to revert those, but we would indeed also need to revert the deprecation of inplace. The same arguments as discussed above apply to those methods. I opened #48417 for
CoW becoming the default/only behavior seems increasingly likely. Closing.
xref #48117, #47993, #48043, #47934, #47932
We have a number of methods that currently make copies but don't have to, mainly methods that revolve around index/columns while leaving the data untouched, e.g. DataFrame.droplevel. By adding a keyword copy=True to these methods, we give users the choice to avoid making a copy by setting copy=False, thereby improving performance.
@jorisvandenbossche makes a reasonable point #48117 (review) that adding this may interact with potential copy/view changes in upcoming versions.
Besides droplevel and drop, for which PRs are already open, the other methods I'm looking at are reset_index, replace, and fillna (there may be others I haven't found). In all three of those cases, the idea would be to add copy and deprecate inplace (xref #16529).
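As a rough illustration of the proposal (not pandas code), the toy class below changes only the labels and lets the caller decide whether the values are duplicated. `ToyFrame` and `relabel` are hypothetical names invented for this sketch, and plain Python lists stand in for the underlying data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyFrame:
    """Hypothetical stand-in for a DataFrame: column labels plus row data."""

    columns: List[str]
    data: List[List[float]]

    def relabel(self, new_columns: List[str], copy: bool = True) -> "ToyFrame":
        """Return a frame with new column labels; the values are never modified.

        copy=True duplicates the values (mirroring the current default);
        copy=False reuses the same underlying rows and skips the copy.
        """
        if len(new_columns) != len(self.columns):
            raise ValueError("expected one label per existing column")
        data = [row[:] for row in self.data] if copy else self.data
        return ToyFrame(list(new_columns), data)


frame = ToyFrame(["a", "b"], [[1.0, 2.0], [3.0, 4.0]])
copied = frame.relabel(["x", "y"])              # values duplicated
shared = frame.relabel(["x", "y"], copy=False)  # values shared with `frame`
```

The cost of the faster path is that `shared.data` and `frame.data` are the same object, so a later in-place change to one is visible through the other. That is the interaction with copy/view semantics discussed in the comments.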