-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Copy on write using weakrefs #11500
Copy on write using weakrefs #11500
Conversation
First of all -- this is really great! Thank you for pushing on this. A few specific design issues:
Finally, we are definitely going to need some comprehensive performance checks. How does the ASV benchmark suite compare after this change? What types of uses used to be fast but now are slow? These will need to be carefully documented. |
you need to have all of the original tests in here |
Thanks both for the feedback!
But you are correct I need to add a
@jreback Would you mind expanding on your comment / clarifying a little? I don't think I've deleted any tests. Do you mean that you think this seems promising enough it's worth digging into the testing modules to see what works and what fails with these modifications and edit the failing tests? Happy to do so, just wanted to see if people felt this path was sufficiently promising to warrant that work. |
|
you have essentially turned off all of the old setting with copy tests |
@jreback are you talking about runtime tests in core files like @shoyer Got it. Revising now. Also realized one places this breaks, so need to add another tweak to implementation. Will ping when done. One general question: Is there a better way to know if something is a view than
|
97f914b
to
72692f5
Compare
@shoyer @jreback Performance checks @shoyer Think your suggestions all integrated. Still probably some excess copying because @jreback I'm afraid I'm still a little unclear on what you're asking for. I know you do a lot on these sites so brevity is your friend, but could you please be a little more expansive / point to a specific example?
|
the perf issues need to be addressed |
return self._get_item_cache(item) | ||
result = self._get_item_cache(item) | ||
|
||
if isinstance(item, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only str
?
@nickeubank you changed code but you never fixed tests. e.g. You will need to carefully address each of the existing tests, as these define the current API. These must all pass and/or change as appropriate. This ensures that the API is preserved. You can certainly change things, but these must be well-defined and explictly communicated. Generally having green PR's gives much more confidence. This is the point of tests, to try to map out the API as much as possible. To cover edge cases and to provide consistency among operations. This of course is not guaranteed, but helps a lot. Examining code only goes so far. Python is far to dynamic a language to just rely on code inspection for correctness. |
@jreback Ah, ok -- thanks so much for the longer explanation! Will do! |
a988404
to
8e4daac
Compare
5098923
to
aa337ef
Compare
tweak so addition of new columnsdoesn't cause copying improved tests, removed copy option from merge
aa337ef
to
1814085
Compare
@jreback thanks again for explanation on tests -- now see wisdom. Updated tests, and fixed a few issues that came up. Open Issues
Other notes I removed the Performance
cc: @shoyer |
You are still removing LOTS of tests. Don't remove any! You need to individually go thru and fix/correct them. You need to assert that the entire existing API is unchanged (which is defined via the tests). Clearly if somethign is asserting that The entire point of the test suite is to avoid changing behavior accidently. |
On 2: sorry, wasn't clear. It isn't attribute value I want to pickle; it's the existence of the attribute.
Do I need to rebuild somewhere? Will work on tests! |
you need to define it as
then when you set it in an instance it is overriden only in that instance. This is a feature of python. |
1dc1d98
to
d62b9fa
Compare
@jreback @shoyer Quick update: Most things working well and I think will work eventually. Currently need to address:
However, at the moment I need to take a hiatus on this to work on a draft of my dissertation, so I'm going to have to set this aside for ~4 weeks. Haven't dropped -- I still really believe in the importance of this and that it will work -- but taking up more time than I have at the moment (especially given how much background research I'm having to learn about python to do all this...). |
@nickeubank np, thanks for the effort so far! |
xref #11970 as this might take a bit, feel free to reopen if updated |
@nickeubank hmm, I can't seem to reopen this, very odd. |
because |
k will reopen as new pr |
Working model of copy-on-write. Aims to close #10954, alternative to #10973, extension of #11207.
Copy-on-Write Behavior:
Setting on child doesn't affect parent, but still uses views when can for efficiency
Setting on parent doesn't affect child
One exception is dictionary-like access, where views are preserved
(as suggested here: #10954 (comment) )
Safe for views of views
Chained indexing behavior now consistent
Will always fail unless first class is a dictionary-like call for a single series (since that will always be a view)
Will fail if first call not dict-like
Will always succeed if first call dict-like
To Do:
@jreback
@TomAugspurger
@shoyer
@JanSchulz
cc @ellisonbg
cc @CarstVaartjes