pickling of frames without a tasklet #62
Comments
Original comment by Christian Tismer (Bitbucket: ctismer, GitHub: ctismer): This problem looks kind-of "made up" on first sight, I need to say: Pickling of frames does not exist in Python, it is a Stackless feature Your pyheapdump module creates fake frames for non-Stackless as I |
Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):
That is indeed a possible point of view. But IMHO if we add a feature, we should add it right. Pickling in general preserves the object graph, and if a certain type fails to do so, it is broken. In CPython I'm free to add a reduce function for frames. In Stackless I can't do it, because there is already a reduce function which I can't replace without breaking Stackless. Of course I can work around this issue in pyheapdump, and I already did when I created this library. But being able to work around a bug is no compelling argument for not fixing the bug. Of course there might be other arguments against fixing this issue in 2.7-slp. One is stability. But even if we don't fix it in 2.7-slp, we might want to fix it in 3.x-slp. |
Original comment by Christian Tismer (Bitbucket: ctismer, GitHub: ctismer):
I cannot find this definition. Do you have a reference? The "graph" is already there, represented by the list. It does not automatically stick
How that - via copyreg? Subclassing frames seems still not to work. You still cannot subclass frames, so how do you do it, by copyreg? But well, your code says that it is not portable and tries to push the limits, I tried to understand how pyheapdump/sPickle works, but only with little
Btw., I studied the patch in depth, and it seems pretty correct and cool. It makes quite complicated code Not a rejection, I'm just not sure if we should go this way, or better rewrite the pickling. |
Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):
https://stackless.readthedocs.org/en/latest/library/pickle.html#relationship-to-other-python-modules, near the end of the section: "The pickle module can transform a complex object into a byte stream and it can transform the byte stream into an object with the same internal structure." In the case of Stackless, the cited sentence is no longer true, if the complex object is a frame with a non null f_back. This brings me to another possible simpler fix for the bug: we could remove frameobject_reduce() from copy_reg.dispatch_table. Then pickle.dumps(tasklet.frame) raises a PicklingError.
Yes, via copy_reg.pickle() or via subclassing pickle.Pickler. Subclassing frames is neither possible nor required.
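For illustration, here is a minimal sketch of the copy_reg approach on vanilla CPython. It is written for Python 3, where the module is named copyreg. The reducer `reduce_frame` and the use of `types.SimpleNamespace` as a stand-in for a fake frame are assumptions made for this example only; this is not the code used by pyheapdump/sPickle.

```python
import copyreg  # named copy_reg in Python 2
import pickle
import sys
import types


def inner():
    return sys._getframe()


def outer():
    return inner()


frame = outer()

# Without a registered reducer, vanilla CPython refuses to pickle a frame.
try:
    pickle.dumps(frame)
    plain_pickling_works = True
except (TypeError, pickle.PicklingError):
    plain_pickling_works = False


def reduce_frame(f):
    """Hypothetical reducer: rebuild the frame as a plain namespace object.

    Only a few attributes are kept. f_back is part of the state, so the
    pickler recurses into the caller's frame and the chain is preserved.
    """
    state = {
        "co_name": f.f_code.co_name,
        "f_lineno": f.f_lineno,
        "f_back": f.f_back,
    }
    return types.SimpleNamespace, (), state


# Register the reducer for the frame type; no subclassing of frames needed.
copyreg.pickle(types.FrameType, reduce_frame)

fake = pickle.loads(pickle.dumps(frame))
print(fake.co_name, "<-", fake.f_back.co_name)
```

Because f_back is part of the reduced state, the unpickled object has the same chain structure as the original. The same reducer could instead be installed in the dispatch_table of a pickle.Pickler subclass to keep the change local to one pickler.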
I'm glad you had a look at the patch. I have still no clue, why the watchdog test fails, but I didn't study the recent watchdog changes yet. Of course it is possible to improve the comments and to refactor it. We have plenty of time and can discuss details at the conference. My motivation to fix these issues is to make Stackless as rock-solid as vanilla C-Python already is. If I demonstrate tasklet serialisation code to our customers, it really helps if the code is clean and free of clumsy workarounds. |
Original comment by Christian Tismer (Bitbucket: ctismer, GitHub: ctismer):
Ok I give in on that. It is true that this is not a full pickling, because that would need to work without a helper But I like your other idea much better:
So as I understand it, we now would neither add frameobject_reduce to copy_reg nor supply a __reduce__ method, This keeps the current behavior exactly the same and is completely compatible, because we now don't I am much in favor of this solution, because:
Actually it is arguable if modules should be pickleable, besides other things. We could enable the otherwise not exposed pickling just locally, in the context In any case, I am much in favor of your proposed change because it removes problems. And it still supports fiddling with frames for people who need it, there is just no |
Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis): Fine. I'll try to code another patch, that removes the ability to pickle frames. We can then compare both solutions. I have to think about the handling of trace-backs, which could be part of a frame (f_exc_traceback). And a generator has a frame reference too.
I don't understand your argument here. The pickle format (more exactly, the methods/functions referenced by a pickle and their signatures) is part of the API of the pickle mini-language. Within one minor version (e.g. 2.7, 2.7.1 ... 2.7.6) of Stackless we should not change this pickle API. We are however free to change the pickling API for tasklets, frames and code objects between different minor versions, because the code objects, which are part of every frame, are not fully portable between different versions. See http://hg.python.org/stackless/file/40388ebb5aab/Lib/importlib/_bootstrap.py for the list of magic numbers and assorted changes of the interpreter. It is IMHO not possible to pickle a tasklet with one minor version of Stackless and continue it with a different version. About pickling modules: modules are already pickleable: they are pickled by reference. Unpickling simply imports the module. There is even a special pickle opcode called "GLOBAL" for this purpose. About pickling other types: if you pickle tasklets, you have to be able to serialise their frames. Now each frame has local variables, and these variables must be pickleable too. It is therefore almost always necessary to extend the CPython pickler a little bit. (Don't forget, pickling was designed to be a format for "data" serialisation in the first place.) I don't think we should remove the pickling support for any other type. |
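To make "pickled by reference" concrete, here is a small sketch that runs on vanilla CPython 3. Vanilla CPython does not pickle module objects at all, so the example uses a class instead; the mechanism (a GLOBAL opcode that names the object, resolved by an import on unpickling) is the same one referred to above. Protocol 2 is chosen because newer protocols use a different opcode for the same purpose.

```python
import collections
import pickle
import pickletools

# Protocol 2 is the highest protocol available in Python 2.7.
data = pickle.dumps(collections.OrderedDict, protocol=2)

# Collect the opcodes of the pickle stream together with their arguments.
ops = [(op.name, arg) for op, arg, _pos in pickletools.genops(data)]
print(ops)

# Unpickling imports the module and looks the name up again, so the
# result is the very same object, not a copy.
restored = pickle.loads(data)
```

The pickle contains only the module and attribute name, which is why such a pickle depends on the importing side providing a compatible definition.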
Original comment by Christian Tismer (Bitbucket: ctismer, GitHub: ctismer):
Then I had it wrongly in mind, something let me think you had long-living pickles including tasklets About pickling: We extended module pickling. At least that is incompatible. About pickling other types: Sure it is necessary to extend the pickler. But as said: This is only needed in the context The really compatible thing was our trick to "donate pickling" to CPython, on a sprint. Those things which And from a former reply:
I skipped the second part of this sentence, but it turns out to be contradicting the first part. I'm seeking to avoid incompatibilities, and that's why this thread made sense for me in the first place. To support your customers, I should have told you my rate before digging deeply into things. |
Original comment by Christian Tismer (Bitbucket: ctismer, GitHub: ctismer): This was the first and only complaint about frame pickling. I changed it from "bug", "major" to "enhancement", "minor" because I see little Maybe the module in question should be upgraded to 3.x before considering |
Original comment by Kristján Valur Jónsson (Bitbucket: krisvale, GitHub: kristjanvalur): Only recently did I stumble upon this rather weird behaviour. There are special cases for f_back being null which represents a partially unpickled frame chain. It is indeed an anomaly because everywhere else in python, objects are pickled recursively. Only the tasklet pickler decides to pull the frame chain apart and pickle each frame individually, rather than let recursion run its course on f_back. Perhaps the reason was fear of recursion. Stackless already has recursion-safe pickling, (stack spilling) perhaps because recursion was found to be too deep when pickling tasklets... I'm guessing only. |
Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis): I just declined pull request #13. It might break existing Python code. If you pickle a traceback object, the pickler currently pickles only the frames referred to by the tb_frame attribute of the chain of traceback objects. This pull request changes the situation quite a bit. Now the pickler follows frame->f_back. Therefore the pickle of a traceback would additionally include all frames above the current frame (reachable via traceback->tb_frame->f_back). If one of these frames contains a local variable that can't be pickled, pickling fails with the exception pickle.PicklingError. Another problem is that the additionally pickled frames could contain sensitive information or very large data (e.g. a reference to a very large data structure). |
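The difference between the two sets of frames can be seen without any pickling. The following sketch for vanilla CPython 3 compares the frames referenced by a traceback via tb_frame with the frames reachable by following f_back. All function names are made up for this illustration.

```python
import sys


def names_in_traceback(tb):
    """Names of the frames referenced by tb_frame along the tb_next chain."""
    names = []
    while tb is not None:
        names.append(tb.tb_frame.f_code.co_name)
        tb = tb.tb_next
    return names


def names_via_f_back(frame):
    """Names of all frames reachable from `frame` by following f_back."""
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names


def fail():
    raise ValueError("boom")


def catcher():
    try:
        fail()
    except ValueError:
        tb = sys.exc_info()[2]
        return names_in_traceback(tb), names_via_f_back(tb.tb_frame)


def entry():
    secret = "not part of the traceback"  # a local in a calling frame
    return catcher()


in_tb, via_f_back = entry()

# Frames that only a pickler following f_back would pull in.
extra = [name for name in via_f_back if name not in in_tb]
print(in_tb, extra)
```

The traceback itself only covers the frames from the handler down to the raise. Following f_back adds every caller up to the top of the stack, together with their locals such as `secret` above.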
Originally reported by: Anselm Kruis (Bitbucket: akruis, GitHub: akruis)
Hi,
now that #61 is resolved, I can look into another long-standing problem. If Stackless pickles a frame, it does not save or restore the f_back attribute. This is not a problem if the frame is part of a tasklet, because tasklet.__setstate__() sets f_back. But in every other case, the current behaviour violates the premise that the unpickled object has the same internal structure as the original object.
Use Cases
trace backs
The most prominent use case for pickling frames (besides tasklets) is trace-backs. I assume that the internal structure of a trace-back is well known.
If you pickle/unpickle a trace back, the missing f_back has two consequences:
pyheapdump
pyheapdump is a library to support post mortem debugging using a dump file. It works similarly to the well-known UNIX core dumps, but the format of the pyheapdump files is based on pickle. pyheapdump uses an extended pickler (sPickle). For vanilla C-Python a frame gets (un)pickled as a fake frame that has the same attributes as a frame but is really a plain Python object.
But for Stackless I have a problem. If I pickle frame objects as fake frames, I break (un)pickling of tasklets. And if I don't change the pickling of frames, trace-backs are broken.
How to fix this problem
Obviously a fix needs to meet a few conditions:
For a long time I thought that it would be impossible to meet both conditions. Only recently I discovered a possibility to add f_back to the frame-state in such a way that current versions of Stackless simply ignore this additional information. A fixed version would detect the presence of f_back and use it.
How? The (c)frame-state as returned by frameobject_reduce() or cframe_reduce() always contains a "tuple with nulls" structure which is created by slp_into_tuple_with_nulls() and processed by slp_from_tuple_with_nulls(). Fortunately slp_from_tuple_with_nulls() ignores/skips non integer items in the inner "nulls"-tuple. This is our chance to add additional information to the frame-state without breaking unpickling of the enriched state with the current code-base.
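The idea can be modelled in a few lines of Python. This is only a simplified model, not the C code of Stackless: the exact layout (a leading tuple of the indices that hold NULL, followed by the slot values), the `NULL` sentinel, and the `("f_back", ...)` pair are assumptions made for this sketch. How the patch actually encodes f_back is not shown here.

```python
NULL = object()  # stands in for a C NULL pointer in this model


def into_tuple_with_nulls(slots, extra=()):
    """Model of the writer: indices of NULL slots first, then the values.

    `extra` holds additional non-integer items that are appended to the
    inner "nulls" tuple (the hypothetical place for f_back).
    """
    nulls = tuple(i for i, v in enumerate(slots) if v is NULL) + tuple(extra)
    values = tuple(None if v is NULL else v for v in slots)
    return (nulls,) + values


def from_tuple_with_nulls(state):
    """Model of the current reader: non-integer items are skipped."""
    nulls, values = state[0], list(state[1:])
    for item in nulls:
        if isinstance(item, int):
            values[item] = NULL
    return values


def extra_items(state):
    """Model of a fixed reader: pick up what the old reader skipped."""
    return [item for item in state[0] if not isinstance(item, int)]


state = into_tuple_with_nulls(
    [1, NULL, "x"], extra=(("f_back", "caller-frame"),)
)
old_view = from_tuple_with_nulls(state)  # unaffected by the extra item
new_view = extra_items(state)            # sees the additional information
```

In this model the old reader restores exactly the same slots whether or not the extra item is present, which is the compatibility property described above.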
I created a preliminary patch to verify that it is indeed possible to fix this problem. The patch is meant as a starting point for a discussion. I'll create a pull request.