Skip to content
This repository was archived by the owner on Feb 13, 2025. It is now read-only.

pickling of frames without a tasklet #62

Closed
ghost opened this issue May 25, 2014 · 12 comments
Closed

pickling of frames without a tasklet #62

ghost opened this issue May 25, 2014 · 12 comments

Comments

@ghost
Copy link

ghost commented May 25, 2014

Originally reported by: Anselm Kruis (Bitbucket: akruis, GitHub: akruis)


Hi,

now that #61 is resolved, I can look after another long standing problem. If Stackless pickles a frame, it does not save or restore the f_back attribute. This is not a problem, if the frame is part of a taklet, because tasklet.setstate() sets f_back. But in every other case, current behaviour violates the premise that the unpickled object has the same internal structure as the original object.

Use Cases

trace backs

The most prominent use case for pickling frames (besides tasklets) are trace-backs. I assume that the internal structure of a trace-back is well known.
If you pickle/unpickle a trace back, the missing f_back has two consequences:

  1. A (post mortem) debugger can't analyse/display the trace-back. This could be fixed by setting the frame.f_back attributes in tb_setstate().
  2. The frame objects above the frame, where the exception was caught, are not available.

pyheapdump

pyheapdump is a library to support post mortem debugging using a dump file. It works similar to the well known UNIX core dumps, but the format of the pyheapdump files is based on pickle. pyheapdump uses an extended pickler (sPickle). For vanilla C-Python a frame gets (un)pickled as a fake-frame that has the same attributes as a frame but is really a plain Python object.

But for Stackless I have a problem. If I pickle frame-objects as a fake frame, I break (un)pickling of tasklets. And if I don't change the pickling of frames, trace-backs a broken.

How to fix this probelm

Obviously a fix needs to meet a few conditions:

  1. It must not break anything, especially a pickle created by a current version of Stackless must stay readable and vice versa.
  2. We must add the f_back frame to the state-part of the result of frameobject_reduce()

For a long time I thought that it would be impossible to meet both conditions. Only recently I discovered a possibility to add f_back so to the frame-state that current versions of Stackless simply ignore this additional information. A fixed version would detect the presence of f_back and use it.

How? The (c)frame-state as returned by frameobject_reduce() or cframe_reduce() always contains a "tuple with nulls" structure which is created by slp_into_tuple_with_nulls() and processed by slp_from_tuple_with_nulls(). Fortunately slp_from_tuple_with_nulls() ignores/skips non integer items in the inner "nulls"-tuple. This is our chance to add additional information to the frame-state without breaking unpickling of the enriched state with the current code-base.

I created a preliminary patch to verify that it is indeed possible to fix this problem. The patch is meant as a starting point for a discussion. I'll create a pull request.


@ghost
Copy link
Author

ghost commented May 25, 2014

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):


I just created pull request #13.

@ghost
Copy link
Author

ghost commented May 25, 2014

Original comment by Christian Tismer (Bitbucket: ctismer, GitHub: ctismer):


This problem looks kind-of "made up" on first sight, I need to say:

Pickling of frames does not exist in Python, it is a Stackless feature
that only exists to support pickling of tasklets, so far.

Your pyheapdump module creates fake frames for non-Stackless as I
understand. Isn't it so that your module creates this problem
and should be changed in the first place, instead of "fixing" Stackless?

@ghost
Copy link
Author

ghost commented May 26, 2014

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):


Pickling of frames does not exist in Python, it is a Stackless feature that only exists to support pickling of tasklets, so far.

That is indeed a possible point of view. But IMHO we add a feature, we should add it right. Pickling in general preserves the object graph and if a certain type fails to do so it is broken.

In CPython I'm free to add a reduce function for frames. In Stackless I can't do it, because there is already a reduce function which I can't replace without breaking Stackless.

Of course I can work around this issue in pyheapdump and I already did when I created this library. But being able to work around a bug is no compelling argument for not fixing the bug.

Of course there might be other arguments against fixing this issue in 2.7-slp. One is stability. But even if we fix don't fix it in 2.7-slp we might want to fix it in 3.x-slp.

@ghost
Copy link
Author

ghost commented May 26, 2014

Original comment by Christian Tismer (Bitbucket: ctismer, GitHub: ctismer):


Pickling in general preserves the object graph and if a certain type fails to do so it is broken.

I cannot find this definition. Do you have a reference?

The "graph" is already there, represented by the list. It does not automatically stick
it all together, right. Complicated enough for my brain.

In CPython I'm free to add a reduce function for frames.

How that - via copyreg? Subclassing frames seems still not to work.

You still cannot subclass frames, so how do you do it, by copyreg?
Actually this is IMHO a side effect - copyreg is most probably not intentionally allowed
to do that, and we should file a bug ;-) (kidding)

But well, your code says that it is not portable and tries to push the limits,
so any trick is allowed. But if that gives a problem with Stackless,
I would recommend to not insist in a bug. You want to push a feature.

I tried to understand how pyheapdump/sPickle works, but only with little
success. How much code would go away if we did your patch? More than
the patch itself?

But being able to work around a bug is no compelling argument for not fixing the bug.

Btw., I studied the patch in depth, and it seems pretty correct and cool. It makes quite complicated code
just a bit longer and even more complicated and tricky, so it could become nicer if it can be broken
into smaller functions.

Not a rejection, I'm just not sure if we should go this way, or better rewrite the pickling.
Is this urgent for the conference?

@ghost
Copy link
Author

ghost commented May 27, 2014

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):


Pickling in general preserves the object graph and if a certain type fails to do so it is broken.

I cannot find this definition. Do you have a reference?

https://stackless.readthedocs.org/en/latest/library/pickle.html#relationship-to-other-python-modules, near the end of the section: "The pickle module can transform a complex object into a byte stream and it can transform the byte stream into an object with the same internal structure."

In the case of Stackless, the cited sentence is no longer true, if the complex object is a frame with a non null f_back.

This brings me to another possible simpler fix for the bug: we could remove frameobject_reduce() from copy_reg.dispatch_table. Then pickle.dumps(tasklet.frame) raises a PicklingError.
Now we use a fairly simple trick to make sure that pickle.dumps(tasklet) still creates exactly the same pickle code as it does now: we change tasklet_reduce to return a list of intermediary "FrameReducer"-objects instead of the list of frames. A FrameReducer object would be a temporary Python object, that encapsulates a frame and has a custom method __reduce__(). FrameReducer.__reduce__(f) returns frameobject_reduce(f). I used this trick in the sPickle library a few times.

In CPython I'm free to add a reduce function for frames.
How that - via copyreg? Subclassing frames seems still not to work.

Yes, via copy_reg.pickle() or via subclassing pickle.Pickler. Subclassing frames is neither possible nor required.

Btw., I studied the patch in depth, and it seems pretty correct and cool. It makes quite complicated code just a bit longer and even more complicated and tricky, so it could become nicer if it can be broken into smaller functions.

I'm glad you had a look at the patch. I have still no clue, why the watchdog test fails, but I didn't study the recent watchdog changes yet. Of course it is possible to improve the comments and to refactor it. We have plenty of time and can discuss details at the conference.

My motivation to fix these issues is to make Stackless as rock-solid as vanilla C-Python already is. If I demonstrate tasklet serialisation code to our customers, it really helps if the code is clean and free of clumsy workarounds.

@ghost
Copy link
Author

ghost commented May 28, 2014

Original comment by Christian Tismer (Bitbucket: ctismer, GitHub: ctismer):


https://stackless.readthedocs.org/en/latest/library/pickle.html#relationship-to-other-python-modules
In the case of Stackless, the cited sentence is no longer true, if the complex object is a frame with a non null f_back.

Ok I give in on that. It is true that this is not a full pickling, because that would need to work without a helper
from outside. For the purpose of Stackless, it was enough, but ignored that definition.

But I like your other idea much better:

A FrameReducer object would be a temporary Python object, that encapsulates a frame and has a custom method
_reduce_()

So as I understand it, we now would neither add frameobject_reduce to copy_rec nor supply a _reduce_ method,
but remove the pickling capability completely from the interface. The code will not be invoked directly via the
pickling protocol, but via FrameReducer objects which are created during tasklet reduction.

This keeps the current behavior exactly the same and is completely compatible, because we now don't
litter the copy_reg with an incompatible function. It reads "tasklets can be pickled, frames not".
I think that was what I really was after, and now it becomes right.

I am much in favor of this solution, because:

  • the code is then decoupled from other constraints: you demand compatibility over years with your
    pickled stuff. I had headaches because of that...

  • it is a step towards even closer compatibility with CPython.

Actually it is arguable if modules should be pickleable, besides other things.
What we want to support is pickling of program state, nothing else. We don't
want to change Python otherwise at all. Does it make sense to go further
in that direction?

We could enable the otherwise not exposed pickling just locally, in the context
of pickling a tasklet or channel, or by providing a pickler that has to come from
the stackless module or something.

In any case, I am much in favor of your proposed change because it removes problems.

And it still supports fiddling with frames for people who need it, there is just no
public interface via copy_reg, but we can publish any building block via the stackless.py
module.

@ghost
Copy link
Author

ghost commented May 28, 2014

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):


Fine. I'll try to code another patch, that removes the ability to pickle frames. We can then compare both solutions. I have to think about the handling of trace-backs, which could be part of a frame (f_exc_traceback). And a generator has a frame reference too.

the code is then decoupled from other constraints: you demand compatibility over years with your pickled stuff. I had headaches because of that...

I don't understand your argument here. The pickle format (more exactly, the methods/functions referenced by a pickle and their signatures) are part of the API of the pickle mini-language. Within one minor version (e.g. 2.7, 2.7.1 ... 2.7.6) of Stackless we should not change this pickle API. We are however free to change the pickling API for tasklets, frames and code-objects between different minor versions, because the code objects, which are part of every frame, are not fully portable between different versions. See http://hg.python.org/stackless/file/40388ebb5aab/Lib/importlib/_bootstrap.py for the list of magic numbers and assorted changes of the interpreter. It is IMHO not possible to pickle a tasklet with one minor version of stackless and continue it with a different version.

About pickling modules: modules are already pickleable: they are pickled by reference. unpickling simply imports the module. There is even a special pickle opcode called "GLOBAL" for this purpose.

About pickling other types: if you pickle tasklets, you have to be able to serialise their frames. Now each frame has local variables and these variables must be pickleable too. It is therefore almost always necessary to extent the CPython pickler a little bit. (Don't forget, pickling was designed to be a format for "data" serialisation in the first place.) I don't think we should remove the pickling support for any other type.

@ghost
Copy link
Author

ghost commented May 29, 2014

Original comment by Christian Tismer (Bitbucket: ctismer, GitHub: ctismer):


We are however free to change the pickling API for tasklets, frames and code-objects between different minor versions, because the code objects, which are part of every frame, are not fully portable between different versions.

Then I had it wrongly in mind, something let me think you had long-living pickles including tasklets
that should survive pretty long time. This was especially after reading the patch, which emphasized it's
effort (not expected before to be possible) in making it compatible with older versions.

About pickling: We extended module pickling. At least that is incompatible.

About pickling other types: Sure it is necessary to extend the pickler. But as said: This is only needed in the context
of a tasklet. It is surely possible to restrict this ability to only exist during a tasklet pickling is still in progress.

The really compatible thing was our trick to "donate pickling" to CPython, on a sprint. Those things which
landed in CPython are absolutely compatible. I just wanted to point out that the existence of pickling
for other types, differences in module pickling etc. can be worked around as well.
It might become an issue that does not need to be.

And from a former reply:

My motivation to fix these issues is to make Stackless as rock-solid as vanilla C-Python already is. If I demonstrate tasklet serialisation code to our customers, it really helps if the code is clean and free of clumsy workarounds.

I skipped the second part of this sentence, but it turns out to be contradicting the first part.
The solidness of stackless exists and has nothing to do with needing work-arounds.
Therefore it turns out that we are seeking for slightly different things here:

I'm seeking to avoid incompatibilities, and that's why this thread made sense for me in the first place.

To support your customers, I should have told you my rate before digging deeply into things.
Ah, and that was the pressure to insist in a "bug", so I won't realize that :-))

@ghost
Copy link
Author

ghost commented May 29, 2014

Original comment by Christian Tismer (Bitbucket: ctismer, GitHub: ctismer):


This was the first and only complaint about frame pickling.

I changed it from "bug", "major" to "enhancement", "minor" because I see little
benefit and no urgency in a change right now.

Maybe the module in question should be upgraded to 3.x before considering
any change to a working stackless.

@ghost
Copy link
Author

ghost commented Jun 1, 2014

Original comment by Kristján Valur Jónsson (Bitbucket: krisvale, GitHub: kristjanvalur):


Only recently did I stumble upon this rather weird behaviour. There are special cases for f_back being null which represents a partially unpickled frame chain. It is indeed an anomaly because everywhere else in python, objects are pickled recursively. Only the tasklet pickler decides to pull the frame chain apart and pickle each frame individually, rather than let recursion run its course on f_back. Perhaps the reason was fear of recursion. Stackless already has recursion-safe pickling, (stack spilling) perhaps because recursion was found to be too deep when pickling tasklets... I'm guessing only.

@ghost
Copy link
Author

ghost commented Jun 1, 2014

Original comment by Christian Tismer (Bitbucket: ctismer, GitHub: ctismer):


This was indeed implemented before recursion-safe pickling.

The subject was tasklet pickling, not frames, so I'm fine if frames are not picklable per se.

@ghost
Copy link
Author

ghost commented Sep 4, 2016

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):


I just declined pull request #13. It might break existing Python code.

If you pickle a taceback object, the pickler currently pickles only the frames referred by the tb_frame attribute of the chain of traceback-objects. This pull request changes the situation quite a bit. Now the pickler follows frame->f_back. Therefore the pickle of a traceback would additionally include all frames above the current frame (reachable via traceback->tb_frame->f_back). If one of these frames contains a local variable, that can't be pickled, pickling fails with exception pickle.PicklingError. Another problem is, that the additionally pickled frames could contain sensitive information or large information (i.e. a reference to a very large data structure).

@ghost ghost closed this as completed Sep 24, 2017
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

0 participants