Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #94: allow pickling Logger objects #96

Merged
merged 4 commits into from
May 31, 2017

Conversation

pitrou
Copy link
Member

@pitrou pitrou commented May 31, 2017

No description provided.

@pitrou pitrou mentioned this pull request May 31, 2017
@codecov-io
Copy link

codecov-io commented May 31, 2017

Codecov Report

Merging #96 into master will increase coverage by 0.3%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master      #96     +/-   ##
=========================================
+ Coverage   79.88%   80.18%   +0.3%     
=========================================
  Files           2        2             
  Lines         522      530      +8     
  Branches      109      110      +1     
=========================================
+ Hits          417      425      +8     
  Misses         75       75             
  Partials       30       30
Impacted Files Coverage Δ
cloudpickle/cloudpickle.py 80.07% <100%> (+0.3%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 91eae7b...26f91d7. Read the comment docs.


dumped = cloudpickle.dumps(logger)

code = """if 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use textwrap.dedent here to fix the indentation instead of the if.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

textwrap.dedent is a bit more annoying to use (you have to make sure the first line has the same indent as the other ones, which this idiom doesn't require).

@pitrou
Copy link
Member Author

pitrou commented May 31, 2017

Interesting: the dispatch dict doesn't support old-style classes, which is why the test fails on 2.6.

@mrocklin
Copy link
Contributor

So, my experience when I run into the issue of cloudpickle not being able to serialize a logger is that cloudpickle is actually trying to serialize everything. The solution in these cases is almost always to find out why. It's probably still correct for cloudpickle to learn how to serialize logger objects, but we may want to ask ourselves why this is occurring.

@pitrou
Copy link
Member Author

pitrou commented May 31, 2017

It's probably still correct for cloudpickle to learn how to serialize logger objects, but we may want to ask ourselves why this is occurring.

It's occurring simply because Logger objects don't have a custom __reduce__ method, so pickle tries to reconstruct them structurally (which entails also reconstructing the handlers attached to them, which is a slippery slope as often those have I/O objects attached to them).

@pitrou
Copy link
Member Author

pitrou commented May 31, 2017

Note reconstructing by calling getLogger(name) is really the right thing to do, as loggers really are lazily-instantiated named singletons. Their configuration needn't be serialized and may well be different in different processes or runs.

@pitrou
Copy link
Member Author

pitrou commented May 31, 2017

Are there any other comments before this get merged?

@pitrou pitrou merged commit 5eefe28 into cloudpipe:master May 31, 2017
@pitrou pitrou deleted the pickle_loggers branch May 31, 2017 20:08
ghost pushed a commit to dbtsai/spark that referenced this pull request Aug 22, 2017
## What changes were proposed in this pull request?

Based on apache#18282 by rgbkrk this PR attempts to update to the current released cloudpickle and minimize the difference between Spark cloudpickle and "stock" cloud pickle with the goal of eventually using the stock cloud pickle.

Some notable changes:
* Import submodules accessed by pickled functions (cloudpipe/cloudpickle#80)
* Support recursive functions inside closures (cloudpipe/cloudpickle#89, cloudpipe/cloudpickle#90)
* Fix ResourceWarnings and DeprecationWarnings (cloudpipe/cloudpickle#88)
* Assume modules with __file__ attribute are not dynamic (cloudpipe/cloudpickle#85)
* Make cloudpickle Python 3.6 compatible (cloudpipe/cloudpickle#72)
* Allow pickling of builtin methods (cloudpipe/cloudpickle#57)
* Add ability to pickle dynamically created modules (cloudpipe/cloudpickle#52)
* Support method descriptor (cloudpipe/cloudpickle#46)
* No more pickling of closed files, was broken on Python 3 (cloudpipe/cloudpickle#32)
* ** Remove non-standard __transient__check (cloudpipe/cloudpickle#110)** -- while we don't use this internally, and have no tests or documentation for its use, downstream code may use __transient__, although it has never been part of the API, if we merge this we should include a note about this in the release notes.
* Support for pickling loggers (yay!) (cloudpipe/cloudpickle#96)
* BUG: Fix crash when pickling dynamic class cycles. (cloudpipe/cloudpickle#102)

## How was this patch tested?

Existing PySpark unit tests + the unit tests from the cloudpickle project on their own.

Author: Holden Karau <holden@us.ibm.com>
Author: Kyle Kelley <rgbkrk@gmail.com>

Closes apache#18734 from holdenk/holden-rgbkrk-cloudpickle-upgrades.
andrewdhicks added a commit to opendatacube/datacube-core that referenced this pull request Sep 5, 2017
jeremyh pushed a commit to opendatacube/datacube-core that referenced this pull request Sep 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants