Fix736 #861

Merged
merged 10 commits into from
Nov 5, 2019

PGijsbers
Collaborator

Fixes #736.

  • All openml package logging is logged through the 'openml' logger (this fixes side-effects of working with the root logger that are demonstrated in the issue and code below).
  • All messages at logging.WARNING and up are printed to console (by default). This is in line with old behavior.
  • Messages at logging.DEBUG and up (i.e. all of them) are logged to a file in the cache directory (by default). A new log file is created when the old log file reaches a size of 1 MB. At most one current log file and one old log file are kept on disk at a time (i.e. at most 2 MB of openml-python logs are kept).
  • The log level defaults can be changed through the configuration file, for both channels individually. A new file_verbosity is introduced alongside the regular verbosity.
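
The setup described in the list above can be approximated with the standard library alone. This is a minimal sketch, not the code in this PR: the helper name `configure_package_logging` and its parameters are hypothetical, and the format string is guessed from the sample output shown further down.

```python
import logging
import logging.handlers
import os


def configure_package_logging(cache_dir,
                              console_level=logging.WARNING,
                              file_level=logging.DEBUG):
    """Hypothetical helper: attach handlers to the 'openml' logger only."""
    logger = logging.getLogger("openml")
    # Let every record through the logger; each handler filters on its own level.
    logger.setLevel(logging.DEBUG)

    # Assumed format, modelled on lines such as
    # "[WARNING] [14:12:42:openml.flows.flow] ...".
    formatter = logging.Formatter(
        "[%(levelname)s] [%(asctime)s:%(name)s] %(message)s",
        datefmt="%H:%M:%S",
    )

    console = logging.StreamHandler()
    console.setLevel(console_level)
    console.setFormatter(formatter)
    logger.addHandler(console)

    # Roll over at roughly 1 MB and keep one old file next to the current one.
    file_handler = logging.handlers.RotatingFileHandler(
        os.path.join(cache_dir, "openml_python.log"),
        maxBytes=1_000_000,
        backupCount=1,
    )
    file_handler.setLevel(file_level)
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)
    return logger
```

Because both handlers hang off the named logger, the root logger is never touched and loggers of other packages keep whatever configuration their owners gave them.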

The difference can be seen when running the following lines of code:

import logging

if __name__ == '__main__':
    log = logging.getLogger('mypackage')
    log.setLevel(logging.DEBUG)

    log.info("Message one")
    import openml
    log.info("Message two")
    log.debug("Message three")

    from openml.flows import get_flow
    f = get_flow(123)
    f.description = ''
    f._to_dict()  # logging.warning will be called due to empty description.

    from openml.datasets import get_dataset
    iris = get_dataset(61)  # since I have it in cache, a logging.debug statement is called.

On the current develop branch we see:

[INFO] [14:11:03:mypackage] Message two
[DEBUG] [14:11:03:mypackage] Message three
[WARNING] [14:11:03:openml.flows.flow] Flow 'weka.LinearRegression's empty description

printed to the console, and no (semi-)permanent record of the log is kept automatically.
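
The side effect can be reproduced without openml at all. The sketch below uses only the standard `logging` module; the helper `emitted_by` is hypothetical and exists purely for illustration. It attaches a handler either to the root logger or to a named logger and reports what that handler captures from an unrelated logger.

```python
import io
import logging


def emitted_by(logger_name, attach_to=None):
    """Return what a handler on `attach_to` (None means root) captures."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("[%(levelname)s] [%(name)s] %(message)s")
    )
    target = logging.getLogger(attach_to)  # getLogger(None) is the root logger
    target.addHandler(handler)
    try:
        log = logging.getLogger(logger_name)
        log.setLevel(logging.DEBUG)
        log.info("Message two")
    finally:
        # Clean up so repeated calls do not stack handlers.
        target.removeHandler(handler)
    return stream.getvalue()


if __name__ == "__main__":
    # A handler on the root logger also receives records from 'mypackage'.
    print(repr(emitted_by("mypackage", attach_to=None)))
    # A handler on 'openml' never sees records from 'mypackage'.
    print(repr(emitted_by("mypackage", attach_to="openml")))
```

Records propagate upward from a logger to its ancestors, so anything attached to the root logger sees records from every package, while a handler on 'openml' only sees records from 'openml' and its children.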

With this PR, the console output becomes:

[WARNING] [14:12:42:openml.flows.flow] Flow 'weka.LinearRegression's empty description

and in my cache directory I find a file 'openml_python.log' with the following content:

[WARNING] [14:12:59:openml.flows.flow] Flow 'weka.LinearRegression's empty description
[DEBUG] [14:12:59:openml.datasets.dataset] Data pickle file already exists and is up to date.

I don't know how to unit test this, if at all.
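
One possible approach, sketched here with `unittest.TestCase.assertLogs` from the standard library. The function `warn_on_empty_description` is a hypothetical stand-in for library code and is not part of openml; the tests only illustrate the two properties claimed above (records go through the 'openml' hierarchy, and the root logger gains no handlers).

```python
import logging
import unittest


def warn_on_empty_description(name, description):
    """Hypothetical stand-in for library code logging via a child of 'openml'."""
    if not description:
        logging.getLogger("openml.flows.flow").warning(
            "Flow '%s' has an empty description", name
        )


class LoggingBehaviourTest(unittest.TestCase):
    def test_warning_goes_through_package_logger(self):
        # assertLogs captures records that reach the 'openml' logger,
        # including those propagated from its children.
        with self.assertLogs("openml", level="WARNING") as captured:
            warn_on_empty_description("weka.LinearRegression", "")
        self.assertEqual(len(captured.records), 1)
        self.assertEqual(captured.records[0].name, "openml.flows.flow")

    def test_root_logger_handlers_untouched(self):
        before = list(logging.getLogger().handlers)
        warn_on_empty_description("weka.LinearRegression", "")
        self.assertEqual(logging.getLogger().handlers, before)


if __name__ == "__main__":
    unittest.main()
```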


A note on examples: I noticed that some examples use logging (run setup and the fanova example). The generated pages do not show any logging output, so I was wondering what the purpose of the logging is. Perhaps @janvanrijn can help me here, as he seems to be the author.

Review thread on examples/40_paper/2018_kdd_rijn_example.py (outdated, resolved)
Review thread on openml/config.py (outdated, resolved)
Review thread on openml/config.py (outdated, resolved)
@mfeurer
Collaborator

mfeurer commented Oct 31, 2019

The generated pages do not show any logging output, so I was wondering what the purpose for the logging is. Perhaps @janvanrijn can help me here, as he seems to be the author.

I guess we should move to print then. Could you please change that?

@mfeurer
Collaborator

mfeurer commented Oct 31, 2019

This looks good besides my (small) comments. Could you please also add this to the changelog?

@codecov-io

codecov-io commented Nov 1, 2019

Codecov Report

Merging #861 into develop will decrease coverage by 0.04%.
The diff coverage is 82.92%.

@@             Coverage Diff             @@
##           develop     #861      +/-   ##
===========================================
- Coverage    88.43%   88.38%   -0.05%     
===========================================
  Files           37       37              
  Lines         4272     4298      +26     
===========================================
+ Hits          3778     3799      +21     
- Misses         494      499       +5

Impacted Files                            Coverage Δ
openml/datasets/dataset.py                87.03% <100%> (ø) ⬆️
openml/extensions/sklearn/extension.py    91.32% <70%> (+0.01%) ⬆️
openml/config.py                          90.47% <86.2%> (+0.47%) ⬆️
openml/_api_calls.py                      82.14% <0%> (-3.58%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4a13100...f22f951.

Review thread on openml/config.py (outdated, resolved)
@mfeurer
Collaborator

mfeurer commented Nov 4, 2019

I know it wasn't part of my original requests, but what do you think about having an example in which we show how one can easily change the log level? That might be good for folks who do ML, but do not care too much about how logging works in Python.
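
For illustration, such an example might look roughly like the sketch below. It relies only on the standard `logging` module and on the 'openml' logger name mentioned in the PR description; the helper `set_openml_console_level` is hypothetical, and whether the package offers its own function for this is not stated in this thread.

```python
import logging


def set_openml_console_level(level):
    """Hypothetical helper: change the level of console handlers on 'openml'.

    Returns the number of handlers that were changed.
    """
    logger = logging.getLogger("openml")
    changed = 0
    for handler in logger.handlers:
        # FileHandler subclasses StreamHandler, so exclude it explicitly
        # to leave the file log level alone.
        is_console = (isinstance(handler, logging.StreamHandler)
                      and not isinstance(handler, logging.FileHandler))
        if is_console:
            handler.setLevel(level)
            changed += 1
    return changed


if __name__ == "__main__":
    # Show INFO messages and up on the console; the file log is unaffected.
    set_openml_console_level(logging.INFO)
```

Changing the handler level rather than the logger level keeps the two channels independent, which matches the separate verbosity and file_verbosity settings described in the PR.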

@PGijsbers PGijsbers merged commit 34d54d9 into develop Nov 5, 2019
@PGijsbers PGijsbers deleted the Fix736 branch November 5, 2019 11:50
Successfully merging this pull request may close these issues.

openml registers handler on root logger
3 participants