openml registers handler on root logger #736

PGijsbers · 2019-07-11T22:04:27Z

openml registers a streamhandler with the root logger, because it uses logging.basicConfig. That functions registers a logging.StreamHandler with the root logger if it does not yet have a handler. This means that logging of other packages may automatically be processed by this streamhandler. An example to demonstrate:

import logging

if __name__ == '__main__':
    #  Commenting out the line below prevents `basicConfig` from registering a streamhandler
    #  logging.getLogger().addHandler(logging.NullHandler())
    
    log = logging.getLogger('mypackage')
    log.setLevel(logging.DEBUG)

    log.info("Message one")  # first call, no log handler registered
    import openml
    log.info("Message two")  # second call, log handler registered by openml

output:

[INFO] [14:47:46:mypackage] Message two

Due to the logging.basicConfig behavior, if a handler already is registered (such as is done in the commented out line), it will not add a logging.StreamHandler. So there is no output if the relevant line of code is commented out.

I also see a verbosity in the config defaults, but I don't see it used anywhere. Am I missing something?

I think adding a logging.StreamHandler to the root logger is undesirable behavior. Developers should have full autonomy over their logging handlers without resorting to adding logging.NullHandler. I think we should avoid using logging.basicConfig to set our streamhandler. Instead, we should register a streamhandler to an openml specific logger instead of the root (i.e. log = logging.getLogger('openml') or log = logging.getLogger(__name__)). We should make sure to take into account the verbosity level as defined in the config when doing so.

The text was updated successfully, but these errors were encountered:

mfeurer · 2019-07-12T15:03:33Z

Thanks for bringing that up. Actually, the package should have a good default logging behavior, but should allow flexible configuration and fully disabling it, too. I currently don't know the best way to achieve this unfortunately.

PGijsbers · 2019-07-12T21:17:58Z

What do you consider 'good default logging behavior'? For me it would be:

Have a package's root logger (e.g. logging.getLogger('openml'), not logging.getLogger()) configured at logging.DEBUG. This means any logging.Handler registered will by default pick up all messages, limited by the logging.Handler.level.
Have each module log to its own logger (i.e. logging.getLogger(__name__)).
As the name will be openml. ... any logging.Handler registered with the package root logger will also pick up on messages sent by the package root logger (it keeps a hierarchy based on name). But if we're alternatively only interested in just registering for messages from submodules, we can directly register a logging.Handler with their logger.
My preference would be that any logging.Handler registers only with the package root logger, and only if the openml config file specifies it (but this can be a default). By default, I would only consider using a logging.FileHandler to write the log to a file. Perhaps writing to a new file each day, and removing any files older than a week automatically (again exposing freedom for how long (if at all) to retain logs through the configuration file).

mfeurer · 2019-07-15T11:54:09Z

Have a package's root logger (e.g. logging.getLogger('openml'), not logging.getLogger()) configured at logging.DEBUG. This means any logging.Handler registered will by default pick up all messages, limited by the logging.Handler.level

I would not show debug messages by default, but I agree on the logger name.

Have each module log to its own logger (i.e. logging.getLogger(name)).
As the name will be openml. ... any logging.Handler registered with the package root logger will also pick up on messages sent by the package root logger (it keeps a hierarchy based on name). But if we're alternatively only interested in just registering for messages from submodules, we can directly register a logging.Handler with their logger.

Agreed.

My preference would be that any logging.Handler registers only with the package root logger, and only if the openml config file specifies it (but this can be a default). By default, I would only consider using a logging.FileHandler to write the log to a file. Perhaps writing to a new file each day, and removing any files older than a week automatically (again exposing freedom for how long (if at all) to retain logs through the configuration file).

I didn't think of logging to the openml directory by default, but that appears to be a great idea. The user could still configure the package to output to the command line.

PGijsbers · 2019-07-17T05:07:25Z

I would not show debug messages by default, but I agree on the logger name.

I'm not advocating showing (to console?) debug messages by default either. There's two levels to set, that of the logger and those of the logger.Handler. No messages would be shown if no handlers are set to the logger. A message is only processed if the level is met by both the log-object and the handler. By default no handlers would be set. To illustrate:

>>> import logging
>>> openml_log = logging.getLogger('openml')
>>> openml_log.setLevel(logging.DEBUG)
>>> openml_log.error("With the exception of error-level messages")
With the exception of error-level messages
>>> openml_log.info("No messages are written to console.")
>>> # if you want to change this default behavior, and have not even error messages
... # written to stdout, you can add a `logging.NullHandler()`.
... # Now we add a 'logging.StreamHandler' to actually write to stdout
...
>>> import sys
>>> sh = logging.StreamHandler(sys.stdout)
>>> sh.setLevel(logging.INFO)
>>> openml_log.addHandler(sh)
>>> openml_log.info("Now info messages are also written to stdout.")
Now info messages are also written to stdout.
>>> openml_log.debug("but not debug messages.")  # because `sh.level == logging.INFO`
>>> sh.setLevel(logging.DEBUG)
>>> openml_log.debug("Now even DEBUG messages are written to stdout.")
Now even DEBUG messages are written to stdout.
>>> openml_log.setLevel(logging.ERROR)
>>> openml_log.warning("By setting the log level of the log differently, handlers wont receive all messages.")  # because `openml_log.level == logging.ERROR`
>>> # Nothing written to stdout

By setting the log level of the log (not the handler!) to logging.DEBUG, we allow any handler to subscribe to any level of messages it is interested in.

I think the proposed changes will be relatively easy to implement. I don't have the time currently (when I do, I'll first finish the 'short names' PR). It's a low priority issue but if it stays open I can tackle it end of the month/in August.

mfeurer · 2019-07-17T13:36:35Z

Got it, that totally makes sense.

PGijsbers mentioned this issue Oct 30, 2019

Fix736 #861

Merged

PGijsbers closed this as completed in #861 Nov 5, 2019

mitar mentioned this issue Nov 14, 2019

Package should not configure logging without being asked to do so #885

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

openml registers handler on root logger #736

openml registers handler on root logger #736

PGijsbers commented Jul 11, 2019 •

edited

Loading

mfeurer commented Jul 12, 2019

PGijsbers commented Jul 12, 2019

mfeurer commented Jul 15, 2019

PGijsbers commented Jul 17, 2019

mfeurer commented Jul 17, 2019

openml registers handler on root logger #736

openml registers handler on root logger #736

Comments

PGijsbers commented Jul 11, 2019 • edited Loading

mfeurer commented Jul 12, 2019

PGijsbers commented Jul 12, 2019

mfeurer commented Jul 15, 2019

PGijsbers commented Jul 17, 2019

mfeurer commented Jul 17, 2019

PGijsbers commented Jul 11, 2019 •

edited

Loading