Logging

Kedro uses Python's logging library. Logging is configured with a dictionary that follows the Python logging configuration schema. Kedro's default logging configuration is described below.

By default, Python only shows logging messages at level WARNING and above. Kedro's logging configuration specifies that INFO level messages from Kedro should also be emitted. This makes it easier to track the progress of your pipeline when you perform a kedro run.

Default logging configuration

Kedro's default logging configuration defines a handler called rich that uses the Rich logging handler to format messages. We also use the Rich traceback handler to render exceptions.

How to perform logging in your Kedro project

To add logging to your own code (e.g. in a node):

import logging

logger = logging.getLogger(__name__)
logger.warning("Issue warning")
logger.info("Send information")
logger.debug("Useful information for debugging")
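As a minimal sketch of how these calls might sit inside a node, the example below logs from a plain Python function. The function name `clean_rows` and its behaviour are hypothetical and only serve to illustrate the pattern; any function used as a node can log in the same way.

```python
import logging

logger = logging.getLogger(__name__)


def clean_rows(rows):
    """Hypothetical node function: drop empty rows and log what happened."""
    kept = [row for row in rows if row]
    dropped = len(rows) - len(kept)

    # DEBUG messages are hidden unless the logger level is lowered to DEBUG.
    logger.debug("Received %d rows", len(rows))
    if dropped:
        logger.warning("Dropped %d empty rows", dropped)
    logger.info("Returning %d rows", len(kept))
    return kept
```

Passing the values as arguments (rather than formatting the string yourself) lets the logging library skip the formatting work when a message is filtered out.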

You can use Rich's console markup in your logging calls:

logger.error("[bold red blink]Important error message![/]", extra={"markup": True})

How to customise Kedro logging

To customise logging in your Kedro project, you need to specify the path to a project-specific logging configuration file. Set the environment variable KEDRO_LOGGING_CONFIG to override the default logging configuration. Point the variable to your project-specific configuration, which we recommend you store inside the project's conf folder and name logging.yml.

For example, you can set KEDRO_LOGGING_CONFIG by typing the following into your terminal:

export KEDRO_LOGGING_CONFIG=<project_root>/conf/logging.yml

After setting the environment variable, any subsequent Kedro commands use the logging configuration file at the specified path.

If the `KEDRO_LOGGING_CONFIG` environment variable is not set, Kedro will use the [default logging configuration](https://github.com/kedro-org/kedro/blob/main/kedro/framework/project/default_logging.yml).

Change the verbosity of specific parts of Kedro

You can also customise logging at runtime, for example in a Jupyter notebook, and override the configuration provided in logging.yml. The example below changes the logging level of the kedro.io.data_catalog logger from the default INFO to WARNING. Logging for the rest of the components remains unchanged. The same can be done for higher- or lower-level components without affecting the top-level kedro logger.

Add the following to a cell in your notebook:

import logging


logging.getLogger("kedro.io.data_catalog").setLevel(logging.WARNING)
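To check that only the targeted logger is affected, you can inspect the effective level of a few loggers. This is a sketch using the standard library only: the first `setLevel` call stands in for Kedro's configuration (which sets the `kedro` logger to INFO), and `kedro.runner` is used here purely as an example of a sibling logger name.

```python
import logging

# Stand-in for Kedro's logging configuration, which sets "kedro" to INFO.
logging.getLogger("kedro").setLevel(logging.INFO)

# The runtime override from the example above.
logging.getLogger("kedro.io.data_catalog").setLevel(logging.WARNING)

# Loggers without an explicit level inherit from their closest configured parent.
effective_levels = {
    name: logging.getLevelName(logging.getLogger(name).getEffectiveLevel())
    for name in ("kedro", "kedro.io", "kedro.io.data_catalog", "kedro.runner")
}

for name, level in effective_levels.items():
    print(f"{name}: {level}")
```

Only `kedro.io.data_catalog` reports WARNING. The other loggers have no level of their own, so they keep inheriting INFO from `kedro`.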

Custom CONF_SOURCE with logging

When you customise the CONF_SOURCE setting in your Kedro project, it determines where Kedro looks for configuration files, including the logging configuration file. However, changing CONF_SOURCE does not automatically update the path to logging.yml. To use a custom location or filename for the logging configuration, you must explicitly set the KEDRO_LOGGING_CONFIG environment variable.

By default, Kedro looks for a file named logging.yml in the conf directory. If you move or rename your logging configuration file after changing CONF_SOURCE, specify the new path using the KEDRO_LOGGING_CONFIG environment variable:

export KEDRO_LOGGING_CONFIG=<project_root>/custom_config_folder/custom_logging_name.yml

Please note that adjusting CONF_SOURCE or renaming logging.yml without updating the logging configuration accordingly can lead to Kedro not locating the file, which will result in the default logging settings being used instead.
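The lookup described in this section can be summarised with a small helper. This is an illustrative sketch of the behaviour as described in the text, not Kedro's actual implementation; the function name `describe_logging_source` is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def describe_logging_source(project_root, environ=None):
    """Hypothetical helper: report which logging configuration the text implies."""
    environ = os.environ if environ is None else environ

    # An explicitly set KEDRO_LOGGING_CONFIG always takes precedence.
    explicit = environ.get("KEDRO_LOGGING_CONFIG")
    if explicit:
        return f"KEDRO_LOGGING_CONFIG: {explicit}"

    # Otherwise look for logging.yml in the conf directory.
    conventional = Path(project_root) / "conf" / "logging.yml"
    if conventional.is_file():
        return f"project file: {conventional}"

    # If neither is found, the default settings apply.
    return "Kedro's default logging configuration"


# Demonstration in a temporary directory standing in for <project_root>.
demo_root = Path(tempfile.mkdtemp())
without_file = describe_logging_source(demo_root, environ={})

(demo_root / "conf").mkdir()
(demo_root / "conf" / "logging.yml").write_text("version: 1\n", encoding="utf8")
with_file = describe_logging_source(demo_root, environ={})

with_env = describe_logging_source(
    demo_root, environ={"KEDRO_LOGGING_CONFIG": "custom_config_folder/custom_logging_name.yml"}
)
```

A check like this can help confirm which file you expect Kedro to pick up after moving or renaming your configuration.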

How to show DEBUG level messages

To see DEBUG level messages, change the level of logging in your project-specific logging configuration file (logging.yml). We provide a logging.yml template:

version: 1

disable_existing_loggers: False

formatters:
  simple:
    format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"

handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: simple
    stream: ext://sys.stdout

  info_file_handler:
    class: logging.handlers.RotatingFileHandler
    level: INFO
    formatter: simple
    filename: info.log
    maxBytes: 10485760 # 10MB
    backupCount: 20
    encoding: utf8
    delay: True

  rich:
    class: kedro.logging.RichHandler
    rich_tracebacks: True
    # Advanced options for customisation.
    # See https://docs.kedro.org/en/stable/logging/index.html#how-to-perform-logging-in-your-kedro-project
    # tracebacks_show_locals: False

loggers:
  kedro:
    level: INFO

  your_python_package:
    level: INFO

root:
  handlers: [rich]

You need to change the line:

loggers:
  kedro:
    level: INFO

  your_python_package:
-   level: INFO
+   level: DEBUG

The name of a logger corresponds to a key in the `loggers` section of the logging configuration file (e.g. `kedro`). See [Python's logging documentation](https://docs.python.org/3/library/logging.html#logger-objects) for more information.

By changing the level value to DEBUG for the desired logger (e.g. <your_python_package>), you will start seeing DEBUG level messages in the log output.
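The effect of this change can be reproduced with the standard library alone. The sketch below expresses a reduced version of the template as a Python dictionary and applies it with `logging.config.dictConfig`. As assumptions for the sake of a self-contained example, a plain `StreamHandler` stands in for the rich handler, and its level is set to DEBUG so that it does not filter the messages itself (the template's `console` handler is set to INFO).

```python
import logging
import logging.config

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"}
    },
    "handlers": {
        # Stand-in for the rich handler so the example needs no extra packages.
        "console": {
            "class": "logging.StreamHandler",
            "level": "DEBUG",
            "formatter": "simple",
            "stream": "ext://sys.stdout",
        }
    },
    "loggers": {
        "kedro": {"level": "INFO"},
        "your_python_package": {"level": "DEBUG"},  # changed from INFO
    },
    "root": {"handlers": ["console"]},
}

logging.config.dictConfig(LOGGING_CONFIG)

# Child loggers inherit the level of the package-level logger.
package_logger = logging.getLogger("your_python_package.nodes")
kedro_logger = logging.getLogger("kedro.runner")

package_logger.debug("Shown: your_python_package is set to DEBUG")
kedro_logger.debug("Hidden: kedro is still set to INFO")
```

Note that a handler's own `level` also filters messages, so a DEBUG logger attached only to a handler with `level: INFO` will still not show DEBUG output.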

Advanced logging

In addition to the rich handler defined in Kedro's framework, we provide two additional handlers in the template.

  • console: show logs on standard output (typically your terminal screen) without any rich formatting
  • info_file_handler: write logs of level INFO and above to info.log

The following section illustrates some common examples of how to change your project's logging configuration.

How to customise the rich handler

Kedro's kedro.logging.RichHandler is a subclass of rich.logging.RichHandler and supports the same set of arguments. By default, rich_tracebacks is set to True to use rich to render exceptions. However, you can disable it by setting rich_tracebacks: False.

If you want to disable `rich`'s tracebacks, you must set `KEDRO_LOGGING_CONFIG` to point to your local configuration file, for example `conf/logging.yml`.

When rich_tracebacks is set to True, the configuration is propagated to rich.traceback.install. If an argument is compatible with rich.traceback.install, it will be passed to the traceback's settings.

For instance, you can enable the display of local variables inside logging.yml to aid with debugging.

  rich:
    class: kedro.logging.RichHandler
    rich_tracebacks: True
+   tracebacks_show_locals: True

A comprehensive list of available options can be found in the RichHandler documentation.

How to enable file-based logging

File-based logging in Python projects aids troubleshooting and debugging. It offers better visibility into an application's behaviour, and log files are easy to search. However, it does not work well with read-only systems such as Databricks Repos.

To enable file-based logging, add info_file_handler to the root logger in your conf/logging.yml as follows:

 root:
-  handlers: [rich]
+  handlers: [rich, info_file_handler]

By default, the handler only records messages at INFO level and above, but it can be configured to capture any level of logs.
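For reference, the `info_file_handler` entry in the template corresponds roughly to the following standard-library setup. This is a sketch rather than what Kedro runs internally: the temporary directory and the logger name `file_logging_demo` are assumptions made so that the example is self-contained.

```python
import logging
import logging.handlers
import tempfile
from pathlib import Path

# Hypothetical location; the template writes to info.log in the working directory.
log_path = Path(tempfile.mkdtemp()) / "info.log"

file_handler = logging.handlers.RotatingFileHandler(
    log_path,
    maxBytes=10 * 1024 * 1024,  # 10485760 bytes (10MB), as in the template
    backupCount=20,
    encoding="utf8",
    delay=True,  # the file is only opened when the first record is written
)
file_handler.setLevel(logging.INFO)
file_handler.setFormatter(
    logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
)

demo_logger = logging.getLogger("file_logging_demo")  # hypothetical logger name
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False  # keep the demo output out of other handlers
demo_logger.addHandler(file_handler)

demo_logger.debug("Below the handler's level, so not written")
demo_logger.info("Written to info.log")
file_handler.close()

written = log_path.read_text(encoding="utf8")
print(written)
```

Lowering the handler's level (for example to DEBUG) is what allows the file to capture more detailed messages, provided the logger's own level lets them through.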

How to use plain console logging

To use plain rather than rich logging, swap the rich handler for the console one as follows:

 root:
-  handlers: [rich]
+  handlers: [console]

How to enable rich logging in a dumb terminal

Rich detects whether your terminal is capable of displaying richly formatted messages. If your terminal is "dumb" then formatting is automatically stripped out so that the logs are just plain text. This is likely to happen if you perform kedro run on CI (e.g. GitHub Actions or CircleCI).

If the default wrapping of log messages is too narrow but you do not want to switch to the console handler on CI, the simplest way to control the wrapping is to set the COLUMNS and LINES environment variables. For example:

export COLUMNS=120 LINES=25

You must provide a value for both `COLUMNS` and `LINES` even if you only wish to change the width of the log message. Rich's default values for these variables are `COLUMNS=80` and `LINES=25`.

How to enable rich logging in Jupyter

Rich also formats the logs in JupyterLab and Jupyter Notebook. The size of the output console does not adapt to your window but can be controlled through the JUPYTER_COLUMNS and JUPYTER_LINES environment variables. The default values (115 and 100 respectively) should be suitable for most users, but if you require a different output console size then you should alter the values of JUPYTER_COLUMNS and JUPYTER_LINES.

How to use logging without the rich library

If you prefer not to have the rich library in your Kedro project, you have the option to uninstall it. However, it's important to note that versions of the cookiecutter library above 2.3 have a dependency on rich. You will need to downgrade cookiecutter to a version below 2.3 to have Kedro work without rich.

To uninstall the rich library, run:

pip uninstall rich

To downgrade cookiecutter to a version that does not require rich, you can specify a version below 2.3. For example:

pip install cookiecutter==2.2.0

These changes will affect the visual appearance and formatting of Kedro's logging, prompts, and the output of the kedro ipython command. While using a version of cookiecutter below 2.3, the appearance of the prompts will be plain even with rich installed.