Add docs on difference between OmegaConf and OmegaConfigLoader #3352

Merged
merged 8 commits into from
Nov 30, 2023
25 changes: 25 additions & 0 deletions docs/source/configuration/advanced_configuration.md
@@ -10,6 +10,7 @@
* [How to ensure non default configuration files get loaded](#how-to-ensure-non-default-configuration-files-get-loaded)
* [How to bypass the configuration loading rules](#how-to-bypass-the-configuration-loading-rules)
* [How to do templating with the `OmegaConfigLoader`](#how-to-do-templating-with-the-omegaconfigloader)
* [How to load a data catalog with templating in code?](#how-to-load-a-data-catalog-with-templating-in-code)
* [How to use global variables with the `OmegaConfigLoader`](#how-to-use-global-variables-with-the-omegaconfigloader)
* [How to override configuration with runtime parameters with the `OmegaConfigLoader`](#how-to-override-configuration-with-runtime-parameters-with-the-omegaconfigloader)
* [How to use resolvers in the `OmegaConfigLoader`](#how-to-use-resolvers-in-the-omegaconfigloader)
@@ -133,6 +134,30 @@
#### Other configuration files
You can also use variable interpolation in configuration files other than parameters and catalog, such as custom Spark or MLflow configuration. This works in the same way as variable interpolation in parameter files. You can still prefix templated values with an underscore if you want, but unlike in catalog files it is not mandatory.

### How to load a data catalog with templating in code?
If you want to load a data catalog that contains templating directly in code, you can use the `OmegaConfigLoader`. Under the hood, the `OmegaConfigLoader` resolves any templates, so no further steps are required to load catalog entries properly.

```yaml
# Example catalog with templating
companies:
  type: ${_dataset_type}
  filepath: data/01_raw/companies.csv

_dataset_type: pandas.CSVDataset
```

```python
from pathlib import Path

from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings

# Assumption: this code runs from the project root; otherwise point `project_path` at it.
project_path = Path.cwd()

# Instantiate an `OmegaConfigLoader` instance with the location of your project configuration.
conf_path = str(project_path / settings.CONF_SOURCE)
conf_loader = OmegaConfigLoader(conf_source=conf_path)

conf_catalog = conf_loader["catalog"]
# conf_catalog["companies"]
# Will result in: {'type': 'pandas.CSVDataset', 'filepath': 'data/01_raw/companies.csv'}
```
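
To make "resolving templates" more concrete, here is a toy sketch that uses only the standard library. It is not how `OmegaConf` implements interpolation, which is far more general. The `resolve_templates` helper is hypothetical and assumes that placeholders refer to top-level keys only.

```python
import re

# Matches placeholders such as ${_dataset_type}.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def resolve_templates(node, root):
    """Recursively replace ${key} placeholders with top-level values from root."""
    if isinstance(node, dict):
        return {key: resolve_templates(value, root) for key, value in node.items()}
    if isinstance(node, str):
        return PLACEHOLDER.sub(lambda match: str(root[match.group(1)]), node)
    return node


# The example catalog from above, written as a plain dictionary.
raw_catalog = {
    "companies": {
        "type": "${_dataset_type}",
        "filepath": "data/01_raw/companies.csv",
    },
    "_dataset_type": "pandas.CSVDataset",
}

resolved = resolve_templates(raw_catalog, raw_catalog)
print(resolved["companies"])
# {'type': 'pandas.CSVDataset', 'filepath': 'data/01_raw/companies.csv'}
```

In a real project you do not need a helper like this: the `OmegaConfigLoader` returns the catalog with the placeholders already substituted.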

### How to use global variables with the `OmegaConfigLoader`
From Kedro `0.18.13`, you can use variable interpolation in your configurations using "globals" with `OmegaConfigLoader`.
The benefit of using globals over regular variable interpolation is that the global variables are shared across different configuration types, such as catalog and parameters.
44 changes: 44 additions & 0 deletions docs/source/configuration/configuration_basics.md
@@ -16,6 +16,28 @@

`OmegaConfigLoader` can load `YAML` and `JSON` files. Acceptable file extensions are `.yml`, `.yaml`, and `.json`. By default, any configuration files used by the config loaders in Kedro are `.yml` files.

### `OmegaConf` vs. Kedro's `OmegaConfigLoader`
`OmegaConf` is a Python configuration management library for hierarchical configurations. Kedro's `OmegaConfigLoader` is a component of the Kedro framework that uses `OmegaConf` to handle configuration.
This means that when you work with `OmegaConfigLoader` in Kedro, you use the capabilities of `OmegaConf` without interacting with it directly.

`OmegaConfigLoader` in Kedro is designed to handle more complex configuration setups commonly used in Kedro projects. It automates the process of merging configuration files, such as those for catalogs, and takes into account different environments, making it convenient for managing configurations in a structured way.
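
As a rough mental model of how environments combine, the sketch below merges two plain dictionaries. It is a simplification, not Kedro's implementation. The parameter names are hypothetical, and the sketch assumes that a top-level key in the overriding environment replaces the same key from `base` wholesale. Check the documentation for your Kedro version for the exact merge behaviour.

```python
# Hypothetical contents of conf/base/parameters.yml and conf/local/parameters.yml,
# written as plain dictionaries for illustration.
base_params = {"model": {"learning_rate": 0.01, "epochs": 10}, "test_size": 0.2}
local_params = {"model": {"learning_rate": 0.1}}


def merge_environments(base, override):
    """Return a new dict in which top-level keys from override replace those in base."""
    return {**base, **override}


merged = merge_environments(base_params, local_params)
print(merged)
# {'model': {'learning_rate': 0.1}, 'test_size': 0.2}
```

Under this assumption the whole `model` entry comes from the overriding environment, so `epochs` is no longer present after the merge.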

When you need to load configurations manually, such as for exploration in a notebook, you have two options:
1. Use the `OmegaConfigLoader` class provided by Kedro.
2. Directly use the `OmegaConf` library.

If you need to load a single configuration file and don't need the features of Kedro's `OmegaConfigLoader`, it may be simpler to use `OmegaConf` directly.

```python
from omegaconf import OmegaConf

parameters = OmegaConf.load("/path/to/parameters.yml")
```

When your configuration files are more complex and contain credentials or templating, Kedro's `OmegaConfigLoader` is better suited to load them. For details, see [How to load a data catalog with credentials in code?](#how-to-load-a-data-catalog-with-credentials-in-code) and [How to load a data catalog with templating in code?](advanced_configuration.md#how-to-load-a-data-catalog-with-templating-in-code).

In summary, both `OmegaConf` and Kedro's `OmegaConfigLoader` manage configuration, but the latter is tailored to Kedro projects and handles more intricate configuration structures and environments. Which one to choose depends on the complexity of your configuration and on whether you are working within the Kedro framework.

## Configuration source
The configuration source folder is [`conf`](../get_started/kedro_concepts.md#conf) by default. We recommend that you keep all configuration files in the default `conf` folder of a Kedro project.

@@ -86,6 +108,7 @@
* [How to change the configuration source folder at runtime](#how-to-change-the-configuration-source-folder-at-runtime)
* [How to read configuration from a compressed file](#how-to-read-configuration-from-a-compressed-file)
* [How to access configuration in code](#how-to-access-configuration-in-code)
* [How to load a data catalog with credentials in code?](#how-to-load-a-data-catalog-with-credentials-in-code)
* [How to specify additional configuration environments](#how-to-specify-additional-configuration-environments)
* [How to change the default overriding environment](#how-to-change-the-default-overriding-environment)
* [How to use only one configuration environment](#how-to-use-only-one-configuration-environment)
@@ -159,6 +182,27 @@
conf_catalog = conf_loader["catalog"]
```

### How to load a data catalog with credentials in code?
Assume your project contains a catalog file in the `base` environment and a credentials file in the `local` environment. You can use the `OmegaConfigLoader` to load both configurations, then pass them to a `DataCatalog` object to access the catalog entries with resolved credentials.

```python
from pathlib import Path

from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings
from kedro.io import DataCatalog

# Assumption: this code runs from the project root; otherwise point `project_path` at it.
project_path = Path.cwd()

# Instantiate an `OmegaConfigLoader` instance with the location of your project configuration.
conf_path = str(project_path / settings.CONF_SOURCE)
conf_loader = OmegaConfigLoader(
    conf_source=conf_path, base_env="base", default_run_env="local"
)

# These lines show how to access the catalog and credentials configurations.
conf_catalog = conf_loader["catalog"]
conf_credentials = conf_loader["credentials"]

# Fetch the catalog with resolved credentials from the configuration.
catalog = DataCatalog.from_config(catalog=conf_catalog, credentials=conf_credentials)
```
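
To illustrate what "resolved credentials" means, the following standard-library sketch replaces a credentials name in a catalog entry with the matching credentials mapping. It is a simplified stand-in, not the code `DataCatalog.from_config` runs. The `resolve_credentials` helper, the `motorbikes` dataset, and the `dev_s3` credentials name are all hypothetical.

```python
import copy


def resolve_credentials(catalog, credentials):
    """Replace each entry's credentials name with the matching credentials mapping."""
    resolved = copy.deepcopy(catalog)  # leave the loaded configuration untouched
    for dataset_name, entry in resolved.items():
        name = entry.get("credentials")
        if isinstance(name, str):
            if name not in credentials:
                raise KeyError(
                    f"No credentials named '{name}' for dataset '{dataset_name}'"
                )
            entry["credentials"] = credentials[name]
    return resolved


# Hypothetical stand-ins for the contents of catalog.yml and credentials.yml.
example_catalog = {
    "motorbikes": {
        "type": "pandas.CSVDataset",
        "filepath": "s3://your_bucket/data/02_intermediate/motorbikes.csv",
        "credentials": "dev_s3",
    }
}
example_credentials = {"dev_s3": {"key": "token", "secret": "key"}}

resolved_catalog = resolve_credentials(example_catalog, example_credentials)
print(resolved_catalog["motorbikes"]["credentials"])
# {'key': 'token', 'secret': 'key'}
```

Keeping credentials in a separate file means the catalog only ever stores a name such as `dev_s3`, so the catalog file can be shared without exposing secrets.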

### How to specify additional configuration environments
In addition to the two built-in `local` and `base` configuration environments, you can create your own. Your project loads `conf/base/` as the bottom-level configuration environment but allows you to overwrite it with any other environments that you create, such as `conf/server/` or `conf/test/`. To use additional configuration environments, run the following command:
