Add "runtime_params" resolver to allow overriding of config with CLI params #3036

Merged
merged 14 commits into from
Sep 27, 2023
1 change: 1 addition & 0 deletions RELEASE.md
Expand Up @@ -11,6 +11,7 @@

## Major features and improvements
* Allowed the use of custom cookiecutter templates for creating pipelines with the `--template` flag for `kedro pipeline create` or via the `template/pipeline` folder.
* Allowed overriding of configuration keys with runtime parameters using the `runtime_params` resolver with `OmegaConfigLoader`.

## Bug fixes and other changes
* Updated dataset factories to resolve nested catalog config properly.
Expand Down
55 changes: 37 additions & 18 deletions docs/source/configuration/advanced_configuration.md
Expand Up @@ -4,6 +4,18 @@ The documentation on [configuration](./configuration_basics.md) describes how to
By default, Kedro is set up to use the [ConfigLoader](/kedro.config.ConfigLoader) class. Kedro also provides two additional configuration loaders with more advanced functionality: the [TemplatedConfigLoader](/kedro.config.TemplatedConfigLoader) and the [OmegaConfigLoader](/kedro.config.OmegaConfigLoader).
Each of these classes is an alternative to the default `ConfigLoader` and has different features. The following sections describe each of these classes and their specific functionality in more detail.

This page also provides guidance on advanced configuration requirements for standard Kedro projects:

* [How to change which configuration files are loaded](#how-to-change-which-configuration-files-are-loaded)
* [How to ensure non default configuration files get loaded](#how-to-ensure-non-default-configuration-files-get-loaded)
* [How to bypass the configuration loading rules](#how-to-bypass-the-configuration-loading-rules)
* [How to use Jinja2 syntax in configuration](#how-to-use-jinja2-syntax-in-configuration)
* [How to do templating with the `OmegaConfigLoader`](#how-to-do-templating-with-the-omegaconfigloader)
* [How to use global variables with the `OmegaConfigLoader`](#how-to-use-global-variables-with-the-omegaconfigloader)
* [How to override configuration with runtime parameters with the `OmegaConfigLoader`](#how-to-override-configuration-with-runtime-parameters-with-the-omegaconfigloader)
* [How to use resolvers in the `OmegaConfigLoader`](#how-to-use-resolvers-in-the-omegaconfigloader)
* [How to load credentials through environment variables with `OmegaConfigLoader`](#how-to-load-credentials-through-environment-variables)

## OmegaConfigLoader

[OmegaConf](https://omegaconf.readthedocs.io/) is a Python library designed to handle and manage settings. It serves as a YAML-based hierarchical system to organise configurations, which can be structured to accommodate various sources, allowing you to merge settings from multiple locations.
Expand All @@ -23,12 +35,6 @@ from kedro.config import OmegaConfigLoader # new import

CONFIG_LOADER_CLASS = OmegaConfigLoader
```
### Advanced `OmegaConfigLoader` features
Some advanced use cases of `OmegaConfigLoader` are listed below:
- [How to do templating with the `OmegaConfigLoader`](#how-to-do-templating-with-the-omegaconfigloader)
- [How to use global variables with the `OmegaConfigLoader`](#how-to-use-global-variables-with-the-omegaconfigloader)
- [How to use resolvers in the `OmegaConfigLoader`](#how-to-use-resolvers-in-the-omegaconfigloader)
- [How to load credentials through environment variables](#how-to-load-credentials-through-environment-variables)

## TemplatedConfigLoader

Expand Down Expand Up @@ -127,16 +133,6 @@ If you specify both `globals_pattern` and `globals_dict` in `CONFIG_LOADER_ARGS`

## Advanced Kedro configuration

This section contains a set of guidance for advanced configuration requirements of standard Kedro projects:
* [How to change which configuration files are loaded](#how-to-change-which-configuration-files-are-loaded)
* [How to ensure non default configuration files get loaded](#how-to-ensure-non-default-configuration-files-get-loaded)
* [How to bypass the configuration loading rules](#how-to-bypass-the-configuration-loading-rules)
* [How to use Jinja2 syntax in configuration](#how-to-use-jinja2-syntax-in-configuration)
* [How to do templating with the `OmegaConfigLoader`](#how-to-do-templating-with-the-omegaconfigloader)
* [How to use global variables with the `OmegaConfigLoader`](#how-to-use-global-variables-with-the-omegaconfigloader)
* [How to use resolvers in the `OmegaConfigLoader`](#how-to-use-resolvers-in-the-omegaconfigloader)
* [How to load credentials through environment variables](#how-to-load-credentials-through-environment-variables)

### How to change which configuration files are loaded
If you want to change the patterns that the configuration loader uses to find the files to load, you need to set the `CONFIG_LOADER_ARGS` variable in [`src/<package_name>/settings.py`](../kedro_project_setup/settings.md).
For example, if your `parameters` files are using a `params` naming convention instead of `parameters` (e.g. `params.yml`) you need to update `CONFIG_LOADER_ARGS` as follows:
Expand Down Expand Up @@ -300,9 +296,32 @@ You can also provide a default value to be used in case the global variable does
```yaml
my_param: "${globals: nonexistent_global, 23}"
```
If there are duplicate keys in the globals files in your base and run time environments, the values in the run time environment
will overwrite the values in your base environment.
If there are duplicate keys in the globals files in your base and runtime environments, the values in the runtime environment
overwrite the values in your base environment.

### How to override configuration with runtime parameters with the `OmegaConfigLoader`

Kedro allows you to [specify runtime parameters for the `kedro run` command with the `--params` CLI option](parameters.md#how-to-specify-parameters-at-runtime). These runtime parameters
are added to the `KedroContext` and merged with parameters from the configuration files to be used in your project's pipelines and nodes. From Kedro `0.18.14`, you can use the
`runtime_params` resolver to indicate that you want to override values of certain keys in your configuration with runtime parameters provided through the CLI option.
This resolver can be used across configuration types such as parameters and catalog, but not in `globals`.

Consider this `parameters.yml` file:
```yaml
model_options:
  random_state: "${runtime_params:random}"
```
This will allow you to pass a runtime parameter named `random` through the CLI to specify the value of `model_options.random_state` in your project's parameters:
```bash
kedro run --params random=3
```
You can also specify a default value to be used in case the runtime parameter is not specified with the `kedro run` command. Consider this catalog entry:
```yaml
companies:
  type: pandas.CSVDataSet
  filepath: "${runtime_params:folder, 'data/01_raw'}/companies.csv"
```
If the `folder` parameter is not passed through the CLI `--params` option with `kedro run`, the default value `data/01_raw` is used, so `filepath` resolves to `data/01_raw/companies.csv`.

### How to use resolvers in the `OmegaConfigLoader`
Instead of hard-coding values in your configuration files, you can also dynamically compute them using [`OmegaConf`'s
Expand Down
77 changes: 67 additions & 10 deletions kedro/config/omegaconf_config.py
Expand Up @@ -11,7 +11,7 @@

import fsspec
from omegaconf import OmegaConf
from omegaconf.errors import InterpolationResolutionError
from omegaconf.errors import InterpolationResolutionError, UnsupportedInterpolationType
from omegaconf.resolvers import oc
from yaml.parser import ParserError
from yaml.scanner import ScannerError
Expand Down Expand Up @@ -141,8 +141,17 @@ def __init__( # noqa: too-many-arguments
env=env,
runtime_params=runtime_params,
)
        try:
            self._globals = self["globals"]
        except MissingConfigException:
            self._globals = {}

def __getitem__(self, key) -> dict[str, Any]:
    def __setitem__(self, key, value):
        if key == "globals":
            self._globals = value
        super().__setitem__(key, value)
Review comment (Contributor): I think we should add a code comment explaining why we treat `globals` differently.
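
One plausible reading of the special case, sketched with the standard library only: the loader reads `globals` once and keeps a private copy, so later lookups by the globals resolver do not go through `__getitem__` again, and `__setitem__` keeps that copy in sync when a user assigns `globals` explicitly. `SketchLoader`, `load_count`, and `get_global` are hypothetical names for illustration and are not part of Kedro; the PR itself does not state this rationale.

```python
from collections import UserDict


class SketchLoader(UserDict):
    """Hypothetical stand-in for a config loader; not Kedro's class."""

    def __init__(self, files):
        super().__init__()
        self._files = files  # pretend on-disk config, keyed by config type
        self.load_count = 0  # counts how often a "file load" happens
        try:
            self._globals = self["globals"]  # load once, up front
        except KeyError:
            self._globals = {}  # no globals file is not an error

    def __setitem__(self, key, value):
        if key == "globals":
            self._globals = value  # keep the cached copy in sync with overrides
        super().__setitem__(key, value)

    def __getitem__(self, key):
        if key in self.data:  # an explicitly set value bypasses loading
            return self.data[key]
        self.load_count += 1
        return dict(self._files[key])

    def get_global(self, name):
        # Reads the cached copy, so no reload is triggered per lookup.
        return self._globals[name]


loader = SketchLoader({"globals": {"bucket": "a"}})
loader.get_global("bucket")
loader.get_global("bucket")
print(loader.load_count)
```

Without the cached copy, each resolver lookup would call `self["globals"]` and repeat the load, including any side effects that the load has on registered resolvers.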


def __getitem__(self, key) -> dict[str, Any]: # noqa: PLR0912
"""Get configuration files by key, load and merge them, and
return them in the form of a config dictionary.

Expand All @@ -161,6 +170,10 @@ def __getitem__(self, key) -> dict[str, Any]:
"""
# Allow bypassing of loading config from patterns if a key and value have been set
# explicitly on the ``OmegaConfigLoader`` instance.

# Re-register the runtime params resolver in case it was previously deactivated
self._register_runtime_params_resolver()

if key in self:
return super().__getitem__(key)

Expand All @@ -170,6 +183,10 @@ def __getitem__(self, key) -> dict[str, Any]:
)
patterns = [*self.config_patterns[key]]

if key == "globals":
# "runtime_params" resolver is not allowed in globals.
OmegaConf.clear_resolver("runtime_params")

read_environment_variables = key == "credentials"

processed_files: set[Path] = set()
Expand All @@ -178,9 +195,18 @@ def __getitem__(self, key) -> dict[str, Any]:
base_path = str(Path(self.conf_source) / self.base_env)
else:
base_path = str(Path(self._fs.ls("", detail=False)[-1]) / self.base_env)
base_config = self.load_and_merge_dir_config(
base_path, patterns, key, processed_files, read_environment_variables
)
        try:
            base_config = self.load_and_merge_dir_config(
                base_path, patterns, key, processed_files, read_environment_variables
            )
        except UnsupportedInterpolationType as exc:
            if "runtime_params" in str(exc):
                raise UnsupportedInterpolationType(
                    "The `runtime_params:` resolver is not supported for globals."
                )
            else:
                raise exc

config = base_config

# Load chosen env config
Expand All @@ -189,9 +215,18 @@ def __getitem__(self, key) -> dict[str, Any]:
env_path = str(Path(self.conf_source) / run_env)
else:
env_path = str(Path(self._fs.ls("", detail=False)[-1]) / run_env)
env_config = self.load_and_merge_dir_config(
env_path, patterns, key, processed_files, read_environment_variables
)
        try:
            env_config = self.load_and_merge_dir_config(
                env_path, patterns, key, processed_files, read_environment_variables
            )
        except UnsupportedInterpolationType as exc:
            if "runtime_params" in str(exc):
                raise UnsupportedInterpolationType(
                    "The `runtime_params:` resolver is not supported for globals."
                )
            else:
                raise exc

# Destructively merge the two env dirs. The chosen env will override base.
common_keys = config.keys() & env_config.keys()
if common_keys:
Expand All @@ -209,6 +244,7 @@ def __getitem__(self, key) -> dict[str, Any]:
f"No files of YAML or JSON format found in {base_path} or {env_path} matching"
f" the glob pattern(s): {[*self.config_patterns[key]]}"
)

return config

def __repr__(self): # pragma: no cover
Expand Down Expand Up @@ -297,6 +333,7 @@ def load_and_merge_dir_config( # noqa: too-many-arguments
return OmegaConf.to_container(
OmegaConf.merge(*aggregate_config, self.runtime_params), resolve=True
)

return {
k: v
for k, v in OmegaConf.to_container(
Expand All @@ -322,15 +359,22 @@ def _register_globals_resolver(self):
replace=True,
)

    def _register_runtime_params_resolver(self):
        OmegaConf.register_new_resolver(
            "runtime_params",
            self._get_runtime_value,
            replace=True,
        )

def _get_globals_value(self, variable, default_value=_NO_VALUE):
"""Return the globals values to the resolver"""
if variable.startswith("_"):
raise InterpolationResolutionError(
"Keys starting with '_' are not supported for globals."
)
global_omegaconf = OmegaConf.create(self["globals"])
globals_oc = OmegaConf.create(self._globals)
interpolated_value = OmegaConf.select(
global_omegaconf, variable, default=default_value
globals_oc, variable, default=default_value
)
if interpolated_value != _NO_VALUE:
return interpolated_value
Expand All @@ -339,6 +383,19 @@ def _get_globals_value(self, variable, default_value=_NO_VALUE):
f"Globals key '{variable}' not found and no default value provided."
)

    def _get_runtime_value(self, variable, default_value=_NO_VALUE):
        """Return the runtime params values to the resolver"""
        runtime_oc = OmegaConf.create(self.runtime_params)
        interpolated_value = OmegaConf.select(
            runtime_oc, variable, default=default_value
        )
        if interpolated_value != _NO_VALUE:
            return interpolated_value
        else:
            raise InterpolationResolutionError(
                f"Runtime parameter '{variable}' not found and no default value provided."
            )
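
The `_NO_VALUE` default used by both lookup methods is a sentinel: it lets the code tell "no default was given" apart from legitimate defaults such as `None`, `0`, or `False`. A dependency-free sketch of the same pattern, with a hypothetical `select` function standing in for `OmegaConf.select`:

```python
_NO_VALUE = object()  # unique object that no caller can pass by accident


def select(mapping, dotted_key, default=_NO_VALUE):
    """Look up ``a.b.c`` style keys in nested dicts (hypothetical stand-in)."""
    node = mapping
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            node = default  # key path is missing: fall back to the default
            break
        node = node[part]
    if node is _NO_VALUE:
        raise KeyError(
            f"Runtime parameter '{dotted_key}' not found and no default value provided."
        )
    return node


params = {"model": {"seed": 0}}
print(select(params, "model.seed"))        # a falsy value that is present
print(select(params, "model.depth", None)) # explicit default of None
```

Had the default been `None`, a caller asking for `None` as the fallback would be indistinguishable from a caller who supplied nothing.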

@staticmethod
def _register_new_resolvers(resolvers: dict[str, Callable]):
"""Register custom resolvers"""
Expand Down