OmegaConfigLoader returns Dict instead of DictConfig, resolves runtime_params properly #2467

Merged May 11, 2023
25 commits
4565684
Merge branch 'main' of github.com:kedro-org/kedro
noklam Feb 10, 2023
67cff2d
Merge branch 'main' of github.com:kedro-org/kedro
noklam Feb 21, 2023
055124e
Merge branch 'main' of github.com:kedro-org/kedro
noklam Mar 22, 2023
842e83c
Merge branch 'main' of github.com:kedro-org/kedro
noklam Mar 23, 2023
d6d224a
Fix typehint
noklam Mar 24, 2023
78c1ae0
test push
noklam Mar 27, 2023
5b48003
Merge branch 'main' of github.com:kedro-org/kedro into fix-omegaconfig
noklam Mar 29, 2023
5b0bff8
POC of fix to solve the runtime param resolution problem
noklam Mar 29, 2023
c864783
Merge branch 'main' of github.com:kedro-org/kedro into fix-omegaconfig
noklam May 2, 2023
2c9da40
Fix OmegaConfigLoadaer - resolve runtime_params early
noklam May 2, 2023
f578c39
Delegate the intialization of runtime_params to AbstractConfigLoader
noklam May 2, 2023
de98e15
Add test for interpolated value and globals
noklam May 2, 2023
4c0571d
add more test and linting
noklam May 2, 2023
f1c1ec2
refactor
noklam May 2, 2023
acbde3e
update release note
noklam May 2, 2023
87a51b6
Apply comments and refactor the test
noklam May 4, 2023
a1b5f0c
Update RELEASE.md
noklam May 4, 2023
29f62ff
Remove unnecessary condition when len(config) == 1
noklam May 5, 2023
f610010
Update release note
noklam May 5, 2023
5799792
Merge branch 'main' into fix-omegaconfig
noklam May 5, 2023
78c2371
Merge branch 'main' into fix-omegaconfig
noklam May 10, 2023
e24fbfe
Merge branch 'fix-omegaconfig' of github.com:kedro-org/kedro into fix…
noklam May 10, 2023
0f7911f
Remove unused import
noklam May 10, 2023
e7d6f62
remove a line from coverage temporarily
noklam May 11, 2023
a4d9893
Merge branch 'main' into fix-omegaconfig
noklam May 11, 2023
4 changes: 4 additions & 0 deletions RELEASE.md
Contributor

Personally I'd put all this as bug fix rather than major improvement.

Contributor Author

It's unclear to me whether this is an officially supported feature. My reasoning is as follows:

We never fixed the TemplatedConfigLoader because we said it needed more thought for 0.19. If this counts as a bug fix, I don't see why we can't fix it for TemplatedConfigLoader in 0.18.x too, instead of requiring users to subclass it.

In fact, TemplatedConfigLoader doesn't suffer from the residual problem so it can be safely used for all configurations. There is nothing wrong with merging the runtime_params as this is exactly what we are doing now.

#1782 (comment)

Contributor

This is a fair comment I think for the "interpolated parameters" point, but I still think the dict vs. DictConfig point should be under bug fix.

Contributor Author

@antonymilne I agree with this and I will move it to bug fix.

@@ -11,7 +11,11 @@
 # Upcoming Release 0.18.9

 ## Major features and improvements
+* `kedro run --params` now updates interpolated parameters correctly when using `OmegaConfigLoader`.
+
 ## Bug fixes and other changes
+* `OmegaConfigLoader` will return a `dict` instead of `DictConfig`.
+
 ## Breaking changes to the API
 * `kedro package` does not produce `.egg` files anymore, and now relies exclusively on `.whl` files.
 ## Upcoming deprecations for Kedro 0.19.0
2 changes: 1 addition & 1 deletion kedro/config/abstract_config.py
@@ -24,7 +24,7 @@ def __init__(
         super().__init__()
         self.conf_source = conf_source
         self.env = env
-        self.runtime_params = runtime_params
+        self.runtime_params = runtime_params or {}


 class BadConfigException(Exception):
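The `or {}` default matters because the loader later passes `self.runtime_params` directly into a merge. Below is a minimal sketch of that idea, assuming plain-dict merging in place of OmegaConf; `SketchConfigLoader` and `merged_parameters` are hypothetical names for illustration, not Kedro's actual API.

```python
from __future__ import annotations

from typing import Any


class SketchConfigLoader:
    """Hypothetical stand-in for a config loader; not Kedro's real class."""

    def __init__(self, runtime_params: dict[str, Any] | None = None):
        # Normalise None to {} so later merges need no None check.
        self.runtime_params = runtime_params or {}

    def merged_parameters(self, file_params: dict[str, Any]) -> dict[str, Any]:
        # Later entries win, so runtime params override values read from files.
        return {**file_params, **self.runtime_params}


no_overrides = SketchConfigLoader().merged_parameters({"param1": 1})
with_overrides = SketchConfigLoader({"param1": 5}).merged_parameters({"param1": 1})
```

In this sketch, leaving `runtime_params` as `None` would make the dict unpacking in `merged_parameters` raise a `TypeError`, which is the kind of special case the default avoids.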
16 changes: 11 additions & 5 deletions kedro/config/omegaconf_config.py
@@ -168,7 +168,7 @@ def __getitem__(self, key) -> dict[str, Any]:
         else:
             base_path = str(Path(self._fs.ls("", detail=False)[-1]) / self.base_env)
         base_config = self.load_and_merge_dir_config(
-            base_path, patterns, read_environment_variables
+            base_path, patterns, key, read_environment_variables
         )
         config = base_config

@@ -179,7 +179,7 @@ def __getitem__(self, key) -> dict[str, Any]:
         else:
             env_path = str(Path(self._fs.ls("", detail=False)[-1]) / run_env)
         env_config = self.load_and_merge_dir_config(
-            env_path, patterns, read_environment_variables
+            env_path, patterns, key, read_environment_variables
         )

         # Destructively merge the two env dirs. The chosen env will override base.
@@ -211,6 +211,7 @@ def load_and_merge_dir_config(
         self,
         conf_path: str,
         patterns: Iterable[str],
+        key: str,
         read_environment_variables: bool | None = False,
     ) -> dict[str, Any]:
         """Recursively load and merge all configuration files in a directory using OmegaConf,
@@ -219,6 +220,7 @@ def load_and_merge_dir_config(
         Args:
             conf_path: Path to configuration directory.
             patterns: List of glob patterns to match the filenames against.
+            key: Key of the configuration type to fetch.
             read_environment_variables: Whether to resolve environment variables.

         Raises:
@@ -275,9 +277,13 @@ def load_and_merge_dir_config(

         if not aggregate_config:
             return {}
-        if len(aggregate_config) == 1:
-            return list(aggregate_config)[0]
-        return dict(OmegaConf.merge(*aggregate_config))

Contributor

I think the flow here is a little unclear (too many different return cases) and, more importantly, unless I'm missing something there's actually a bug: if you have a single element in aggregate_config (i.e. one parameters file) then runtime_params don't get merged in.

This would be solved if we removed lines 276-279. I know @merelcht said there was some reason to pick out the len == 1 case before but I don't see what this could be now that we always want to do to_container(resolve=True).

Contributor Author

I would be keen on removing it, unless we have a specific failing case. I tried removing lines 278-279 and all test cases passed.

Should we split this into a separate issue? Removing lines 276-277 causes a few tests to fail. I suspect this is also linked to #2556, and some refactoring may be needed.

Member

I'm happy for those lines to be removed if everything is still working fine. I don't remember why I added it, so really should have left a comment 😅

Contributor

I think we should remove lines 278-279 here. As I understand it, they are not just unnecessary: following this change they will actually cause a bug (it would be good if you could check this). If you have just a single parameters.yml file, the runtime_params don't get merged in.

Lines 276-277 don't actually cause a bug I think, so if removing them makes some tests fail and it seems related to #2556 then let's leave those here for now and try to remove them in that PR.
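The single-file bug raised in this thread can be illustrated with a small sketch. It assumes, as the review describes, that the `len == 1` shortcut ran before the runtime parameters were merged. Plain dicts stand in for OmegaConf objects, and both function names are hypothetical.

```python
from __future__ import annotations

from typing import Any


def merge_with_shortcut(
    aggregate_config: list[dict[str, Any]], runtime_params: dict[str, Any]
) -> dict[str, Any]:
    """Assumed flagged flow: a lone file returns before runtime params apply."""
    if not aggregate_config:
        return {}
    if len(aggregate_config) == 1:
        # Early return: runtime_params are never merged on this path.
        return dict(aggregate_config[0])
    merged: dict[str, Any] = {}
    for cfg in (*aggregate_config, runtime_params):
        merged.update(cfg)
    return merged


def merge_without_shortcut(
    aggregate_config: list[dict[str, Any]], runtime_params: dict[str, Any]
) -> dict[str, Any]:
    """Same flow with the shortcut removed: every path merges runtime params."""
    if not aggregate_config:
        return {}
    merged: dict[str, Any] = {}
    for cfg in (*aggregate_config, runtime_params):
        merged.update(cfg)
    return merged


single_file = [{"test_env": "base"}]
runtime = {"test_env": "dummy"}
buggy = merge_with_shortcut(single_file, runtime)
fixed = merge_without_shortcut(single_file, runtime)
```

With one parameters file, `buggy` keeps the file's value while `fixed` picks up the runtime override, which is the difference the reviewers asked to have checked.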

+        if key == "parameters":
+            # Merge with runtime parameters only for "parameters"
+            return OmegaConf.to_container(
+                OmegaConf.merge(*aggregate_config, self.runtime_params), resolve=True
+            )
+        return OmegaConf.to_container(OmegaConf.merge(*aggregate_config), resolve=True)

     def _is_valid_config_path(self, path):
         """Check if given path is a file path and file type is yaml or json."""
2 changes: 1 addition & 1 deletion kedro/framework/session/session.py
@@ -201,7 +201,7 @@ def create( # pylint: disable=too-many-arguments
     def _get_logging_config(self) -> dict[str, Any]:
         logging_config = self._get_config_loader()["logging"]
         if isinstance(logging_config, omegaconf.DictConfig):
-            logging_config = OmegaConf.to_container(logging_config)
+            logging_config = OmegaConf.to_container(logging_config)  # pragma: no cover
         # turn relative paths in logging config into absolute path
         # before initialising loggers
         logging_config = _convert_paths_to_absolute_posix(
64 changes: 60 additions & 4 deletions tests/config/test_omegaconf_config.py
@@ -70,14 +70,30 @@ def local_config(tmp_path):

 @pytest.fixture
 def create_config_dir(tmp_path, base_config, local_config):
-    proj_catalog = tmp_path / _BASE_ENV / "catalog.yml"
+    base_catalog = tmp_path / _BASE_ENV / "catalog.yml"
+    base_logging = tmp_path / _BASE_ENV / "logging.yml"
+    base_spark = tmp_path / _BASE_ENV / "spark.yml"
+    base_catalog = tmp_path / _BASE_ENV / "catalog.yml"

     local_catalog = tmp_path / _DEFAULT_RUN_ENV / "catalog.yml"

     parameters = tmp_path / _BASE_ENV / "parameters.json"
-    project_parameters = {"param1": 1, "param2": 2}
+    base_parameters = {"param1": 1, "param2": 2, "interpolated_param": "${test_env}"}
+    base_global_parameters = {"test_env": "base"}
+    local_global_parameters = {"test_env": "local"}

-    _write_yaml(proj_catalog, base_config)
+    _write_yaml(base_catalog, base_config)
     _write_yaml(local_catalog, local_config)
-    _write_json(parameters, project_parameters)

+    # Empty Config
+    _write_yaml(base_logging, {"version": 1})
+    _write_yaml(base_spark, {"dummy": 1})

+    _write_json(parameters, base_parameters)
+    _write_json(tmp_path / _BASE_ENV / "parameters_global.json", base_global_parameters)
+    _write_json(
+        tmp_path / _DEFAULT_RUN_ENV / "parameters_global.json", local_global_parameters
+    )


 @pytest.fixture
@@ -531,3 +547,43 @@ def zipdir(path, ziph):
         conf = OmegaConfigLoader(conf_source=f"{tmp_path}/Python.zip")
         catalog = conf["catalog"]
         assert catalog["trains"]["type"] == "MemoryDataSet"
+
+    @use_config_dir
+    def test_variable_interpolation_with_correct_env(self, tmp_path):
+        """Make sure the parameters is interpolated with the correct environment"""
+        conf = OmegaConfigLoader(str(tmp_path))
+        params = conf["parameters"]
+        # Making sure it is not override by local/parameters_global.yml
+        assert params["interpolated_param"] == "base"
+
+    @use_config_dir
+    def test_runtime_params_override_interpolated_value(self, tmp_path):
+        """Make sure interpolated value is updated correctly with runtime_params"""
+        conf = OmegaConfigLoader(str(tmp_path), runtime_params={"test_env": "dummy"})
+        params = conf["parameters"]
+        assert params["interpolated_param"] == "dummy"
+
+    @use_config_dir
+    @use_credentials_env_variable_yml
+    def test_runtime_params_not_propogate_non_parameters_config(self, tmp_path):
+        """Make sure `catalog`, `credentials`, `logging` or any config other than
+        `parameters` are not updated by `runtime_params`."""
+        # https://github.com/kedro-org/kedro/pull/2467
+        key = "test_env"
+        runtime_params = {key: "dummy"}
+        conf = OmegaConfigLoader(
+            str(tmp_path),
+            config_patterns={"spark": ["spark*", "spark*/**", "**/spark*"]},
+            runtime_params=runtime_params,
+        )
+        parameters = conf["parameters"]
+        catalog = conf["catalog"]
+        credentials = conf["credentials"]
+        logging = conf["logging"]
+        spark = conf["spark"]
+
+        assert key in parameters
+        assert key not in catalog
+        assert key not in credentials
+        assert key not in logging
+        assert key not in spark
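
The behaviour these tests pin down (merge first, apply runtime parameters only to `parameters`, then resolve interpolation) can be sketched without Kedro or OmegaConf. The `resolve` and `load` helpers below are hypothetical and use a toy `${name}` resolver over plain dicts. They only approximate the idea and do not reproduce OmegaConf's interpolation rules.

```python
from __future__ import annotations

import re
from typing import Any

# Toy pattern: a value that is exactly "${name}" refers to a top-level key.
_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def resolve(config: dict[str, Any]) -> dict[str, Any]:
    """Replace whole-string ``${name}`` values with the top-level value of ``name``."""
    resolved: dict[str, Any] = {}
    for name, value in config.items():
        match = _PLACEHOLDER.fullmatch(value) if isinstance(value, str) else None
        resolved[name] = config[match.group(1)] if match else value
    return resolved


def load(
    key: str,
    files: list[dict[str, Any]],
    runtime_params: dict[str, Any] | None = None,
) -> dict[str, Any]:
    """Merge file contents, add runtime params for "parameters" only, then resolve."""
    merged: dict[str, Any] = {}
    for cfg in files:
        merged.update(cfg)
    if key == "parameters":
        # Overrides must land before resolution so interpolated values see them.
        merged.update(runtime_params or {})
    return resolve(merged)


base_files = [
    {"param1": 1, "param2": 2, "interpolated_param": "${test_env}"},
    {"test_env": "base"},
]
default_params = load("parameters", base_files)
overridden_params = load("parameters", base_files, {"test_env": "dummy"})
spark = load("spark", [{"dummy": 1}], {"test_env": "dummy"})
```

The ordering is the design point: because the override is merged before resolution, `interpolated_param` follows the runtime value, while non-`parameters` keys such as `spark` never see `test_env`.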