Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OmegaConfigLoader returns Dict instead of DictConfig, resolves runtime_params properly #2467

Merged
merged 25 commits into from
May 11, 2023

Conversation

noklam
Copy link
Contributor

@noklam noklam commented Mar 24, 2023

NOTE: Kedro datasets are moving from kedro.extras.datasets to a separate kedro-datasets package in
kedro-plugins repository. Any changes to the dataset implementations
should be done by opening a pull request in that repository.

Description

Close #2458, #2466

This PR changes a lot how OmegaConfigLoader behavior although not many lines of code are changed.

return dict(OmegaConf.merge(*aggregate_config))

The main changes are the resolving order and what's return by the config_loader

  1. Lazily evalution -> Eager resolution as Dictionary

  2. The merge with runtime_params is pushed to OmegaConfigLoader instead of KedroContext, this ensure that CLI params are updated properly

  3. will ensure that the OmegaConfigLoader comply with AbstractConfigLoader and 2. will ensure that overriding parameters with CLI work as expected.

Note: This will only make parameters work, more work needs to be done to generalise it to all configurations. #2531

Development notes

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
@noklam
Copy link
Contributor Author

noklam commented Mar 24, 2023

Need some more time to study, but for answering the 1st question. I don't think this is the correct way to do it, but currently, cross-env interpolation is blocked because of the dict cast (see more in the description) @MatthiasRoels

image

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
@deepyaman deepyaman changed the title [Draft] - investigating isseus with OmegaConf [Draft] - investigating issues with OmegaConf Mar 29, 2023
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
@noklam
Copy link
Contributor Author

noklam commented Mar 29, 2023

Background

#2458
@merelcht @AntonyMilneQB

The root cause of the issue has to do with the resolution order with OmegaConf, this is roughly what happen behind the scene:

  1. config_loader["parameter"] - it needs to return a Dictionary
  2. load_and_merge_dir_config (OmegaConfigLoader) - this is where the merging happen, ideally we need to use `OmegaConf.to_container(..., resolve=True) here
  3. The result of 2 will merge context.extra_params recursively

In fact, it's not so different with TemplatedConfigLoader, the resolution order is the same:

  1. Return a resolved dictionary - merge global dict with the Jinja template value
  2. Result of 1 further merge with context.extra_params recursively

So it won't work too, the only thing works is a CustomTemplatedConfigLoader, see #1782, #1527 is something that we never solve.

  1. Inject extra_params into self._config_mapping early
  2. return resolved dictionary
  3. merge 2 with context.extra_params

Can we use the same tricks for OmegaConfigLoader?
YES!! It works fine and we can still interpolate nested value (which isn't possible before). This will also works for catalog.yml, since it's generic to any config. The follow-up question will be, why do we need the nested update in context.params?

More details on how Current OmegaConfigLoader behave

When user do kedro run --params global:val, it will update the context.extra_params. In order to make the extra_params update all the reference, we need to keep it as a OmegaConf.DictConfig, because resolver has to be the last step. It also create the leaky abstraction problem because the behavior is not supposed to be ConfigLoader specific.

There are 2 options for step 2

A) Convert it to a Dictionary using OmegaConf.to_container(..., resolve=True)
B) Keep it as a OmegaConf.DictConfig

Option A seem to be the right thing to do, as it is consistent with what AbstractConfigLoader requires. However, overriding with params= will not work anymore because all the reference to ${global} is gone once the resolve=True happen.

Option B will make this works, but it creates other problems too. It is a leaky abstraction, as a result it only works with parameters, and you need special treatment to other things (e.g. logging). It also create a problem that parameter no longer return a Dict but a DictConfig

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
@noklam
Copy link
Contributor Author

noklam commented Mar 29, 2023

5b0bff8 Would solve all these problem

@merelcht
Copy link
Member

merelcht commented Apr 3, 2023

5b0bff8 Would solve all these problem

Just to clarify something. In the earlier comment you mentioned:

There are 2 options for step 2
A) Convert it to a Dictionary using OmegaConf.to_container(..., resolve=True)
B) Keep it as a OmegaConf.DictConfig

Option A seem to be the right thing to do, as it is consistent with what AbstractConfigLoader requires. However, overriding with params= will not work anymore because all the reference to ${global} is gone once the resolve=True happen.

So I'm a bit confused as to 5b0bff8 solving all problems. Will it or will it not allow overriding with params= ?

@antonymilne
Copy link
Contributor

antonymilne commented Apr 3, 2023

Great work @noklam! I had reached similar conclusions to you also. In my mind there are just two different problems here:

  1. whether and how to resolve the omegaconf DictConfig on this line
  2. how to handle extra_params

On 1: both me (#1965 (comment)) and @MatthiasRoels (#2458) independently reached the same conclusion here: the current code looks wrong and the correct, non-leaky thing to do here is your option A above where you do to_container(resolve=True). I imagine the reason this wasn't done originally in #2176 is just that @merelcht didn't spot that you could set resolve=True in the to_container call? I'm sure this would solve the 3rd problem you list as "potentially solved" in your above comment also.

Now the 2nd problem of extra_params is a bit trickier. Here I previously thought the "correct" flow is the one that I suggested in #1965 (comment). This is somewhat unsatisfactory for two reasons though:

  1. as I said there, it feels a bit circular/redundant to convert these objects to dicts, then back into dictconfig and then back into dict again. But in order to achieve something that's non-leaky I think that's what we need to do
  2. something I didn't realise at the time I wrote that but you noticed with your global example: if extra_params are injected after to_container(resolve=True) then it's too late. As you also observed with TemplatedConfigLoader this actually isn't a new problem, but it would be nice if it did work

So what is the ideal solution here? I think exactly what you propose: since we already have runtime_params available in the configloader, let's merge them in before the call to to_container(resolve=True). I think there are a few potential objections here, but IMO none should stop us from making this change:

  1. what is the point of the update in context.params? The original intention for this would have been for the interactive workflow where context.params can be changed on the fly without needing to do %reload_kedro. But as we've discussed before, this functionality is basically unknown and pointless, and I'd be very happy to remove it and just resolve everything eagerly instead
  2. it means that --params can be used to inject variables into any config file, not just parameters. e.g. you could inject variables into your catalog through --params. This goes against the original intention of --params (that they are just for overriding parameters.yml) but is functionality that is sometimes called for (as in How to pass extra params to config loader since registration moved to settings.py in 0.18.x #1527) and so in my mind this is actually an improvement. If we really wanted to limit this functionality then we could put some conditional into the config loader itself so that the merging is only done when parameters are called for (rather than catalog etc.), but I don't like this
  3. if and when we move the merging behaviour out of context.params entirely then any other configloaders would also need to perform the merging behaviour inside the config loader. This would include the existing ConfigLoader and TemplatedConfigLoader if they remain. But I think in future OmegaConfLoader should be the only one shipped with kedro, so then there's no need to worry about this. Anyone writing their own custom config loader would need to add it into their implementation though

Overall I think there's possibly a current objection (maybe @idanov would say this?) that the config loader should not be responsible for merging in runtime parameters. Or maybe --params wouldn't be the right name for the CLI flag any more if it's more general than that. But personally I think this would overall be an improvement and I don't see any better solution anyway.

So here's what I think the ideal final state would look like, without worrying right now about how to implement it or what's a breaking change etc.

  • --params are processed with _split_params that does OmegaConf.from_dotlist and then to_container(resolve=False) to turn into a runtime_params dictionary (not dictconfig)
  • pass runtime_params into configloader as you do here
  • do to_container(resolve=True) inside omegaconfig loader including runtime_params as you do here
  • ship just omegaconfigloader with kedro
  • remove the merge from context.params entirely
  • consider changing the name of --params if it's going to be used for more than just parameters

@@ -275,7 +277,9 @@ def load_and_merge_dir_config(
return {}
if len(aggregate_config) == 1:
return list(aggregate_config)[0]
return dict(OmegaConf.merge(*aggregate_config))
return OmegaConf.to_container(
OmegaConf.merge(*aggregate_config, self.runtime_params), resolve=True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this definitely work when self.runtime_params is a dictionary and not a DictConfig? i.e. is it ok to not do OmegaConf.create(runtime_params) here? Looking through omegaconf source I'm pretty sure this is fine (only the first argument of OmegaConf.merge needs to be DictConfig) but please do check this!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked with the source code, merge can take any primitive type I think it's fine here.

    def merge(
        *configs: Union[
            DictConfig,
            ListConfig,
            Dict[DictKeyType, Any],
            List[Any],
            Tuple[Any, ...],
            Any,
        ]

Comment on lines 132 to 133
if not self.runtime_params:
self.runtime_params = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably nicer to do runtime_params=runtime_params or {} above instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better AbstractConfigLoader should actually be the class responsible for this defaulting behaviour, i.e. it should do self.runtime_params = runtime_params or {}.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agree with this.

@noklam
Copy link
Contributor Author

noklam commented Apr 3, 2023

@AntonyMilneQB Maybe slightly off topic. #2481 but in my mind this is the more correct way to do in long term, but should be discussed separately as I think we need a temporary solution.

@antonymilne
Copy link
Contributor

Related to all this, we also need to think about this bit of code:

        if len(aggregate_config) == 1:
            return list(aggregate_config)[0]

As it stands if we do 5b0bff8 then it will still return an unresolved DictConfig if there's only one file in aggregate_config. I don't actually know what this special len == 1 case is for - the code should work even when len(aggregate_config) == 1 without any special treatment? @merelcht @noklam

# Works fine, so no need for `len(aggregate_config) == 1` case
OmegaConf.merge(*[OmegaConf.create({1: 2})])

# Doesn't work fine, hence need for `if not aggregate_config` case
OmegaConf.merge(*[])

@merelcht
Copy link
Member

merelcht commented Apr 5, 2023

I don't exactly remember right now why I did this

if len(aggregate_config) == 1:
return list(aggregate_config)[0]

but there was definitely a good reason. There was something about just 1 config file not working the same as multiple.

@noklam
Copy link
Contributor Author

noklam commented Apr 5, 2023

@AntonyMilneQB I had the same comment before, I haven't really tested it so I am not sure here.

@noklam noklam changed the title [Draft] - investigating issues with OmegaConf OmegaConfigLoader returns Dict instead of DictConfig, resolves runtime_params properly Apr 19, 2023
@noklam noklam self-assigned this May 2, 2023
noklam added 5 commits May 2, 2023 13:58
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
noklam added 2 commits May 2, 2023 16:30
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
@noklam noklam marked this pull request as ready for review May 3, 2023 11:06
@noklam noklam requested a review from idanov as a code owner May 3, 2023 11:06
@noklam noklam requested review from antonymilne, merelcht and ankatiyar and removed request for antonymilne and idanov May 3, 2023 11:06
Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for getting to the bottom of this ⭐

One minor comment: in the tests it might be better to not refer to "global" but instead just a random parameter name, like test_split or whatever. Now it seems like this functionality is specifically for "globals" like what we have in TemplatedConfigLoader, but it's actually different.

RELEASE.md Outdated
@@ -11,6 +11,9 @@
# Upcoming Release 0.18.9

## Major features and improvements
* `OmegaConfigLoader` will return a `dict` instead of `DictConfig`.
* `kedro run --params` now update interpolated parameters correctly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's also just for the OmegaConfigLoader right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct, I will make changes to this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I change templated_param -> interpolated_param to align with the OmegaConf terminology.

I have changed the global name, but I think we need to come up with a better name since we are calling it global with OmegaConfigLoader, i.e.parameters_global.yml is what suggested in the doc.

kedro/config/omegaconf_config.py Outdated Show resolved Hide resolved

@use_config_dir
def test_runtime_params_not_propogate_non_parameters_config(self, tmp_path):
"""Make sure `catalog`, `credetials` won't get updated by runtime_params"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Make sure `catalog`, `credetials` won't get updated by runtime_params"""
"""Make sure `catalog`, `credentials` won't get updated by runtime_params"""

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Copy link
Contributor

@antonymilne antonymilne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nearly there but I think there's a bug with it unless I'm missing something: what happens if you have a single parameters.yml file and want to use runtime_params?

As part of a separate PR I think we should revert #2378 (but keep test_no_DictConfig_in_store as it's a good regression test) and #2176.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'd put all this as bug fix rather than major improvement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me if this is part of the official support feature. My reasoning go as below:

We never fixed the TemplatedConfigLoader as we promised we need some more thinking for 0.19. If this is considered a bug fix, I don't see why we can't fix this for TemplatedConfigLoader in 0.18.x and requires users to sub-class it instead.

In fact, TemplatedConfigLoader doesn't suffer from the residual problem so it can be safely used for all configurations. There is nothing wrong with merging the runtime_params as this is exactly what we are doing now.

#1782 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a fair comment I think for the "interpolated parameters" point, but I still think the dict vs. DictConfig point should be under bug fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@antonymilne I agree with this and I will move it to bug fix.

RELEASE.md Outdated Show resolved Hide resolved
@@ -275,7 +277,13 @@ def load_and_merge_dir_config(
return {}
if len(aggregate_config) == 1:
return list(aggregate_config)[0]
return dict(OmegaConf.merge(*aggregate_config))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the flow here is a little unclear (too many different return cases) and, more importantly, unless I'm missing something there's actually a bug: if you have a single element in aggregate_config (i.e. one parameters file) then runtime_params don't get merged in.

This would be solved if we removed lines 276-279. I know @merelcht said there was some reason to pick out the len == 1 case before but I don't see what this could be now that we always want to do to_container(resolve=True).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be keen on removing it, unless we have a specific failing case. I try remove line 278-279 and all test case passed.

Should we separate this into another issue? Removing Line 276-277 will cause a few tests fail, I suspect it's also link to #2556 and some refactoring may be needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy for those lines to be removed if everything is still working fine. I don't remember why I added it, so really should have left a comment 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove line 278-279 here since as I understand it at the moment they are not just unnecessary but following this change will actually be causing a bug (would be good if you could check this): if you just have a single parameters.yml file, the runtime_params don't get merged in.

Lines 276-277 don't actually cause a bug I think, so if removing them makes some tests fail and it seems related to #2556 then let's leave those here for now and try to remove them in that PR.

Co-authored-by: Antony Milne <49395058+antonymilne@users.noreply.github.com>
@noklam
Copy link
Contributor Author

noklam commented May 4, 2023

As part of a separate PR I think we should revert #2378 (but keep test_no_DictConfig_in_store as it's a good regression test) and #2176.

Agree with this I can open an issue, it will be redundant.

Copy link
Contributor

@antonymilne antonymilne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing it! Do take a look at my most recent comments but basically I think this is good to go 🙂 Would be nice to follow up with #2556 and the reversions I mentioned soon though just to get it all cleared up.

noklam added 4 commits May 5, 2023 15:30
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
@noklam noklam enabled auto-merge (squash) May 10, 2023 17:13
noklam added 4 commits May 10, 2023 23:44
…-omegaconfig

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
@noklam noklam merged commit 0b5fc97 into main May 11, 2023
@noklam noklam deleted the fix-omegaconfig branch May 11, 2023 11:27
noklam added a commit that referenced this pull request Jun 7, 2023
* Remove unnecessary files and subheaders in documentation (#2563)

Tidying up.

* Leverage PEP-585 (#2540)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com>

* [READY] Bring deployment docs up-to-date and add new pages for additional targets (#2557)

Ready at last. Still some conversation to be had about how we reflect "confidence" in the plugins we describe (what version they were last tested against) but the text is ready and better than what's there currently, so let's go!

* Remove mentions to `.egg` and stop producing them from `kedro package` (#2568)

* Stop producing `.egg` files from `kedro package`

Fix gh-2273.

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Upgrade import-linter

Fix gh-2570.

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Partially remove reliance on setup.py for micropackaging

Incomplete effort, see gh-2414.

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Revert "Partially remove reliance on setup.py for micropackaging"

This reverts commit fd3efa8.

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

---------

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* OmegaConfigLoader returns `Dict` instead of `DictConfig`, resolves runtime_params properly (#2467)

* Fix typehint

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* test push

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* POC of fix to solve the runtime param resolution problem

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Fix OmegaConfigLoadaer - resolve runtime_params early

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Delegate the intialization of runtime_params to AbstractConfigLoader

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Add test for interpolated value and globals

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* add more test and linting

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* refactor

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* update release note

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Apply comments and refactor the test

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Update RELEASE.md

Co-authored-by: Antony Milne <49395058+antonymilne@users.noreply.github.com>

---------

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Co-authored-by: Antony Milne <49395058+antonymilne@users.noreply.github.com>

* Revert special logic handling DictConfig (#2582)

* Partially revert #2378 as OCL no longer returning DictConfig

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Revert #2176 with special logging handling

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* remove unused import

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

---------

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Re-title and modify introduction to databricks testing guide in contribution section (#2579)

* modify databricks testing guide for contributors

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/contribution/development_for_databricks.md

Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com>

* Update development_for_databricks.md

---------

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com>

* Quick fix .gitpod.yml (#2585)

* `OmegaConfigLoader` should not show `MissingConfigException` when the file is empty (#2584)

* Allow empty config files

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Remove condition for empty config

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Revert "Remove condition for empty config"

This reverts commit 7961155.

* Update release notes

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Add unit test

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Proposal of updated settings.py (#2587)

* Proposal of updated settings.py

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Update settings after comment

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

---------

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* Collaborative Experiment Tracking Docs  (#2589)

* Add metadata attribute to `kedro.io` datasets (#2537)

* Add metadata attribute to kedro.io datasets

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Update specs in pipeline hooks docstrings (#2598)

* Update pipeline hooks docstrings

Signed-off-by: Tomas Van Pottelbergh <github@tomasvanpottelbergh.com>

* Acknowledge community contribution

---------

Signed-off-by: Tomas Van Pottelbergh <github@tomasvanpottelbergh.com>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Miscellaneous hooks improvements (#2596)

* Add debug level logging for hook execution order

Fix gh-1942.

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Clarify role of setuptools in hooks

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Remove mentions to outdated and unused `pkg_resources`

Fix gh-2391.

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Adapt example hook to spec signature

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Grammar and style fixes in docs

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Revert "Adapt example hook to spec signature"

This reverts commit ac1b495.

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

---------

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Minor modification to titles of deployment docs to replace "deprecated" with "legacy" (#2606)

* FIX: Typo in markdown for kedro notebooks docs (#2610)

* FIX: Typo in markdown for kedro notebooks docs

Signed-off-by: debugger24 <rahulcomp24@gmail.com>

* Acknowledge community contribution

---------

Signed-off-by: debugger24 <rahulcomp24@gmail.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add the Kedro RichHandler (#2592)

* Add the Kedro RichHandler

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Clean up imports and comments

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Map arguments to add flexibility to customize arguments available in RichHandler

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Clean up rich_logger.py and unused imports

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Add docs - first iteration

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Fix imports and small refactoring

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Improve the doc

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Update the project's template

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* format docstring

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Apply Jo's comment on the doc

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Fix the link with RichHandler

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Apply suggestions from code review

Co-authored-by: Antony Milne <49395058+antonymilne@users.noreply.github.com>

* Fix broken doc

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* update template defaults

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* refactoring RichHandler and improve doc

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Fix broken test due to new default RichHandler

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Fix doc

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Move kedro.extras.logging.RichHandler -> kedro.logging.RichHandler

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* add kedro/logging.py and remove the unnecessary import that cause namespace collision

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Refactor test

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Add more tests

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Fix lots of import, docstring

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Reorder docstring

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Fix tests

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Link options to doc

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Clean up doc and unused logger

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Fix spelling `customise`

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Refactor test with fixture

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Remove unnecessary copy() because it is now a fixture

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Split up test_rich_traceback_configuration() to smaller tests

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Add RELEASE.md

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Update doc

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Fix template logging

* Mention awesome-kedro in docs (#2620)

* Mention awesome-kedro in docs

See kedro-org/kedro-devrel#40

* Update CONTRIBUTING.md

* Spelling and style fixes

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

---------

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update SnowparkTableDataSet docs (#2485)

* Update SnowparkTableDataSet docs

* Add missing snowpark dependencies for docs building

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

---------

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Release 0.18.9 (#2619)

* Update __init__.py

* Update release note

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* Update Version Numbers across docs

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* Update RELEASE.md

* Update Release Note

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

---------

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Add documentation for deploying packaged Kedro projects on Databricks (#2595)

* Add deployment workflow page

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add table of contents, entry point guide, data and conf upload guide

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add detailed instructions for creating a job on Databricks

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add images and automated deployment resources

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Remove use of 'allows', add summary

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Remove link to missing image

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add deployment workflow to toctree

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Lint and fix missing link

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Minor style, syntax and grammar improvements

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Fixes for correctness during validation

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add instructions for creating log output location

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Lint

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Lint databricks_run

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Minor wording change in reference to logs

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Modify reference to Pyspark-Iris

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Fix linter errors to enable docs build for inspection

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update build-docs.sh

* Fix broken link

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Remove spurious word

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Add advantages subheading

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Update docs/source/integrations/databricks_deployment_workflow.md

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Add alternative ways to upload data to DBFS

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Move note on unpackaged config and data

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Fix broken links

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Move databricks back into deployment section

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Remove references to PySpark Iris (pyspark-iris) starter

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Graphics links fixes, revise titles

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Fix broken internal link

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Fix links broken by new folder

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Remove logs directory

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Update image of final job configuration

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add full stops in list.

Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Fix conda environment name.

Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Modify wording and image for creating a new job cluster

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Modify wording in guide to create new job cluster

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Remove --upgrade option

Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add both ways of creating a new job

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

---------

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Update docs font family to Inter and swap the logo (#2611)

* Update docs font family to Inter

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Match weights to previous Google font

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Update logo images; revert css formatting for easier PR review

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Update text styles

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Add trailing whitespace (?)

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Update Kedro image path

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Update font size

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* update font to font-family: 'Inter', sans-serif;

Signed-off-by: huongg <huongg1409@gmail.com>

* testing with important for font-family

Signed-off-by: huongg <huongg1409@gmail.com>

* Make project name lowercase in docs

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* revert back to kedro/develop

Signed-off-by: huongg <huongg1409@gmail.com>

---------

Signed-off-by: Tynan DeBold <thdebold@gmail.com>
Signed-off-by: huongg <huongg1409@gmail.com>
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Co-authored-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: huongg <huongg1409@gmail.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Enable variable interpolation for the catalog with `OmegaConfigLoader` (#2621)

* Enable variable interpolation in catalog by escaping _

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Add archive link on docs home page (#2636)

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* add ravi-kumar-pilla to tsc (#2629)

Signed-off-by: ravi-kumar-pilla <ravi_kumar_pilla@mckinsey.com>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Add file for search console verification (#2631)

* add verification file and change to conf

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Fix conf.py so google file is moved to root of docs build

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

---------

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Chore/update assets for docs (#2625)

* Update docs font family to Inter

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Match weights to previous Google font

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Update logo images; revert css formatting for easier PR review

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Update text styles

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Add trailing whitespace (?)

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Update Kedro image path

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Update font size

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* update font to font-family: 'Inter', sans-serif;

Signed-off-by: huongg <huongg1409@gmail.com>

* testing with important for font-family

Signed-off-by: huongg <huongg1409@gmail.com>

* update assets for experiment tracking, docs, pipeline, focus mode, matplotlib and plotly

* update modular pipelines section

Signed-off-by: huongg <huongg1409@gmail.com>

* autoreload

Signed-off-by: huongg <huongg1409@gmail.com>

* reupload autoreload gif

Signed-off-by: huongg <huongg1409@gmail.com>

* revert back to develop branch

Signed-off-by: huongg <huongg1409@gmail.com>

* gif for ET comparison

Signed-off-by: huongg <huongg1409@gmail.com>

* remove jupyter notebook screenshot since its not used anywhere

Signed-off-by: huongg <huongg1409@gmail.com>

* Update databricks viz demo

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* pipeline_show_metrics gif

Signed-off-by: huongg <huongg1409@gmail.com>

* update different gif format

Signed-off-by: huongg <huongg1409@gmail.com>

* point to main instead of develop for kedro-banner

Signed-off-by: huongg <huongg1409@gmail.com>

---------

Signed-off-by: Tynan DeBold <thdebold@gmail.com>
Signed-off-by: huongg <huongg1409@gmail.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Co-authored-by: Tynan DeBold <thdebold@gmail.com>
Co-authored-by: ravi-kumar-pilla <ravi_kumar_pilla@mckinsey.com>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Co-authored-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Fix the level heading for collab exp tracking so it shows in sidebar (#2647)

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Update Kedro icons in docs with for rebrand (#2652)

Signed-off-by: Tynan DeBold <thdebold@gmail.com>

* Modify logos (#2651)

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Fix lint

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* remove the subsection of kedro.extras.extension

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* Fix lint

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* remove the subsection of kedro.extras.extension

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

---------

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Tomas Van Pottelbergh <github@tomasvanpottelbergh.com>
Signed-off-by: debugger24 <rahulcomp24@gmail.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Tynan DeBold <thdebold@gmail.com>
Signed-off-by: huongg <huongg1409@gmail.com>
Signed-off-by: ravi-kumar-pilla <ravi_kumar_pilla@mckinsey.com>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Co-authored-by: Antony Milne <49395058+antonymilne@users.noreply.github.com>
Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Co-authored-by: rashidakanchwala <37628668+rashidakanchwala@users.noreply.github.com>
Co-authored-by: Ahdra Merali <90615669+AhdraMeraliQB@users.noreply.github.com>
Co-authored-by: Tomas Van Pottelbergh <85937396+tomasvanpottelbergh@users.noreply.github.com>
Co-authored-by: Rahul Kumar <rahulcomp24@gmail.com>
Co-authored-by: Tynan DeBold <thdebold@gmail.com>
Co-authored-by: Huong Nguyen <32060364+Huongg@users.noreply.github.com>
Co-authored-by: huongg <huongg1409@gmail.com>
Co-authored-by: Ravi Kumar Pilla <ravi_kumar_pilla@mckinsey.com>
Co-authored-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Co-authored-by: Merel Theisen <merel.theisen@quantumblack.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OmegaConfigLoader parameters does not propogate properly
4 participants