Delete kedro.extras.datasets and related tests #3044

Closed
wants to merge 4 commits into from
51 changes: 30 additions & 21 deletions docs/source/configuration/advanced_configuration.md
@@ -4,8 +4,38 @@ The documentation on [configuration](./configuration_basics.md) describes how to
By default, Kedro is set up to use the [ConfigLoader](/kedro.config.ConfigLoader) class. Kedro also provides two additional configuration loaders with more advanced functionality: the [TemplatedConfigLoader](/kedro.config.TemplatedConfigLoader) and the [OmegaConfigLoader](/kedro.config.OmegaConfigLoader).
Each of these classes is an alternative to the default `ConfigLoader` and has different features. The following sections describe each of these classes and their specific functionality in more detail.

## OmegaConfigLoader

⚠️ [vale] reported by reviewdog 🐶
[Kedro.headings] 'OmegaConfigLoader' should use sentence-style capitalization.


[OmegaConf](https://omegaconf.readthedocs.io/) is a Python library designed to handle and manage settings. It serves as a YAML-based hierarchical system to organise configurations, which can be structured to accommodate various sources, allowing you to merge settings from multiple locations.

⚠️ [vale] reported by reviewdog 🐶
[Kedro.toowordy] 'multiple' is too wordy
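
The idea of merging hierarchical settings from more than one source, as described for OmegaConf above, can be sketched in plain Python. This is only an illustration of the concept and not OmegaConf's own implementation (the library exposes `OmegaConf.merge` for this purpose). The function name `merge_settings` and the sample values are hypothetical.

```python
from copy import deepcopy


def merge_settings(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base`.

    Hypothetical helper: on conflicting keys the value from `override` wins,
    and nested dictionaries are merged key by key.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical settings coming from two different sources.
base = {"model": {"learning_rate": 0.01, "epochs": 10}, "seed": 42}
local = {"model": {"epochs": 50}}

merged = merge_settings(base, local)
print(merged)
```

Only `model.epochs` is overridden here; the untouched keys from `base` are carried over, and `base` itself is left unmodified.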


From Kedro 0.18.5 you can use the [`OmegaConfigLoader`](/kedro.config.OmegaConfigLoader) which uses `OmegaConf` to load data.

```{note}
`OmegaConfigLoader` is under active development. It is available from Kedro version 0.18.5 with additional features due in later releases. Let us know if you have any feedback about the `OmegaConfigLoader` by joining the [Kedro community on Slack](https://slack.kedro.org/).
```

`OmegaConfigLoader` can load `YAML` and `JSON` files. Acceptable file extensions are `.yml`, `.yaml`, and `.json`. By default, any configuration files used by the config loaders in Kedro are `.yml` files.

To use `OmegaConfigLoader` in your project, set the `CONFIG_LOADER_CLASS` constant in your [`src/<package_name>/settings.py`](../kedro_project_setup/settings.md):

```python
from kedro.config import OmegaConfigLoader # new import

CONFIG_LOADER_CLASS = OmegaConfigLoader
```
### Advanced `OmegaConfigLoader` features
Some advanced use cases of `OmegaConfigLoader` are listed below:
- [How to do templating with the `OmegaConfigLoader`](#how-to-do-templating-with-the-omegaconfigloader)
- [How to use global variables with the `OmegaConfigLoader`](#how-to-use-global-variables-with-the-omegaconfigloader)
- [How to use resolvers in the `OmegaConfigLoader`](#how-to-use-resolvers-in-the-omegaconfigloader)
- [How to load credentials through environment variables](#how-to-load-credentials-through-environment-variables)

## TemplatedConfigLoader

```{warning}
`ConfigLoader` and `TemplatedConfigLoader` have been deprecated since Kedro `0.18.12` and will be removed in Kedro `0.19.0`. Refer to the [migration guide for config loaders](./config_loader_migration.md) for instructions on how to update your code to use `OmegaConfigLoader`.
```

Kedro provides an extension [TemplatedConfigLoader](/kedro.config.TemplatedConfigLoader) class that allows you to template values in configuration files. To apply templating in your project, set the `CONFIG_LOADER_CLASS` constant in your [`src/<package_name>/settings.py`](../kedro_project_setup/settings.md):

```python
@@ -95,30 +125,9 @@ CONFIG_LOADER_ARGS = {
If you specify both `globals_pattern` and `globals_dict` in `CONFIG_LOADER_ARGS`, the contents of the dictionary resulting from `globals_pattern` are merged with the `globals_dict` dictionary. In case of conflicts, the keys from the `globals_dict` dictionary take precedence.


## OmegaConfigLoader

[OmegaConf](https://omegaconf.readthedocs.io/) is a Python library designed for configuration. It is a YAML-based hierarchical configuration system with support for merging configurations from multiple sources.

From Kedro 0.18.5 you can use the [`OmegaConfigLoader`](/kedro.config.OmegaConfigLoader) which uses `OmegaConf` under the hood to load data.

```{note}
`OmegaConfigLoader` is under active development. It was first available from Kedro 0.18.5 with additional features due in later releases. Let us know if you have any feedback about the `OmegaConfigLoader`.
```

`OmegaConfigLoader` can load `YAML` and `JSON` files. Acceptable file extensions are `.yml`, `.yaml`, and `.json`. By default, any configuration files used by the config loaders in Kedro are `.yml` files.

To use `OmegaConfigLoader` in your project, set the `CONFIG_LOADER_CLASS` constant in your [`src/<package_name>/settings.py`](../kedro_project_setup/settings.md):

```python
from kedro.config import OmegaConfigLoader # new import

CONFIG_LOADER_CLASS = OmegaConfigLoader
```

## Advanced Kedro configuration

This section contains a set of guidance for advanced configuration requirements of standard Kedro projects:

* [How to change which configuration files are loaded](#how-to-change-which-configuration-files-are-loaded)
* [How to ensure non default configuration files get loaded](#how-to-ensure-non-default-configuration-files-get-loaded)
* [How to bypass the configuration loading rules](#how-to-bypass-the-configuration-loading-rules)
29 changes: 21 additions & 8 deletions docs/source/configuration/configuration_basics.md
@@ -4,11 +4,24 @@ This section contains detailed information about Kedro project configuration, wh

Kedro makes use of a configuration loader to load any project configuration files, and the available configuration loader classes are:

```{warning}
`ConfigLoader` and `TemplatedConfigLoader` have been deprecated since Kedro `0.18.12` and will be removed in Kedro `0.19.0`. Refer to the [migration guide for config loaders](./config_loader_migration.md) for instructions on how to update your code base to use `OmegaConfigLoader`.
```

* [`ConfigLoader`](/kedro.config.ConfigLoader)
* [`TemplatedConfigLoader`](/kedro.config.TemplatedConfigLoader)
* [`OmegaConfigLoader`](/kedro.config.OmegaConfigLoader).

By default, Kedro uses the `ConfigLoader` and, in the following sections and examples, you can assume the default `ConfigLoader` is used, unless otherwise specified. The [advanced configuration documentation](./advanced_configuration.md) covers use of the [`TemplatedConfigLoader`](/kedro.config.TemplatedConfigLoader) and [`OmegaConfigLoader`](/kedro.config.OmegaConfigLoader) in more detail.
By default, Kedro uses the `ConfigLoader`. However, in projects created with Kedro `0.18.13` onwards, `OmegaConfigLoader` is set as the default config loader in the project's `src/<package_name>/settings.py` file.

⚠️ [vale] reported by reviewdog 🐶
[Kedro.toowordy] 'However' is too wordy

⚠️ [vale] reported by reviewdog 🐶
[Kedro.Spellings] Did you really mean 'onwards'?

You can select which config loader you want to use in your project by modifying the `src/<package_name>/settings.py` like this:
```python
from kedro.config import OmegaConfigLoader

CONFIG_LOADER_CLASS = OmegaConfigLoader
```
The following sections and examples are valid for both the `ConfigLoader` and the `OmegaConfigLoader`. The [advanced configuration documentation](./advanced_configuration.md) covers use of the [`TemplatedConfigLoader`](/kedro.config.TemplatedConfigLoader)
and the advanced use cases of the [`OmegaConfigLoader`](/kedro.config.OmegaConfigLoader) in more detail.


## Configuration source
The configuration source folder is [`conf`](../get_started/kedro_concepts.md#conf) by default. We recommend that you keep all configuration files in the default `conf` folder of a Kedro project.
@@ -35,7 +48,8 @@ Do not add any local configuration to version control.
```

## Configuration loading
Kedro-specific configuration (e.g., `DataCatalog` configuration for I/O) is loaded using a configuration loader class, by default, this is [`ConfigLoader`](/kedro.config.ConfigLoader).
Kedro-specific configuration (e.g., `DataCatalog` configuration for I/O) is loaded using a configuration loader class, by default, this is [`ConfigLoader`](/kedro.config.ConfigLoader) for

📝 [vale] reported by reviewdog 🐶
[Kedro.sentencelength] Try to keep your sentence length to 30 words or fewer.

⚠️ [vale] reported by reviewdog 🐶
[Kedro.abbreviations] Use 'for example' instead of abbreviations like 'e.g.,'.

projects created with Kedro versions older than `0.18.13`, and has been set to `OmegaConfigLoader` for projects created with Kedro `0.18.13` onwards.

⚠️ [vale] reported by reviewdog 🐶
[Kedro.Spellings] Did you really mean 'onwards'?

When you interact with Kedro through the command line, e.g. by running `kedro run`, Kedro loads all project configuration in the configuration source through this configuration loader.

The loader recursively scans for configuration files inside the `conf` folder, firstly in `conf/base` (`base` being the default environment) and then in `conf/local` (`local` being the designated overriding environment).
@@ -61,15 +75,14 @@ Configuration files will be matched according to file name and type rules. Suppo
### Configuration patterns
Under the hood, the Kedro configuration loader loads files based on regex patterns that specify the naming convention for configuration files. These patterns are specified by `config_patterns` in the configuration loader classes.

By default those patterns are set as follows for the configuration of catalog, parameters, logging, credentials, and globals:
By default, those patterns are set as follows for the configuration of catalog, parameters, logging, and credentials:

```python
config_patterns = {
    "catalog": ["catalog*", "catalog*/**", "**/catalog*"],
    "parameters": ["parameters*", "parameters*/**", "**/parameters*"],
    "credentials": ["credentials*", "credentials*/**", "**/credentials*"],
    "logging": ["logging*", "logging*/**", "**/logging*"],
    "globals": ["globals*", "globals*/**", "**/globals*"],
}
```
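
To get a feel for which file names the default patterns select, the sketch below checks a few hypothetical relative paths against the `catalog` patterns. It uses `fnmatch.fnmatchcase` from the standard library as a rough stand-in. Treating the patterns as glob-style and matching them this way is an assumption made for illustration, so Kedro's actual matching may differ in edge cases. The helper `matches_any` and the candidate paths are made up for this example.

```python
from fnmatch import fnmatchcase

# Patterns copied from the "catalog" entry of `config_patterns`.
catalog_patterns = ["catalog*", "catalog*/**", "**/catalog*"]


def matches_any(relative_path: str, patterns: list[str]) -> bool:
    """Return True if the path matches at least one glob-style pattern.

    Hypothetical helper: `fnmatchcase` lets `*` match `/` as well, which
    only approximates how a configuration loader resolves these patterns.
    """
    return any(fnmatchcase(relative_path, pattern) for pattern in patterns)


# Hypothetical paths, relative to an environment folder such as conf/base.
candidates = [
    "catalog.yml",
    "catalog_dev.yml",
    "catalog/raw.yml",
    "nested/catalog_models.yml",
    "parameters.yml",
]

matched = [path for path in candidates if matches_any(path, catalog_patterns)]
print(matched)
```

Under this approximation every candidate except `parameters.yml` is selected as catalog configuration.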

@@ -80,10 +93,10 @@ If you want to change the way configuration is loaded, you can either [customise
This section contains a set of guidance for the most common configuration requirements of standard Kedro projects:

* [How to change the setting for a configuration source folder](#how-to-change-the-setting-for-a-configuration-source-folder)
* [How to change the configuration source folder at run time](#how-to-change-the-configuration-source-folder-at-runtime)
* [How to change the configuration source folder at runtime](#how-to-change-the-configuration-source-folder-at-runtime)
* [How to read configuration from a compressed file](#how-to-read-configuration-from-a-compressed-file)
* [How to access configuration in code](#how-to-access-configuration-in-code)
* [How to specify additional configuration environments ](#how-to-specify-additional-configuration-environments)
* [How to specify additional configuration environments](#how-to-specify-additional-configuration-environments)
* [How to change the default overriding environment](#how-to-change-the-default-overriding-environment)
* [How to use only one configuration environment](#how-to-use-only-one-configuration-environment)

@@ -145,12 +158,12 @@ Note that for both the `tar.gz` and `zip` file the following structure is expect
To directly access configuration in code, for example to debug, you can do so as follows:

```python
from kedro.config import ConfigLoader
from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings

# Instantiate a config loader with the location of your project configuration.
conf_path = str(project_path / settings.CONF_SOURCE)
conf_loader = ConfigLoader(conf_source=conf_path)
conf_loader = OmegaConfigLoader(conf_source=conf_path)

# This line shows how to access the catalog configuration. You can access other configuration in the same way.
conf_catalog = conf_loader["catalog"]
9 changes: 5 additions & 4 deletions docs/source/configuration/credentials.md
@@ -7,14 +7,15 @@ Credentials configuration can be used on its own directly in code or [fed into t
If you would rather store your credentials in environment variables instead of a file, you can use the `OmegaConfigLoader` [to load credentials from environment variables](advanced_configuration.md#how-to-load-credentials-through-environment-variables) as described in the advanced configuration chapter.

## How to load credentials in code

Credentials configuration can be loaded the same way as any other project configuration using any of the configuration loader classes: `ConfigLoader`, `TemplatedConfigLoader`, and `OmegaConfigLoader`.

The following examples all use the default `ConfigLoader` class.
The following examples are valid for both the `ConfigLoader` and the `OmegaConfigLoader`.

```python
from pathlib import Path

from kedro.config import ConfigLoader
from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings

# Substitute <project_root> with the [root folder for your project](https://docs.kedro.org/en/stable/tutorial/spaceflights_tutorial.html#terminology)
@@ -30,11 +31,11 @@ Calling `conf_loader[key]` in the example above throws a `MissingConfigException
```python
from pathlib import Path

from kedro.config import ConfigLoader, MissingConfigException
from kedro.config import OmegaConfigLoader, MissingConfigException
from kedro.framework.project import settings

conf_path = str(Path(<project_root>) / settings.CONF_SOURCE)
conf_loader = ConfigLoader(conf_source=conf_path)
conf_loader = OmegaConfigLoader(conf_source=conf_path)

try:
    credentials = conf_loader["credentials"]
10 changes: 5 additions & 5 deletions docs/source/configuration/parameters.md
@@ -76,14 +76,14 @@ You can use `add_feed_dict()` to inject any other entries into your `DataCatalog

Parameters project configuration can be loaded by any of the configuration loader classes: `ConfigLoader`, `TemplatedConfigLoader`, and `OmegaConfigLoader`.

The following examples all make use of the default `ConfigLoader` class.
The following examples all make use of the `OmegaConfigLoader` class.

```python
from kedro.config import ConfigLoader
from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings

conf_path = str(project_path / settings.CONF_SOURCE)
conf_loader = ConfigLoader(conf_source=conf_path)
conf_loader = OmegaConfigLoader(conf_source=conf_path)
parameters = conf_loader["parameters"]
```

@@ -92,11 +92,11 @@ This loads configuration files from any subdirectories in `conf` that have a fil
Calling `conf_loader[key]` in the example above will throw a `MissingConfigException` error if no configuration files match the given key. But if this is a valid workflow for your application, you can handle it as follows:

```python
from kedro.config import ConfigLoader, MissingConfigException
from kedro.config import OmegaConfigLoader, MissingConfigException
from kedro.framework.project import settings

conf_path = str(project_path / settings.CONF_SOURCE)
conf_loader = ConfigLoader(conf_source=conf_path)
conf_loader = OmegaConfigLoader(conf_source=conf_path)

try:
    parameters = conf_loader["parameters"]
4 changes: 2 additions & 2 deletions docs/source/development/automated_testing.md
@@ -63,14 +63,14 @@ Now that you have a place to put your tests, you can create an example test in t

```
from pathlib import Path  # needed for Path.cwd() in the fixture below

import pytest
from kedro.config import ConfigLoader
from kedro.config import OmegaConfigLoader
from kedro.framework.context import KedroContext
from kedro.framework.hooks import _create_hook_manager


@pytest.fixture
def config_loader():
    return ConfigLoader(conf_source=str(Path.cwd()))
    return OmegaConfigLoader(conf_source=str(Path.cwd()))


@pytest.fixture
4 changes: 2 additions & 2 deletions docs/source/faq/faq.md
@@ -61,10 +61,10 @@ Refer to the following table below for a high level guide to each layer's purpos
| Folder in data | Description |
| -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Raw | Initial start of the pipeline, containing the sourced data model(s) that should never be changed, it forms your single source of truth to work from. These data models are typically un-typed in most cases e.g. csv, but this will vary from case to case |
| Intermediate | Optional data model(s), which are introduced to type your :code:`raw` data model(s), e.g. converting string based values into their current typed representation |
| Intermediate | Optional data model(s), which are introduced to type your `raw` data model(s), e.g. converting string based values into their current typed representation |

⚠️ [vale] reported by reviewdog 🐶
[Kedro.abbreviations] Use 'for example' instead of abbreviations like 'e.g.'.

| Primary | Domain specific data model(s) containing cleansed, transformed and wrangled data from either `raw` or `intermediate`, which forms your layer that you input into your feature engineering |
| Feature | Analytics specific data model(s) containing a set of features defined against the `primary` data, which are grouped by feature area of analysis and stored against a common dimension |
| Model input | Analytics specific data model(s) containing all :code:`feature` data against a common dimension and in the case of live projects against an analytics run date to ensure that you track the historical changes of the features over time |
| Model input | Analytics specific data model(s) containing all `feature` data against a common dimension and in the case of live projects against an analytics run date to ensure that you track the historical changes of the features over time |

⚠️ [vale] reported by reviewdog 🐶
[Kedro.toowordy] 'in the case of' is too wordy

| Models | Stored, serialised pre-trained machine learning models |
| Model output | Analytics specific data model(s) containing the results generated by the model based on the `model input` data |
| Reporting | Reporting data model(s) that are used to combine a set of `primary`, `feature`, `model input` and `model output` data used to drive the dashboard and the views constructed. It encapsulates and removes the need to define any blending or joining of data, improve performance and replacement of presentation layer without having to redefine the data models |
17 changes: 13 additions & 4 deletions docs/source/robots.txt
@@ -1,5 +1,14 @@
User-agent: *
Disallow: *
Allow: /en/stable
Allow: /en/latest
Allow: /en/0.18.*
Disallow: /
Allow: /en/stable/
Allow: /en/latest/
Allow: /en/0.18.5/
Allow: /en/0.18.6/
Allow: /en/0.18.7/
Allow: /en/0.18.8/
Allow: /en/0.18.9/
Allow: /en/0.18.10/
Allow: /en/0.18.11/
Allow: /en/0.18.12/
Allow: /en/0.18.13/
Allow: /en/0.17.7/