0.17.6

@idanov idanov released this 09 Dec 15:59
319a917

Release 0.17.6

Major features and improvements

  • Added pipelines global variable to IPython extension, allowing you to access the project's pipelines in kedro ipython or kedro jupyter notebook.
  • Enabled overriding nested parameters with --params in the CLI. For example, kedro run --params="model.model_tuning.booster:gbtree" updates parameters to {"model": {"model_tuning": {"booster": "gbtree"}}}.
  • Added option to pandas.SQLQueryDataSet to specify a filepath with a SQL query, in addition to the current method of supplying the query itself in the sql argument.
  • Extended ExcelDataSet to support saving Excel files with multiple sheets.
  • Added the following new datasets:
| Type | Description | Location |
| --- | --- | --- |
| plotly.JSONDataSet | Works with plotly graph object Figures (saves as a JSON file) | kedro.extras.datasets.plotly |
| pandas.GenericDataSet | Provides a "best effort" facility to read / write any format provided by the pandas library | kedro.extras.datasets.pandas |
| pandas.GBQQueryDataSet | Loads data from a Google BigQuery table using a provided SQL query | kedro.extras.datasets.pandas |
| spark.DeltaTableDataSet | Dataset designed to handle Delta Lake tables and their CRUD-style operations, including update, merge and delete | kedro.extras.datasets.spark |
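
The nested-parameter override listed above can be pictured as expanding a dotted key into nested dictionaries. The following is a minimal sketch of that expansion only, not Kedro's actual implementation; the function name expand_param_override is hypothetical, and how the result is combined with a project's existing parameters is not shown here.

```python
from typing import Any, Dict


def expand_param_override(item: str) -> Dict[str, Any]:
    """Expand "a.b.c:value" into {"a": {"b": {"c": "value"}}}.

    Hypothetical helper for illustration; not part of Kedro's API.
    """
    # Split on the first colon only, so the value may itself contain colons.
    dotted_key, value = item.split(":", 1)
    nested: Any = value
    # Wrap the value from the innermost key outwards.
    for part in reversed(dotted_key.split(".")):
        nested = {part: nested}
    return nested


override = expand_param_override("model.model_tuning.booster:gbtree")
print(override)  # {'model': {'model_tuning': {'booster': 'gbtree'}}}
```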

Bug fixes and other changes

  • Fixed an issue where kedro new --config config.yml was ignoring the config file when prompts.yml didn't exist.
  • Added documentation for kedro viz --autoreload.
  • Added support for arbitrary backends (via importable module paths) that satisfy the pickle interface to PickleDataSet.
  • Added support for sum syntax for connecting pipeline objects.
  • Upgraded pip-tools, which is used by kedro build-reqs, to 6.4. This pip-tools version requires pip>=21.2 while adding support for pip>=21.3. To upgrade pip, please refer to the pip documentation.
  • Relaxed the bounds on the plotly requirement for plotly.PlotlyDataSet and the pyarrow requirement for pandas.ParquetDataSet.
  • kedro pipeline package <pipeline> now raises an error if the <pipeline> argument doesn't look like a valid Python module path (e.g. has / instead of .).
  • Added new overwrite argument to PartitionedDataSet and MatplotlibWriter to enable deletion of existing partitions and plots on dataset save.
  • kedro pipeline pull now works when the project requirements contain entries such as -r, --extra-index-url and local wheel files (Issue #913).
  • Fixed slow startup caused by catalog processing by reducing the exponential growth of extra processing during _FrozenDatasets creation.
  • Removed .coveragerc from the Kedro project template. coverage settings are now given in pyproject.toml.
  • Fixed a bug where packaging or pulling a modular pipeline with the same name as the project's package name would throw an error (or silently pass without including the pipeline source code in the wheel file).
  • Removed unintentional dependency on git.
  • Fixed an issue where nested pipeline configuration was not included in the packaged pipeline.
  • Deprecated the "Thanks for supporting contributions" section of release notes to simplify the contribution process; Kedro 0.17.6 is the last release that includes this. This process has been replaced with the automatic GitHub feature.
  • Fixed a bug where the version on the tracking datasets didn't match the session id and the versions of regular versioned datasets.
  • Fixed an issue where datasets in load_versions that are not found in the data catalog would silently pass.
  • Altered the string representation of nodes so that node inputs/outputs order is preserved rather than being alphabetically sorted.
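
The "sum syntax" item above relies on a general Python mechanism: sum() starts from the integer 0, so a class must handle 0 + obj in __radd__ before sum() can combine its instances. The sketch below shows that mechanism on a toy class. ToyPipeline is a hypothetical stand-in that only tracks node names; it is not Kedro's Pipeline class and makes no claim about how Kedro implements this.

```python
from typing import List, Tuple


class ToyPipeline:
    """Hypothetical stand-in for a pipeline: an ordered collection of node names."""

    def __init__(self, nodes: List[str]) -> None:
        self.nodes: Tuple[str, ...] = tuple(nodes)

    def __add__(self, other: object) -> "ToyPipeline":
        if not isinstance(other, ToyPipeline):
            return NotImplemented
        # Combine nodes, dropping duplicates while keeping first-seen order.
        return ToyPipeline(list(dict.fromkeys(self.nodes + other.nodes)))

    def __radd__(self, other: object) -> "ToyPipeline":
        # sum() begins with 0, so the first step is 0 + pipeline, which lands here.
        if isinstance(other, int) and other == 0:
            return self
        return NotImplemented


preprocessing = ToyPipeline(["clean", "split"])
training = ToyPipeline(["split", "train"])

combined = sum([preprocessing, training])
print(combined.nodes)  # ('clean', 'split', 'train')
```

With real Kedro pipelines, the release note implies that sum() over a list of pipeline objects can be used in place of chaining + by hand.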

Upcoming deprecations for Kedro 0.18.0

  • kedro.extras.decorators and kedro.pipeline.decorators are being deprecated in favour of Hooks.
  • kedro.extras.transformers and kedro.io.transformers are being deprecated in favour of Hooks.
  • The --parallel flag on kedro run is being removed in favour of --runner=ParallelRunner. The -p flag will change to be an alias for --pipeline.
  • kedro.io.DataCatalogWithDefault is being deprecated, to be removed entirely in 0.18.0.
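
As an illustration of the move from decorators to Hooks, here is a minimal sketch of a Hooks class that times each node, the kind of cross-cutting behaviour a node decorator might previously have provided. The class name TimingHooks and its durations attribute are hypothetical. The hook names before_node_run and after_node_run and the hook_impl marker come from Kedro's Hooks API; the import fallback and the fake node are assumptions added only so the sketch can run without Kedro installed.

```python
import logging
import time
from types import SimpleNamespace

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Assumption for illustration: a no-op marker so the sketch runs without Kedro.
    def hook_impl(func):
        return func

logger = logging.getLogger(__name__)


class TimingHooks:
    """Hypothetical Hooks class that records how long each node takes."""

    def __init__(self) -> None:
        self._started = {}
        self.durations = {}

    @hook_impl
    def before_node_run(self, node) -> None:
        # Remember when this node started.
        self._started[node.name] = time.perf_counter()

    @hook_impl
    def after_node_run(self, node) -> None:
        # Compute and log the elapsed time for this node.
        elapsed = time.perf_counter() - self._started.pop(node.name)
        self.durations[node.name] = elapsed
        logger.info("Node %s ran in %.3f s", node.name, elapsed)


# Stand-in for a Kedro node; only its name attribute is used here.
fake_node = SimpleNamespace(name="train_model")
hooks = TimingHooks()
hooks.before_node_run(node=fake_node)
hooks.after_node_run(node=fake_node)
```

In a Kedro 0.17.x project, a class like this would typically be registered through the HOOKS tuple in the project's settings.py, for example HOOKS = (TimingHooks(),).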

Thanks for supporting contributions

Deepyaman Datta,
Brites,
Manish Swami,
Avaneesh Yembadi,
Zain Patel,
Simon Brugman,
Kiyo Kunii,
Benjamin Levy,
Louis de Charsonville,
Simon Picard