feat(datasets): Add rioxarray and RasterDataset (#355)
* refactor(datasets): deprecate "DataSet" type names (#328) * refactor(datasets): deprecate "DataSet" type names (api) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (biosequence) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (dask) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (databricks) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (email) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (geopandas) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (holoviews) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (json) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (matplotlib) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (networkx) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pandas) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pandas.csv_dataset) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pandas.deltatable_dataset) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pandas.excel_dataset) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pandas.feather_dataset) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate 
"DataSet" type names (pandas.gbq_dataset) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pandas.generic_dataset) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pandas.hdf_dataset) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pandas.json_dataset) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pandas.parquet_dataset) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pandas.sql_dataset) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pandas.xml_dataset) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pickle) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (pillow) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (plotly) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (polars) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (redis) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (snowflake) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (spark) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (svmlight) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (tensorflow) Signed-off-by: Deepyaman Datta 
<deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (text) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (tracking) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (video) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): deprecate "DataSet" type names (yaml) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * chore(datasets): ignore TensorFlow coverage issues Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> --------- Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * added basic code for geotiff Signed-off-by: tgoelles <thomas.goelles@gmail.com> * renamed to xarray Signed-off-by: tgoelles <thomas.goelles@gmail.com> * renamed to xarray Signed-off-by: tgoelles <thomas.goelles@gmail.com> * added load and self args Signed-off-by: tgoelles <thomas.goelles@gmail.com> * only local files Signed-off-by: tgoelles <thomas.goelles@gmail.com> * added empty test Signed-off-by: tgoelles <thomas.goelles@gmail.com> * added test data Signed-off-by: tgoelles <thomas.goelles@gmail.com> * added rioxarray requirements Signed-off-by: tgoelles <thomas.goelles@gmail.com> * reformat with black Signed-off-by: tgoelles <thomas.goelles@gmail.com> * rioxarray 0.14 Signed-off-by: tgoelles <thomas.goelles@gmail.com> * rioxarray 0.15 Signed-off-by: tgoelles <thomas.goelles@gmail.com> * rioxarray 0.12 Signed-off-by: tgoelles <thomas.goelles@gmail.com> * rioxarray 0.9 Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fixed dataset typo Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fixed docstring for sphinx Signed-off-by: tgoelles <thomas.goelles@gmail.com> * run black Signed-off-by: tgoelles <thomas.goelles@gmail.com> * sort imports Signed-off-by: tgoelles <thomas.goelles@gmail.com> * class docstring Signed-off-by: 
tgoelles <thomas.goelles@gmail.com> * black Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fixed pylint Signed-off-by: tgoelles <thomas.goelles@gmail.com> * added release notes Signed-off-by: tgoelles <thomas.goelles@gmail.com> * added yaml example Signed-off-by: tgoelles <thomas.goelles@gmail.com> * improve testing WIP Signed-off-by: tgoelles <thomas.goelles@gmail.com> * basic test success Signed-off-by: tgoelles <thomas.goelles@gmail.com> * test reloaded Signed-off-by: tgoelles <thomas.goelles@gmail.com> * test exists Signed-off-by: tgoelles <thomas.goelles@gmail.com> * added version Signed-off-by: tgoelles <thomas.goelles@gmail.com> * basic test suite Signed-off-by: tgoelles <thomas.goelles@gmail.com> * run black Signed-off-by: tgoelles <thomas.goelles@gmail.com> * added example and test it Signed-off-by: tgoelles <thomas.goelles@gmail.com> * deleted duplications Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fixed position of example Signed-off-by: tgoelles <thomas.goelles@gmail.com> * black Signed-off-by: tgoelles <thomas.goelles@gmail.com> * style: Introduce `ruff` for linting in all plugins. 
(#354) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> * feat(datasets): create custom `DeprecationWarning` (#356) * feat(datasets): create custom `DeprecationWarning` Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * feat(datasets): use the custom deprecation warning Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * chore(datasets): show Kedro's deprecation warnings Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * fix(datasets): remove unused imports in test files Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> --------- Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs(datasets): add note about DataSet deprecation (#357) Signed-off-by: tgoelles <thomas.goelles@gmail.com> * test(datasets): skip `tensorflow` tests on Windows (#363) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * ci: Pin `tables` version (#370) * Pin tables version Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Also fix kedro-airflow Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Revert trying to fix airflow Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> --------- Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * build(datasets): Release `1.7.1` (#378) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs: Update CONTRIBUTING.md and add one for `kedro-datasets` (#379) Update CONTRIBUTING.md + add one for kedro-datasets Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * ci(datasets): Run tensorflow tests separately from other dataset tests (#377) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat: Kedro-Airflow 
convert all pipelines option (#335) * feat: kedro airflow convert --all option Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> * docs: release docs Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> --------- Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs(datasets): blacken code in rst literal blocks (#362) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs: cloudpickle is an interesting extension of the pickle functionality (#361) Signed-off-by: H. Felix Wittmann <hfwittmann@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix(datasets): Fix secret scan entropy error (#383) Fix secret scan entropy error Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * style: Rename mentions of `DataSet` to `Dataset` in `kedro-airflow` and `kedro-telemetry` (#384) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat(datasets): Migrated `PartitionedDataSet` and `IncrementalDataSet` from main repository to kedro-datasets (#253) Signed-off-by: Peter Bludau <ptrbld.dev@gmail.com> Co-authored-by: Merel Theisen <merel.theisen@quantumblack.com> * fix: backwards compatibility for `kedro-airflow` (#381) Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * added metadata Signed-off-by: tgoelles <thomas.goelles@gmail.com> * after linting Signed-off-by: tgoelles <thomas.goelles@gmail.com> * ignore ruff PLR0913 Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix(datasets): Don't warn for SparkDataset on Databricks when using s3 (#341) Signed-off-by: Alistair McKelvie <alistair.mckelvie@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore: Hot fix for RTD due to bad pip version (#396) fix RTD Signed-off-by: Nok <nok.lam.chan@quantumblack.com> 
Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore: Pin pip version temporarily (#398) * Pin pip version temporarily Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Hive support failures Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Also pin pip on lint Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Temporary ignore databricks spark tests Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> --------- Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * perf(datasets): don't create connection until need (#281) * perf(datasets): delay `Engine` creation until need Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * chore: don't check coverage in TYPE_CHECKING block Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * perf(datasets): don't connect in `__init__` method Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * test(datasets): fix tests to touch `create_engine` Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * perf(datasets): don't connect in `__init__` method Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * style(datasets): exec Ruff on sql_dataset.py :dog: Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Undo changes to `engines` values type (for Sphinx) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Patch Sphinx build by removing `Engine` references * perf(datasets): don't connect in `__init__` method Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * chore(datasets): don't require coverage for import * chore(datasets): del unused `TYPE_CHECKING` import * docs(datasets): document lazy connection in README * perf(datasets): remove create in `SQLQueryDataset` Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): do not return the created conn Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> --------- 
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * chore: Drop Python 3.7 support for kedro-plugins (#392) * Remove references to Python 3.7 Signed-off-by: lrcouto <laurarccouto@gmail.com> * Revert kedro-dataset changes Signed-off-by: lrcouto <laurarccouto@gmail.com> * Revert kedro-dataset changes Signed-off-by: lrcouto <laurarccouto@gmail.com> * Add information to release docs Signed-off-by: lrcouto <laurarccouto@gmail.com> --------- Signed-off-by: lrcouto <laurarccouto@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat(datasets): support Polars lazy evaluation (#350) * feat(datasets) add PolarsDataset to support Polars's Lazy API Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> * Fix(datasets): rename PolarsDataSet to PolarsDataset Add PolarsDataSet as an alias for PolarsDataset with deprecation warning. Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> * Fix(datasets): apply ruff linting rules Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> * Fix(datasets): Correct pattern matching when Raising exceptions Corrected PolarsDataSet to PolarsDataset in the pattern to match in test_load_missing_file Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> * fix(datasets): clean up PolarsDataset related code Remove reference to PolarsDataSet as this is not required for new dataset implementations. 
Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> * feat(datasets): Rename Polars Datasets to better describe their intent Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> * feat(datasets): clean up LazyPolarsDataset Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> * fix(datasets): increase test coverage for PolarsDataset classes Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> * docs(datasets): add renamed Polars datasets to docs Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> * docs(datasets): Add new polars datasets to release notes Signed-off-by: Matthias Roels <mroels2@its.jnj.com> * fix(datasets): load_args not properly passed to LazyPolarsDataset.load Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> * docs(datasets): fix spelling error in release notes Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> --------- Signed-off-by: Matthias Roels <matthias.roels21@gmail.com> Signed-off-by: Matthias Roels <mroels2@its.jnj.com> Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Co-authored-by: Matthias Roels <mroels2@its.jnj.com> Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * build(datasets): Release `1.8.0` (#406) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * build(airflow): Release 0.7.0 (#407) * bump version Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update release notes Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> --------- Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * build(telemetry): Release 0.3.0 (#408) Bump version Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Ankita Katiyar 
<110245118+ankatiyar@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * build(docker): Release 0.4.0 (#409) Bump version Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * style(airflow): blacken README.md of Kedro-Airflow (#418) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix(datasets): Fix missing jQuery (#414) Fix missing jQuery Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix(datasets): Fix Lazy Polars dataset to use the new-style base class (#413) * Fix Lazy Polars dataset to use the new-style base class Fix gh-412 Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Update release notes Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Revert "Update release notes" This reverts commit 92ceea6d8fa412abf3d8abd28a2f0a22353867ed. 
--------- Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore(datasets): lazily load `partitions` classes (#411) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs(datasets): fix code blocks and `data_set` use (#417) * chore(datasets): lazily load `partitions` classes Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * test(datasets): run doctests to check examples run Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * test(datasets): keep running tests amidst failures Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): format ManagedTableDataset example Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * chore(datasets): ignore breaking mods for doctests Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * style(airflow): black code in Kedro-Airflow README Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): fix example syntax, and autoformat Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): remove `kedro.extras.datasets` ref Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): remove `>>> ` prefix for YAML code Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): remove `kedro.extras.datasets` ref Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): replace `data_set`s with `dataset`s Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * chore(datasets): undo changes for running doctests Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * revert(datasets): undo lazily load `partitions` classes Refs: 
3fdc5a8efa034fa9a18b7683a942415915b42fb5 Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * revert(airflow): undo black code in Kedro-Airflow README Refs: dc3476ea36bac98e2adcc0b52a11b0f90001e31d Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> --------- Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix: TF model load failure when model is saved as a TensorFlow Saved Model format (#410) * fixes TF model load failure when model is saved as a TensorFlow Saved Model format. When a model is saved in the TensorFlow SavedModel format ("tf" default option in tf.save_model when using TF 2.x) via the catalog.xml file, loading that model in a subsequent node fails. The issue is linked to the fact that the model files don't get copied into the temporary folder, presumably because the _fs.get function "thinks" that the provided path is a file and not a folder. Adding a terminating "/" to the path fixes the issue. Signed-off-by: Edouard59 <68538605+Edouard59@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore: Drop support for Python 3.7 on kedro-datasets (#419) * Drop support for Python 3.7 on kedro-datasets Signed-off-by: lrcouto <laurarccouto@gmail.com> * Remove redundant 3.8 markers Signed-off-by: lrcouto <laurarccouto@gmail.com> --------- Signed-off-by: lrcouto <laurarccouto@gmail.com> Signed-off-by: L. R. 
Couto <57910428+lrcouto@users.noreply.github.com> Signed-off-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> * test(datasets): run doctests to check examples run (#416) * chore(datasets): lazily load `partitions` classes Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * test(datasets): run doctests to check examples run Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * test(datasets): keep running tests amidst failures Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): format ManagedTableDataset example Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * chore(datasets): ignore breaking mods for doctests Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * style(airflow): black code in Kedro-Airflow README Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): fix example syntax, and autoformat Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): remove `kedro.extras.datasets` ref Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): remove `>>> ` prefix for YAML code Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): remove `kedro.extras.datasets` ref Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * docs(datasets): replace `data_set`s with `dataset`s Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * refactor(datasets): run doctests separately Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * separate dataset-doctests Signed-off-by: Nok <nok.lam.chan@quantumblack.com> * chore(datasets): ignore non-passing tests to make CI pass Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * chore(datasets): fix comment location Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * chore(datasets): fix .py.py Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * 
chore(datasets): don't measure coverage on doctest run Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * build(datasets): fix windows and snowflake stuff in Makefile Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> --------- Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Nok <nok.lam.chan@quantumblack.com> Co-authored-by: Nok <nok.lam.chan@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat(datasets): Add support for `databricks-connect>=13.0` (#352) Signed-off-by: Miguel Rodriguez Gutierrez <miguel7r@hotmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix(telemetry): remove double execution by moving to after catalog created hook (#422) * remove double execution by moving to after catalog created hook Signed-off-by: Florian Roessler <roessler.fd@gmail.com> * update release notes Signed-off-by: Florian Roessler <roessler.fd@gmail.com> * fix tests Signed-off-by: Florian Roessler <roessler.fd@gmail.com> * remove unused fixture Signed-off-by: Florian Roessler <roessler.fd@gmail.com> --------- Signed-off-by: Florian Roessler <roessler.fd@gmail.com> Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs: Add python version support policy to plugin `README.md`s (#425) * Add python version support policy to plugin readmes Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> * Temporarily pin connexion Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> --------- Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs(airflow): Use new docs link (#393) Use new docs link Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * style: Add shared CSS and meganav to datasets docs (#400) * Add shared CSS and meganav 
Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com> * Add end of file Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com> * Add new heap data source Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com> * adjust heap parameter Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com> * Remove nav_version next to Kedro logo in top left; add Kedro logo * Revise project name and author name Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com> * Use full kedro icon and type for logo * Add close btn to mobile nav Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com> * Add css for mobile nav logo image Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com> * Update close button for mobile nav Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com> * Add open button to mobile nav Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com> * Delete kedro-datasets/docs/source/kedro-horizontal-color-on-light.svg Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com> * Update conf.py Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com> * Update layout.html Add links to subprojects Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com> * Remove svg from docs -- not needed?? 
Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com> * linter error fix Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com> --------- Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com> Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com> Co-authored-by: Tynan DeBold <thdebold@gmail.com> Co-authored-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat(datasets): Add Hugging Face datasets (#344) * Add HuggingFace datasets Co-authored-by: Danny Farah <danny_farah@mckinsey.com> Co-authored-by: Kevin Koga <Kevin_Koga@mckinsey.com> Co-authored-by: Mate Scharnitzky <Mate_Scharnitzky@mckinsey.com> Co-authored-by: Tomer Shor <Tomer_Shor@mckinsey.com> Co-authored-by: Pierre-Yves Mousset <Pierre-Yves_Mousset@mckinsey.com> Co-authored-by: Bela Chupal <Bela_chuphal@mckinsey.com> Co-authored-by: Khangjrakpam Arjun <Khangjrakpam_Arjun@mckinsey.com> Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Apply suggestions from code review Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com> Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com> * Typo Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Fix docstring Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Add docstring for HFTransformerPipelineDataset Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Use intersphinx for cross references in Hugging Face docstrings Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Add docstring for HFDataset Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Add missing test dependencies Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Add tests for huggingface datasets 
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Fix HFDataset.save Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Add test for HFDataset.list_datasets Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Use new name Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Consolidate imports Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> --------- Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Co-authored-by: Danny Farah <danny_farah@mckinsey.com> Co-authored-by: Kevin Koga <Kevin_Koga@mckinsey.com> Co-authored-by: Mate Scharnitzky <Mate_Scharnitzky@mckinsey.com> Co-authored-by: Tomer Shor <Tomer_Shor@mckinsey.com> Co-authored-by: Pierre-Yves Mousset <Pierre-Yves_Mousset@mckinsey.com> Co-authored-by: Bela Chupal <Bela_chuphal@mckinsey.com> Co-authored-by: Khangjrakpam Arjun <Khangjrakpam_Arjun@mckinsey.com> Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com> Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * test(datasets): fix `dask.ParquetDataset` doctests (#439) * test(datasets): fix `dask.ParquetDataset` doctests Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * test(datasets): use `tmp_path` fixture in doctests Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * test(datasets): simplify by not passing the schema Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * test(datasets): ignore conftest for doctests cover Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Create MANIFEST.in Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> --------- Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * refactor: Remove `DataSet` aliases and mentions (#440) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> * chore(datasets): replace "Pyspark" 
with "PySpark" (#423) Consistently write "PySpark" rather than "Pyspark" Also, fix list formatting Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * test(datasets): make `api.APIDataset` doctests run (#448) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore(datasets): Fix `pandas.GenericDataset` doctest (#445) Fix pandas.GenericDataset doctest Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat(datasets): make datasets arguments keywords only (#358) * feat(datasets): make `APIDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `BioSequenceDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `ParquetDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `EmailMessageDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `GeoJSONDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `HoloviewsWriter.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `JSONDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `MatplotlibWriter.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `GMLDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `GraphMLDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make NetworkX `JSONDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `PickleDataset.__init__` keyword only 
Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `ImageDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make plotly `JSONDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `PlotlyDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make polars `CSVDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make polars `GenericDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make redis `PickleDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `SnowparkTableDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `SVMLightDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `TensorFlowModelDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `TextDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `YAMLDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `ManagedTableDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `VideoDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `CSVDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `DeltaTableDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `ExcelDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `FeatherDataset.__init__` keyword only Signed-off-by: 
Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `GBQTableDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `GenericDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make pandas `JSONDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make pandas `ParquerDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `SQLTableDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `XMLDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `HDFDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `DeltaTableDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `SparkDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `SparkHiveDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `SparkJDBCDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `SparkStreamingDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `IncrementalDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * feat(datasets): make `LazyPolarsDataset.__init__` keyword only Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * docs(datasets): update doctests for HoloviewsWriter Signed-off-by: Felix Scherz <felixwscherz@gmail.com> * Update release notes Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> --------- Signed-off-by: Felix Scherz <felixwscherz@gmail.com> Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> 
Co-authored-by: Felix Scherz <felixwscherz@gmail.com> Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Co-authored-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore: Drop support for python 3.8 on kedro-datasets (#442) * Drop support for python 3.8 on kedro-datasets --------- Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com> Signed-off-by: Dmitry Sorokin <40151847+DimedS@users.noreply.github.com> Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * test(datasets): add outputs to matplotlib doctests (#449) * test(datasets): add outputs to matplotlib doctests Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Update Makefile Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Reformat code example, line length is short enough * Update kedro-datasets/kedro_datasets/matplotlib/matplotlib_writer.py Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> --------- Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore(datasets): Fix more doctest issues (#451) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * test(datasets): fix failing doctests in Windows CI (#457) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore(datasets): fix accidental reference to NumPy (#450) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore(datasets): don't pollute dev env in doctests (#452) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat: Add tools to heap event (#430) * Add add-on data to heap event Signed-off-by: lrcouto 
<laurarccouto@gmail.com> * Move addons logic to _get_project_property Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Add condition for pyproject.toml Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Fix tests Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Fix tests Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * add tools to mock Signed-off-by: lrcouto <laurarccouto@gmail.com> * lint Signed-off-by: lrcouto <laurarccouto@gmail.com> * Update tools test Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Add after_context_created tools test Signed-off-by: lrcouto <laurarccouto@gmail.com> * Update rename to tools Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update kedro-telemetry/tests/test_plugin.py Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> --------- Signed-off-by: lrcouto <laurarccouto@gmail.com> Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Co-authored-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * ci(datasets): install deps in single `pip install` (#454) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * build(datasets): Bump s3fs (#463) * Use mocking for AWS responses Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> * Add change to release notes Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> * Update release notes Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> * Use pytest xfail instead of commenting out test Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> 
--------- Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * test(datasets): make SQL dataset examples runnable (#455) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix(datasets): correct pandas-gbq as py311 dependency (#460) * update pandas-gbq dependency declaration Signed-off-by: Onur Kuru <kuru.onur1@gmail.com> * fix fmt Signed-off-by: Onur Kuru <kuru.onur1@gmail.com> --------- Signed-off-by: Onur Kuru <kuru.onur1@gmail.com> Co-authored-by: Ahdra Merali <90615669+AhdraMeraliQB@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs(datasets): Document `IncrementalDataset` (#468) Document IncrementalDataset Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore: Update datasets to be arguments keyword only (#466) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore: Clean up code for old dataset syntax compatibility (#465) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore: Update scikit-learn version (#469) Update scikit-learn version Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat(datasets): support versioning data partitions (#447) * feat(datasets): support versioning data partitions Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Remove unused import Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * chore(datasets): use keyword arguments when needed Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Apply suggestions from code review Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Update kedro-datasets/kedro_datasets/partitions/partitioned_dataset.py Signed-off-by: Deepyaman 
Datta <deepyaman.datta@utexas.edu> --------- Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs(datasets): Improve documentation index (#428) Rework documentation index Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs(datasets): update wrong docstring about `con` (#461) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * build(datasets): Release `2.0.0` (#472) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * ci(telemetry): Pin `PyYAML` (#474) Pin PyYaml Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * build(telemetry): Release 0.3.1 (#475) Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs(datasets): Fix broken links in README (#477) Fix broken links in README Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore(datasets): replace more "data_set" instances (#476) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore(datasets): Fix doctests (#488) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore(datasets): Fix delta + incremental dataset docstrings (#489) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore(airflow): Post 0.19 cleanup (#478) * bump version Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Unbump version and clean test Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update e2e tests Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> 
* Update e2e tests Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update e2e tests Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update e2e tests Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Split big test into smaller tests Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update conftest Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update conftest Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Fix coverage Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Try unpin airflow Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * remove datacatalog step Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Change node Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * update tasks test step Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Revert to older airflow and constraint pendulum Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update template Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update message in e2e step Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Final cleanup Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update kedro-airflow/pyproject.toml Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com> * Pin apache-airflow again Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> --------- Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com> Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * build(airflow): Release 0.8.0 (#491) Bump version Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix: telemetry metadata (#495) --------- Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix: Update tests on 
kedro-docker for 0.5.0 release. (#496) * bump version to 0.5.0 Signed-off-by: lrcouto <laurarccouto@gmail.com> * bump version to 0.5.0 Signed-off-by: lrcouto <laurarccouto@gmail.com> * update e2e tests to use new starters Signed-off-by: lrcouto <laurarccouto@gmail.com> * Lint Signed-off-by: lrcouto <laurarccouto@gmail.com> * update e2e tests to use new starters Signed-off-by: lrcouto <laurarccouto@gmail.com> * fix test path for e2e tests Signed-off-by: lrcouto <laurarccouto@gmail.com> * fix requirements path on dockerfiles Signed-off-by: lrcouto <laurarccouto@gmail.com> * update tests to fit with current log format Signed-off-by: lrcouto <laurarccouto@gmail.com> * update tests to fit with current log format Signed-off-by: lrcouto <laurarccouto@gmail.com> * update tests to fit with current log format Signed-off-by: lrcouto <laurarccouto@gmail.com> * Remove redundant test Signed-off-by: lrcouto <laurarccouto@gmail.com> * Alter test for custom GID and UID Signed-off-by: lrcouto <laurarccouto@gmail.com> * Update release notes Signed-off-by: lrcouto <laurarccouto@gmail.com> * Revert version bump to put in in separate PR Signed-off-by: lrcouto <laurarccouto@gmail.com> --------- Signed-off-by: lrcouto <laurarccouto@gmail.com> Signed-off-by: L. R. 
Couto <57910428+lrcouto@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * build: Release kedro-docker 0.5.0 (#497) * bump version to 0.5.0 Signed-off-by: lrcouto <laurarccouto@gmail.com> * bump version to 0.5.0 Signed-off-by: lrcouto <laurarccouto@gmail.com> * update e2e tests to use new starters Signed-off-by: lrcouto <laurarccouto@gmail.com> * Lint Signed-off-by: lrcouto <laurarccouto@gmail.com> * update e2e tests to use new starters Signed-off-by: lrcouto <laurarccouto@gmail.com> * fix test path for e2e tests Signed-off-by: lrcouto <laurarccouto@gmail.com> * fix requirements path on dockerfiles Signed-off-by: lrcouto <laurarccouto@gmail.com> * update tests to fit with current log format Signed-off-by: lrcouto <laurarccouto@gmail.com> * update tests to fit with current log format Signed-off-by: lrcouto <laurarccouto@gmail.com> * update tests to fit with current log format Signed-off-by: lrcouto <laurarccouto@gmail.com> * Remove redundant test Signed-off-by: lrcouto <laurarccouto@gmail.com> * Alter test for custom GID and UID Signed-off-by: lrcouto <laurarccouto@gmail.com> * Update release notes Signed-off-by: lrcouto <laurarccouto@gmail.com> * Revert version bump to put in in separate PR Signed-off-by: lrcouto <laurarccouto@gmail.com> * Bump kedro-docker to 0.5.0 Signed-off-by: lrcouto <laurarccouto@gmail.com> * Add release notes Signed-off-by: lrcouto <laurarccouto@gmail.com> * Update kedro-docker/RELEASE.md Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: L. R. Couto <57910428+lrcouto@users.noreply.github.com> --------- Signed-off-by: lrcouto <laurarccouto@gmail.com> Signed-off-by: L. R. 
Couto <57910428+lrcouto@users.noreply.github.com> Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore(datasets): Update partitioned dataset docstring (#502) Update partitioned dataset docstring Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * Fix GeotiffDataset import + casing Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> * Fix lint Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix(datasets): Relax pandas.HDFDataSet dependencies which are broken on Windows (#426) * Relax pandas.HDFDataSet dependencies which are broken on Window (#402) Signed-off-by: Yolan Honoré-Rougé <yolan.honore.rouge@gmail.com> * Update RELEASE.md Signed-off-by: Yolan Honoré-Rougé <yolan.honore.rouge@gmail.com> * Apply suggestions from code review Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> * Update kedro-datasets/setup.py Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> --------- Signed-off-by: Yolan Honoré-Rougé <yolan.honore.rouge@gmail.com> Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix: airflow metadata (#498) * Add example pipeline entry to metadata declaration Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Fix entry Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Make entries consistent Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Add tools to config Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * fix: telemetry metadata (#495) --------- Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Revert "Add tools to config" This reverts 
commit 14732d772a3c2f4787063071a68fdf1512c93488. Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Quick fix Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Lint Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Remove outdated config key Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Use kedro new instead of cookiecutter Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> --------- Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com> Co-authored-by: Dmitry Sorokin <40151847+DimedS@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * chore(airflow): Bump `apache-airflow` version (#511) * Bump apache airflow Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Change starter Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update e2e test steps Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update e2e test steps Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> --------- Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * ci(datasets): Unpin dask (#522) * Unpin dask Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update doctest Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update doctest Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update kedro-datasets/setup.py Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> --------- Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat(datasets): Add `MatlabDataset` to `kedro-datasets` (#515) * Refork and commit kedro matlab datasets Signed-off-by: 
samuelleeshemen <samuel_lee_sj@aiap.sg> * Fix lint, add to docs Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Try fixing docstring Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Try fixing save Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Try fix docstest Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Fix unit tests Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update release notes: Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Not hardcode load mode Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> --------- Signed-off-by: samuelleeshemen <samuel_lee_sj@aiap.sg> Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Co-authored-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * ci(airflow): Pin `Flask-Session` version (#521) * Restrict pendulum version Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update airflow init step Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Remove pendulum pin Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update create connections step Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Pin flask session Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Add comment Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> --------- Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat: `kedro-airflow` group in memory nodes (#241) * feat: option to group in-memory nodes Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> * fix: MemoryDataset Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> * Update kedro-airflow/README.md Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com> * Update kedro-airflow/README.md Co-authored-by: Merel Theisen 
<49397448+merelcht@users.noreply.github.com> Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com> * Update kedro-airflow/README.md Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com> * Update kedro-airflow/RELEASE.md Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com> * Update kedro-airflow/kedro_airflow/grouping.py Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com> * Update kedro-airflow/kedro_airflow/plugin.py Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com> * Update kedro-airflow/tests/test_node_grouping.py Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com> * Update kedro-airflow/tests/test_node_grouping.py Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com> * Update kedro-airflow/kedro_airflow/grouping.py Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com> * Update kedro-airflow/kedro_airflow/grouping.py Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com> * fix: tests Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> * Bump minimum kedro version Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> * fixes Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> --------- Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com> Co-authored-by: Merel Theisen 
<49397448+merelcht@users.noreply.github.com> Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * ci(datasets): Update pyproject.toml to pin Kedro 0.19 for kedro-datasets (#526) Update pyproject.toml Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat(airflow): include environment name in DAG filename (#492) * feat: include environment name in DAG file Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> * doc: add update to release notes Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> --------- Signed-off-by: Simon Brugman <sfbbrugman@gmail.com> Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat(datasets): Enable search-as-you type on Kedro-datasets docs (#532) * done Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * fix lint Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> --------- Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix(datasets): Debug and fix `kedro-datasets` nightly build failures (#541) * pin deltalake * Update kedro-datasets/setup.py Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space> * Update setup.py Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * sort order and compare * Update setup.py * lint * pin deltalake * add comment to pin --------- Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space> Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * feat(datasets): Dataset Preview Refactor (#504) * test * done * change from _preview to preview * fix lint and tests * added docstrings * rtd fix * rtd fix * fix rtd Signed-off-by: 
rashidakanchwala <rashida_kanchwala@mckinsey.com> * fix rtd Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * fix rtd - pls" Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * add nitpick ignore Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * test again Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * move tracking datasets to constant Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * remove comma Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * remove Newtype from json_dataset" Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * pls work Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * confirm rtd works locally Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * juanlu's fix Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * fix tests Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * remove unnecessary stuff from conf.py Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * fixes based on review Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * changes based on review Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * fix tests Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * add suffix Preview Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * change img return type to bytes Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * fix tests Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> * update release note * fix lint --------- Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com> Co-authored-by: ravi-kumar-pilla <ravi_kumar_pilla@mckinsey.com> Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * fix(datasets): Drop pyarrow constraint when using snowpark (#538) * Free pyarrow req 
Signed-off-by: Felipe Monroy <felipe.m02@gmail.com> * Free pyarrow req Signed-off-by: Felipe Monroy <felipe.m02@gmail.com> --------- Signed-off-by: Felipe Monroy <felipe.m02@gmail.com> Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * docs: Update kedro-telemetry docs on which data is collected (#546) * Update data being collected --------- Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com> Signed-off-by: Dmitry Sorokin <40151847+DimedS@users.noreply.github.com> Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com> Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: tgoelles <thomas.goelles@gmail.com> * ci(docker): Trying to fix e2e tests (#548) * Pin psutil Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Add no capture to test Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update pip version Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * U…