feat(datasets): support Polars lazy evaluation #350

Merged

MatthiasRoels
Contributor

@MatthiasRoels MatthiasRoels commented Sep 27, 2023

Description

For optimal data processing, it is recommended to use Polars' Lazy API (instead of the eager one), so it is only natural that kedro-datasets supports it too.

Closes #224

Development notes

Checklist

  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the relevant RELEASE.md file
  • Added tests to cover my changes

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
@noklam
Contributor

noklam commented Sep 27, 2023

You are killing it with this rate of PR 🔥

@MatthiasRoels
Contributor Author

MatthiasRoels commented Sep 28, 2023

And just as I created this PR, Polars releases a new version with native support to read parquet files from AWS/GCP/Azure... From what I can see, it should also work for the lazy API.

So, this brings us to the next question: how do we proceed with this PR?

  1. Should we just finish it as-is (allowing versions < 0.19.4) and use Arrow to read data from AWS/GCP/Azure? I have been using this since the beginning of summer and never ran into an issue, so I know it works properly!
  2. Bump polars version to 0.19.4 to allow native reading of parquet files from Cloud? Pro: easier to implement. Con: it only works for parquet for now and it looks like it is experimental...

@astrojuanlu
Member

Maybe wait a couple of releases to see if it goes from experimental to mature? If we are going to throw away this code in a couple of months, I wouldn't say it makes a lot of sense to continue working on it.

They also said that the first release candidate of Polars 1.0 is coming before the end of the year (pola-rs/polars#6616 (comment)), so maybe things will stabilize soon :)

@MatthiasRoels
Contributor Author

I gave it a little thought and, if anyone is willing to review, I would like to finish it despite Polars’ native support for reading/writing to object stores. The reason is simple: if this functionality stabilises on Polars’ side, we only have to change (read: simplify) our load/save methods!

The way I see it moving forward is that we need 2 datasets: one for eager loading and one for lazy loading. So I would keep the GenericDataset (potentially rename it to e.g. EagerDataset) and rename mine to LazyDataset. The reason I want to keep the two is that there are some file extensions (e.g. Excel) that don’t have a Lazy loading option.

Curious to hear other opinions!

@astrojuanlu
Member

There are two options: LazyPolarsGenericDataset (ugh, a bit long) or

if lazy:
  pl.scan_csv(xxx)
else:
  pl.read_csv(xxx)

as @noklam suggested in #224 (comment)

Since the return type is going to be completely different, I'd rather have a different dataset indeed.

@MatthiasRoels MatthiasRoels changed the title feat(datasets) Support Polars lazy evaluation feat: kedro-datasets- support Polars lazy evaluation Oct 2, 2023
@MatthiasRoels MatthiasRoels changed the title feat: kedro-datasets- support Polars lazy evaluation feat(datasets): support Polars lazy evaluation Oct 2, 2023
@MatthiasRoels
Contributor Author

Is it ok if I:

  1. Rename my dataset to LazyDataset (other suggestions welcome)
  2. Rename GenericDataset to EagerDataset (other suggestions welcome)

To make it clear that one uses the eager API while the other one uses the Lazy API

Shouldn't we then also consider removing (or marking as deprecated) the CSVDataset? And as a follow-up: should we do the same with the pandas datasets? Basically, the pandas.GenericDataset replaces the CSV, Excel, Feather, JSON, Parquet and XML ones, which should make the code base leaner. This is beyond the scope of this PR obviously...

Add PolarsDataSet as an alias for PolarsDataset with
deprecation warning.

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
@astrojuanlu
Member

Shouldn't we then also consider removing (or marking as deprecated) the CSVDataset? And as a follow-up: should we do the same with the pandas datasets? Basically, the pandas.GenericDataset replaces the CSV, Excel, Feather, JSON, Parquet and XML ones, which should make the code base leaner. This is beyond the scope of this PR obviously...

This is a discussion worth having, but it has some implications for the docs, starters, etc. Could you open a separate issue about it?

@merelcht merelcht mentioned this pull request Oct 11, 2023
3 tasks
Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
Corrected PolarsDataSet to PolarsDataset in the pattern to match
in test_load_missing_file

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
Member

@merelcht merelcht left a comment

Thanks a lot for this contribution @MatthiasRoels! I've left some comments, the main being around the naming syntax (classes ending in DataSet are old, any new ones should end in Dataset).

In terms of naming the datasets, how about:

  • LazyPolarsDataset
  • EagerPolarsDataset

@MatthiasRoels
Contributor Author

MatthiasRoels commented Oct 12, 2023

Thanks a lot for this contribution @MatthiasRoels! I've left some comments, the main being around the naming syntax (classes ending in DataSet are old, any new ones should end in Dataset).

No worries, I am always happy to contribute!

In terms of naming the datasets, how about:

  • LazyPolarsDataset
  • EagerPolarsDataset

@merelcht That's a nice suggestion! But that would introduce a breaking change (renaming the GenericDataset to EagerPolarsDataset). Should I already rename it while keeping the old names with deprecation warnings (like we did for DataSet vs Dataset)?

Remove reference to PolarsDataSet as this is not required for new
dataset implementations.

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
@merelcht
Member

@merelcht That's a nice suggestion! But that would introduce a breaking change (renaming the GenericDataset to EagerPolarsDataset). Should I already rename it while keeping the old names with deprecation warnings (like we did for DataSet vs Dataset)?

Yes that sounds good! And then we can remove the alias in the next breaking kedro-datasets release 2.0.0 🙂
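
The rename-with-alias approach agreed on here can be sketched roughly as follows. This is a simplified illustration using only the standard library, not the mechanism actually used in kedro-datasets: both classes below are stand-ins, and the warning text is an assumption.

```python
import warnings


class EagerPolarsDataset:
    """Stand-in for the renamed dataset class; only the name matters here."""

    def __init__(self, filepath: str) -> None:
        self.filepath = filepath


class GenericDataset(EagerPolarsDataset):
    """Deprecated alias kept so that existing code keeps working."""

    def __init__(self, *args, **kwargs) -> None:
        # Warn on every instantiation through the old name, then defer to
        # the new class so behaviour stays identical.
        warnings.warn(
            "'GenericDataset' has been renamed to 'EagerPolarsDataset'; "
            "the old name will be removed in a future breaking release.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)


# Old code still runs, but now emits a DeprecationWarning.
legacy = GenericDataset("data/example.csv")
```

Because the alias subclasses the new class, `isinstance` checks against `EagerPolarsDataset` keep passing, and dropping the alias in the next breaking release only requires deleting the subclass.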

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
@MatthiasRoels MatthiasRoels force-pushed the feat/datasets-add-polars-lazy-dataset branch from d2409df to 1193218 Compare October 17, 2023 11:21
Member

@merelcht merelcht left a comment

I haven't tried running this, but the code looks good! Can you also add this change to the release note + mention that polars.GenericDataSet will be deprecated and be replaced by polars.EagerPolarsDataset?

@@ -59,6 +59,10 @@ def _collect_requirements(requires):
[
POLARS, "pyarrow>=4.0", "xlsx2csv>=0.8.0", "deltalake >= 0.6.2"
],
"polars.LazyPolarsDataset":
Member

Should EagerPolarsDataset be added here as well? If someone uses that directly I'm guessing the requirements would now not be picked up?

Signed-off-by: Matthias Roels <mroels2@its.jnj.com>
@MatthiasRoels
Contributor Author

Read the Docs build failed because of a dependency conflict with Dask in test vs docs. Weird that this does not occur in other PR's (or is it?)...

@merelcht
Member

Read the Docs build failed because of a dependency conflict with Dask in test vs docs. Weird that this does not occur in other PR's (or is it?)...

We are aware of this issue, it has nothing to do with your PR!

Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
@MatthiasRoels
Contributor Author

MatthiasRoels commented Oct 17, 2023

Read the Docs build failed because of a dependency conflict with Dask in test vs docs. Weird that this does not occur in other PR's (or is it?)...

We are aware of this issue, it has nothing to do with your PR!

Is there an open issue for this already?

@astrojuanlu
Member

astrojuanlu commented Oct 17, 2023

It got fixed already in #396

Member

@astrojuanlu astrojuanlu left a comment

Tried with a local CSV and it works, but the same CSV on a Minio bucket failed:

In [1]: import polars as pl

In [2]: from kedro_datasets.polars import LazyPolarsDataset

In [3]: ds = LazyPolarsDataset(
   ...:   filepath="s3://temp-openrepair/OpenRepairData_v0.3_aggregate_202210.csv",
   ...:   file_format="csv",
   ...:   load_args=dict(dtypes=dict(product_age=pl.Float64, group_identifier=pl.Utf8), try_parse_dates=True),
   ...: )

In [4]: df_l = ds.load()

In [5]: df_l
Out[5]: <LazyFrame [14 cols, {"id": Utf8 … "problem": Utf8}] at 0x107F1ABB0>

In [6]: df_l.collect().head()
---------------------------------------------------------------------------
ComputeError                              Traceback (most recent call last)
Cell In[6], line 1
----> 1 df_l.collect().head()

File ~/.micromamba/envs/kedro38-dev2/lib/python3.8/site-packages/polars/utils/deprecation.py:96, in deprecate_renamed_parameter.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
     91 @wraps(function)
     92 def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
     93     _rename_keyword_argument(
     94         old_name, new_name, kwargs, function.__name__, version
     95     )
---> 96     return function(*args, **kwargs)

File ~/.micromamba/envs/kedro38-dev2/lib/python3.8/site-packages/polars/lazyframe/frame.py:1787, in LazyFrame.collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, slice_pushdown, comm_subplan_elim, comm_subexpr_elim, no_optimization, streaming, _eager)
   1774     comm_subplan_elim = False
   1776 ldf = self._ldf.optimization_toggle(
   1777     type_coercion,
   1778     predicate_pushdown,
   (...)
   1785     _eager,
   1786 )
-> 1787 return wrap_df(ldf.collect())

ComputeError: ArrowInvalid: In CSV column #11: Row #9889: CSV conversion error to int64: invalid value 'Fixit Clinic'

Notice that it's trying to use int64 for the group_identifier column, despite having specified pl.Utf8. Maybe load_args is not being properly passed for fsspec files?

The EagerPolarsDataset doesn't have this problem:

In [16]: ds = EagerPolarsDataset(
    ...:   filepath="s3://temp-openrepair/OpenRepairData_v0.3_aggregate_202210.csv",
    ...:   file_format="csv",
    ...:   load_args=dict(dtypes=dict(product_age=pl.Float64, group_identifier=pl.Utf8), try_parse_dates=True),
    ...: )

In [17]: ds.load().head()
Out[17]: 
shape: (5, 14)
┌─────────────────┬───────────────┬─────────┬─────────────────┬───┬─────────────────┬─────────────────┬────────────┬─────────────────┐
│ id              ┆ data_provider ┆ country ┆ partner_product ┆ … ┆ repair_barrier_ ┆ group_identifie ┆ event_date ┆ problem         │
│ ---             ┆ ---           ┆ ---     ┆ _category       ┆   ┆ if_end_of_life  ┆ r               ┆ ---        ┆ ---             │
│ str             ┆ str           ┆ str     ┆ ---             ┆   ┆ ---             ┆ ---             ┆ date       ┆ str             │
│                 ┆               ┆         ┆ str             ┆   ┆ str             ┆ str             ┆            ┆                 │
╞═════════════════╪═══════════════╪═════════╪═════════════════╪═══╪═════════════════╪═════════════════╪════════════╪═════════════════╡
│ anstiftung_2749 ┆ anstiftung    ┆ DEU     ┆ Elektro divers  ┆ … ┆ null            ┆ 5073            ┆ 2012-06-20 ┆ Funktionierte   │
│                 ┆               ┆         ┆ ~ Nähmaschine   ┆   ┆                 ┆                 ┆            ┆ nicht mehr.     │
│                 ┆               ┆         ┆                 ┆   ┆                 ┆                 ┆            ┆ Fehler…         │
│ anstiftung_2750 ┆ anstiftung    ┆ DEU     ┆ Computer ~      ┆ … ┆ null            ┆ 5073            ┆ 2012-06-20 ┆ Wurde schnell   │
│                 ┆               ┆         ┆ Laptop          ┆   ┆                 ┆                 ┆            ┆ heiß. Der       │
│                 ┆               ┆         ┆                 ┆   ┆                 ┆                 ┆            ┆ Lüfter w…       │
│ anstiftung_2746 ┆ anstiftung    ┆ DEU     ┆ Computer ~      ┆ … ┆ null            ┆ 5073            ┆ 2012-06-20 ┆ Funktionierte   │
│                 ┆               ┆         ┆ Drucker         ┆   ┆                 ┆                 ┆            ┆ nicht mehr.     │
│                 ┆               ┆         ┆                 ┆   ┆                 ┆                 ┆            ┆ Fehler…         │
│ anstiftung_2747 ┆ anstiftung    ┆ DEU     ┆ Unterhaltungsel ┆ … ┆ null            ┆ 5073            ┆ 2012-06-20 ┆ Funktionierte   │
│                 ┆               ┆         ┆ ektronik ~      ┆   ┆                 ┆                 ┆            ┆ nicht mehr.     │
│                 ┆               ┆         ┆ Kopfhö…         ┆   ┆                 ┆                 ┆            ┆ Fehler…         │
│ anstiftung_2742 ┆ anstiftung    ┆ DEU     ┆ Haushaltsgeräte ┆ … ┆ null            ┆ 5073            ┆ 2012-09-19 ┆ Die Beine der   │
│                 ┆               ┆         ┆ ~ Spielzeug     ┆   ┆                 ┆                 ┆            ┆ Puppe waren ab. │
│                 ┆               ┆         ┆                 ┆   ┆                 ┆                 ┆            ┆ Si…             │
└─────────────────┴───────────────┴─────────┴─────────────────┴───┴─────────────────┴─────────────────┴────────────┴─────────────────┘

@astrojuanlu
Member

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
@MatthiasRoels
Contributor Author

MatthiasRoels commented Oct 19, 2023

@astrojuanlu: You were correct that the load args were not properly passed when loading files from object stores. However, this will not fix the issue you had because, for object stores, we leverage Arrow to load the dataset. This means that you actually have to pass an Arrow schema (instead of Polars dtypes). Unfortunately, you have to pass in the full schema, not just the types you want to manually specify... This is a downside of the fact that older Polars versions could not read directly from object stores (a feature which, in the newer versions, is in beta).

So you should be able to do the following now:

import polars as pl
import pyarrow as pa 

from kedro_datasets.polars import LazyPolarsDataset

pa_schema = pa.schema(
    [
        ("id", pa.string()),
        ("data_provider", pa.string()),
        ("country", pa.string()),
        ("partner_product_category", pa.string()),
        ("product_category", pa.string()),
        ("product_category_id", pa.int64()),
        ("brand", pa.string()),
        ("year_of_manufacture", pa.int64()),
        ("product_age", pa.float64()),
        ("repair_status", pa.string()),
        ("repair_barrier_if_end_of_life", pa.string()),
        ("group_identifier", pa.string()),
        ("event_date", pa.date32()),
        ("problem", pa.string()),
    ]
)

ds = LazyPolarsDataset(
    filepath="s3://temp-openrepair/OpenRepairData_v0.3_aggregate_202210.csv",
    file_format="csv",
    load_args={"schema": pa_schema},
)

@astrojuanlu
Member

I see, thanks! It's a bit annoying to specify the full schema for remote filepaths. Do you think this is enough reason to embrace the new method, even if it's beta?

@MatthiasRoels
Contributor Author

@astrojuanlu: I think there are pros and cons to both approaches

Current approach
Pros:

  • It is based on PyArrow which is well-established
  • I have been using it for 3 months with 100+ daily Kedro runs on different datasets (well tested?)

Cons:

  • You cannot use convenient Polars methods, and you have to look in the Arrow docs if you want to do something special like overwriting the schema
  • load_args look different when you switch from a file stored locally to one stored in an object store
  • If we change later on, this change could be considered breaking (everyone will potentially need to change load_args).

Approach when we force a newer version of Polars
Pros:

  • Uniform syntax
  • No further (breaking?) changes required in the future

Cons:

  • We have to force users to use a more recent version of Polars (not sure if that's necessarily a bad thing?)
  • Reading from cloud storage using Polars is experimental, so we might encounter unexpected behaviour (I have also never tested it, so I have no idea what to expect).

Given the pros/cons of each, we need to decide how we proceed. I think it comes down to stability vs convenience, no?

@astrojuanlu
Member

Good assessment, thanks for the writeup. I keep going back and forth on this, because I'm pretty confident the new approach will stabilise, but to be honest there's no 100 % guarantee.

Let's proceed with what you have created here, and we can revisit when Polars 1.0 is out 👍🏽

Member

@astrojuanlu astrojuanlu left a comment

🚀

Member

@merelcht merelcht left a comment

Tiny comment, but otherwise LGTM! Thank you so much for your contribution @MatthiasRoels it's truly awesome ⭐ 😄

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
@MatthiasRoels MatthiasRoels force-pushed the feat/datasets-add-polars-lazy-dataset branch from 6ea5ac5 to d3f7e5c Compare October 20, 2023 15:44
@astrojuanlu astrojuanlu merged commit b44bf0e into kedro-org:main Oct 20, 2023
@astrojuanlu
Member

Congrats on your first contribution @MatthiasRoels ! 🎉

tgoelles pushed a commit to tgoelles/kedro-plugins that referenced this pull request Jun 6, 2024
* feat(datasets) add PolarsDataset to support Polars's Lazy API

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* Fix(datasets): rename PolarsDataSet to PolarsDataSet

Add PolarsDataSet as an alias for PolarsDataset with
deprecation warning.

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* Fix(datasets): apply ruff linting rules

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* Fix(datasets): Correct pattern matching when Raising exceptions

Corrected PolarsDataSet to PolarsDataset in the pattern to match
in test_load_missing_file

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* fix(datasets): clean up PolarsDataset related code

Remove reference to PolarsDataSet as this is not required for new
dataset implementations.

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* feat(datasets): Rename Polars Datasets to better describe their intent

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* feat(datasets): clean up LazyPolarsDataset

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* fix(datasets): increase test coverage for PolarsDataset classes

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* docs(datasets): add renamed Polars datasets to docs

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* docs(datasets): Add new polars datasets to release notes

Signed-off-by: Matthias Roels <mroels2@its.jnj.com>

* fix(datasets): load_args not properly passed to LazyPolarsDataset.load

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* docs(datasets): fix spelling error in release notes

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

---------

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
Signed-off-by: Matthias Roels <mroels2@its.jnj.com>
Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: Matthias Roels <mroels2@its.jnj.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>
astrojuanlu added a commit that referenced this pull request Jul 5, 2024
* refactor(datasets): deprecate "DataSet" type names (#328)

* refactor(datasets): deprecate "DataSet" type names (api)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (biosequence)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (dask)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (databricks)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (email)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (geopandas)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (holoviews)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (json)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (matplotlib)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (networkx)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.csv_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.deltatable_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.excel_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.feather_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.gbq_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.generic_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.hdf_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.json_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.parquet_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.sql_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.xml_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pickle)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pillow)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (plotly)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (polars)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (redis)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (snowflake)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (spark)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (svmlight)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (tensorflow)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (text)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (tracking)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (video)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (yaml)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): ignore TensorFlow coverage issues

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added basic code for geotiff

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* renamed to xarray

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* renamed to xarray

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added load and self args

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* only local files

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added empty test

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added test data

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added rioxarray requirements

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* reformat with black

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* rioxarray 0.14

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* rioxarray 0.15

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* rioxarray 0.12

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* rioxarray 0.9

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fixed dataset typo

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fixed docstring for sphinx

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* run black

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* sort imports

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* class docstring

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* black

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fixed pylint

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added release notes

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added yaml example

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* improve testing WIP

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* basic test success

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test reloaded

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test exists

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added version

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* basic test suite

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* run black

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added example and test it

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* deleted duplications

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fixed position of example

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* black

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* style: Introduce `ruff` for linting in all plugins. (#354)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* feat(datasets): create custom `DeprecationWarning` (#356)

* feat(datasets): create custom `DeprecationWarning`

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* feat(datasets): use the custom deprecation warning

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): show Kedro's deprecation warnings

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* fix(datasets): remove unused imports in test files

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): add note about DataSet deprecation (#357)

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): skip `tensorflow` tests on Windows (#363)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci: Pin `tables` version (#370)

* Pin tables version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Also fix kedro-airflow

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Revert trying to fix airflow

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(datasets): Release `1.7.1` (#378)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs: Update CONTRIBUTING.md and add one for `kedro-datasets` (#379)

Update CONTRIBUTING.md + add one for kedro-datasets

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(datasets): Run tensorflow tests separately from other dataset tests (#377)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat: Kedro-Airflow convert all pipelines option (#335)

* feat: kedro airflow convert --all option

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* docs: release docs

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

---------

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): blacken code in rst literal blocks (#362)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs: cloudpickle is an interesting extension of the pickle functionality (#361)

Signed-off-by: H. Felix Wittmann <hfwittmann@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Fix secret scan entropy error (#383)

Fix secret scan entropy error

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* style: Rename mentions of `DataSet` to `Dataset` in `kedro-airflow` and `kedro-telemetry` (#384)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): Migrated `PartitionedDataSet` and `IncrementalDataSet` from main repository to kedro-datasets (#253)

Signed-off-by: Peter Bludau <ptrbld.dev@gmail.com>
Co-authored-by: Merel Theisen <merel.theisen@quantumblack.com>

* fix: backwards compatibility for `kedro-airflow` (#381)

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added metadata

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* after linting

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ignore ruff PLR0913

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Don't warn for SparkDataset on Databricks when using s3 (#341)

Signed-off-by: Alistair McKelvie <alistair.mckelvie@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: Hot fix for RTD due to bad pip version (#396)

fix RTD

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: Pin pip version temporarily (#398)

* Pin pip version temporarily

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Hive support failures

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Also pin pip on lint

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Temporary ignore databricks spark tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* perf(datasets): don't create connection until needed (#281)

* perf(datasets): delay `Engine` creation until needed

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore: don't check coverage in TYPE_CHECKING block

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* perf(datasets): don't connect in `__init__` method

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): fix tests to touch `create_engine`

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* perf(datasets): don't connect in `__init__` method

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* style(datasets): exec Ruff on sql_dataset.py :dog:

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Undo changes to `engines` values type (for Sphinx)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Patch Sphinx build by removing `Engine` references

* perf(datasets): don't connect in `__init__` method

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): don't require coverage for import

* chore(datasets): del unused `TYPE_CHECKING` import

* docs(datasets): document lazy connection in README

* perf(datasets): remove create in `SQLQueryDataset`

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): do not return the created conn

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
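
The commits above move connection setup out of `__init__` so that it happens on first use. A minimal sketch of that pattern follows. It is an assumption-laden illustration, not the actual `sql_dataset.py` code: `LazyConnectionDataset` is a hypothetical class, and the standard-library `sqlite3` module stands in for the SQLAlchemy `Engine` mentioned in the commit messages.

```python
import sqlite3


class LazyConnectionDataset:
    """Hypothetical illustration; not the real kedro-datasets SQL dataset."""

    def __init__(self, *, database):
        # Only store configuration here; no connection is opened yet.
        self._database = database
        self._connection = None

    @property
    def connection(self):
        # Open the connection on first access and reuse it afterwards.
        if self._connection is None:
            self._connection = sqlite3.connect(self._database)
        return self._connection

    def load(self):
        # The first call to load() is what triggers the connection.
        return self.connection.execute("SELECT 1").fetchone()[0]


dataset = LazyConnectionDataset(database=":memory:")
created_in_init = dataset._connection is not None
value = dataset.load()
```

With this shape, constructing the dataset (for example while a catalog is being built) does not touch the database; only `load()` does.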

* chore: Drop Python 3.7 support for kedro-plugins (#392)

* Remove references to Python 3.7

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Revert kedro-dataset changes

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Revert kedro-dataset changes

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Add information to release docs

Signed-off-by: lrcouto <laurarccouto@gmail.com>

---------

Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): support Polars lazy evaluation  (#350)

* feat(datasets) add PolarsDataset to support Polars's Lazy API

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* Fix(datasets): rename PolarsDataSet to PolarsDataset

Add PolarsDataSet as an alias for PolarsDataset with
deprecation warning.

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* Fix(datasets): apply ruff linting rules

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* Fix(datasets): Correct pattern matching when Raising exceptions

Corrected PolarsDataSet to PolarsDataset in the pattern to match
in test_load_missing_file

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* fix(datasets): clean up PolarsDataset related code

Remove reference to PolarsDataSet as this is not required for new
dataset implementations.

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* feat(datasets): Rename Polars Datasets to better describe their intent

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* feat(datasets): clean up LazyPolarsDataset

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* fix(datasets): increase test coverage for PolarsDataset classes

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* docs(datasets): add renamed Polars datasets to docs

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* docs(datasets): Add new polars datasets to release notes

Signed-off-by: Matthias Roels <mroels2@its.jnj.com>

* fix(datasets): load_args not properly passed to LazyPolarsDataset.load

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* docs(datasets): fix spelling error in release notes

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

---------

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
Signed-off-by: Matthias Roels <mroels2@its.jnj.com>
Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: Matthias Roels <mroels2@its.jnj.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(datasets): Release `1.8.0` (#406)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(airflow): Release 0.7.0 (#407)

* bump version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update release notes

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(telemetry): Release 0.3.0 (#408)

Bump version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(docker): Release 0.4.0 (#409)

Bump version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* style(airflow): blacken README.md of Kedro-Airflow (#418)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Fix missing jQuery (#414)

Fix missing jQuery

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Fix Lazy Polars dataset to use the new-style base class (#413)

* Fix Lazy Polars dataset to use the new-style base class

Fix gh-412

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Update release notes

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Revert "Update release notes"

This reverts commit 92ceea6d8fa412abf3d8abd28a2f0a22353867ed.

---------

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets):  lazily load `partitions` classes (#411)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): fix code blocks and `data_set` use (#417)

* chore(datasets):  lazily load `partitions` classes

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): run doctests to check examples run

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): keep running tests amidst failures

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): format ManagedTableDataset example

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): ignore breaking mods for doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* style(airflow): black code in Kedro-Airflow README

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): fix example syntax, and autoformat

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `kedro.extras.datasets` ref

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `>>> ` prefix for YAML code

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `kedro.extras.datasets` ref

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): replace `data_set`s with `dataset`s

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): undo changes for running doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* revert(datasets):  undo lazily load `partitions` classes

Refs: 3fdc5a8efa034fa9a18b7683a942415915b42fb5
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* revert(airflow): undo black code in Kedro-Airflow README

Refs: dc3476ea36bac98e2adcc0b52a11b0f90001e31d

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix: TF model load failure when model is saved as a TensorFlow Saved Model format (#410)

* fixes TF model load failure when model is saved as a TensorFlow Saved Model format

When a model is saved in the TensorFlow SavedModel format ("tf" default option in tf.save_model when using TF 2.x) via the catalog.xml file, loading that model in a subsequent node fails. The issue is linked to the fact that the model files don't get copied into the temporary folder, presumably because the _fs.get function "thinks" that the provided path is a file and not a folder. Adding a terminating "/" to the path fixes the issue.

Signed-off-by: Edouard59 <68538605+Edouard59@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: Drop support for Python 3.7 on kedro-datasets (#419)

* Drop support for Python 3.7 on kedro-datasets

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Remove redundant 3.8 markers

Signed-off-by: lrcouto <laurarccouto@gmail.com>

---------

Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: L. R. Couto <57910428+lrcouto@users.noreply.github.com>
Signed-off-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>

* test(datasets): run doctests to check examples run (#416)

* chore(datasets):  lazily load `partitions` classes

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): run doctests to check examples run

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): keep running tests amidst failures

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): format ManagedTableDataset example

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): ignore breaking mods for doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* style(airflow): black code in Kedro-Airflow README

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): fix example syntax, and autoformat

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `kedro.extras.datasets` ref

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `>>> ` prefix for YAML code

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `kedro.extras.datasets` ref

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): replace `data_set`s with `dataset`s

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): run doctests separately

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* separate dataset-doctests

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* chore(datasets): ignore non-passing tests to make CI pass

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): fix comment location

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): fix .py.py

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): don't measure coverage on doctest run

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* build(datasets): fix windows and snowflake stuff in Makefile

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Co-authored-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): Add support for `databricks-connect>=13.0` (#352)

Signed-off-by: Miguel Rodriguez Gutierrez <miguel7r@hotmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(telemetry): remove double execution by moving to after catalog created hook (#422)

* remove double execution by moving to after catalog created hook

Signed-off-by: Florian Roessler <roessler.fd@gmail.com>

* update release notes

Signed-off-by: Florian Roessler <roessler.fd@gmail.com>

* fix tests

Signed-off-by: Florian Roessler <roessler.fd@gmail.com>

* remove unused fixture

Signed-off-by: Florian Roessler <roessler.fd@gmail.com>

---------

Signed-off-by: Florian Roessler <roessler.fd@gmail.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs: Add python version support policy to plugin `README.md`s (#425)

* Add python version support policy to plugin readmes

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Temporarily pin connexion

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

---------

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(airflow): Use new docs link (#393)

Use new docs link

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* style: Add shared CSS and meganav to datasets docs (#400)

* Add shared CSS and meganav

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Add end of file

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Add new heap data source

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* adjust heap parameter

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Remove nav_version next to Kedro logo in top left; add Kedro logo

* Revise project name and author name

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Use full kedro icon and type for logo

* Add close btn to mobile nav

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Add css for mobile nav logo image

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Update close button for mobile nav

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Add open button to mobile nav

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Delete kedro-datasets/docs/source/kedro-horizontal-color-on-light.svg

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Update conf.py

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Update layout.html

Add links to subprojects

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Remove svg from docs -- not needed??

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* linter error fix

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

---------

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>
Co-authored-by: Tynan DeBold <thdebold@gmail.com>
Co-authored-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): Add Hugging Face datasets (#344)

* Add HuggingFace datasets

Co-authored-by: Danny Farah <danny_farah@mckinsey.com>
Co-authored-by: Kevin Koga <Kevin_Koga@mckinsey.com>
Co-authored-by: Mate Scharnitzky <Mate_Scharnitzky@mckinsey.com>
Co-authored-by: Tomer Shor <Tomer_Shor@mckinsey.com>
Co-authored-by: Pierre-Yves Mousset <Pierre-Yves_Mousset@mckinsey.com>
Co-authored-by: Bela Chupal <Bela_chuphal@mckinsey.com>
Co-authored-by: Khangjrakpam Arjun <Khangjrakpam_Arjun@mckinsey.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Apply suggestions from code review

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>

* Typo

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Fix docstring

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add docstring for HFTransformerPipelineDataset

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Use intersphinx for cross references in Hugging Face docstrings

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add docstring for HFDataset

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add missing test dependencies

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add tests for huggingface datasets

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Fix HFDataset.save

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add test for HFDataset.list_datasets

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Use new name

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Consolidate imports

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

---------

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Co-authored-by: Danny Farah <danny_farah@mckinsey.com>
Co-authored-by: Kevin Koga <Kevin_Koga@mckinsey.com>
Co-authored-by: Mate Scharnitzky <Mate_Scharnitzky@mckinsey.com>
Co-authored-by: Tomer Shor <Tomer_Shor@mckinsey.com>
Co-authored-by: Pierre-Yves Mousset <Pierre-Yves_Mousset@mckinsey.com>
Co-authored-by: Bela Chupal <Bela_chuphal@mckinsey.com>
Co-authored-by: Khangjrakpam Arjun <Khangjrakpam_Arjun@mckinsey.com>
Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): fix `dask.ParquetDataset` doctests (#439)

* test(datasets): fix `dask.ParquetDataset` doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): use `tmp_path` fixture in doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): simplify by not passing the schema

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): ignore conftest for doctests cover

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Create MANIFEST.in

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* refactor: Remove `DataSet` aliases and mentions (#440)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* chore(datasets): replace "Pyspark" with "PySpark" (#423)

Consistently write "PySpark" rather than "Pyspark"

Also, fix list formatting

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): make `api.APIDataset` doctests run (#448)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): Fix `pandas.GenericDataset` doctest (#445)

Fix pandas.GenericDataset doctest

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): make datasets arguments keywords only (#358)

* feat(datasets): make `APIDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `BioSequenceDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `ParquetDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `EmailMessageDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `GeoJSONDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `HoloviewsWriter.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `JSONDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `MatplotlibWriter.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `GMLDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `GraphMLDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make NetworkX `JSONDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `PickleDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `ImageDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make plotly `JSONDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `PlotlyDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make polars `CSVDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make polars `GenericDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make redis `PickleDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SnowparkTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SVMLightDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `TensorFlowModelDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `TextDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `YAMLDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `ManagedTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `VideoDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `CSVDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `DeltaTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `ExcelDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `FeatherDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `GBQTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `GenericDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make pandas `JSONDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make pandas `ParquetDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SQLTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `XMLDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `HDFDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `DeltaTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SparkDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SparkHiveDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SparkJDBCDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SparkStreamingDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `IncrementalDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `LazyPolarsDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* docs(datasets): update doctests for HoloviewsWriter

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* Update release notes

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

---------

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Co-authored-by: Felix Scherz <felixwscherz@gmail.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>
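
For readers unfamiliar with the term, a keyword-only `__init__` can be sketched as below. `ExampleDataset` and its parameters are hypothetical and only illustrate the Python mechanism (a bare `*` in the signature); they are not copied from any of the dataset classes listed above.

```python
class ExampleDataset:
    """Hypothetical dataset class; not part of kedro-datasets."""

    def __init__(self, *, filepath, load_args=None, save_args=None):
        # The bare `*` makes every parameter after it keyword-only.
        self.filepath = filepath
        self.load_args = dict(load_args or {})
        self.save_args = dict(save_args or {})


# Keyword usage works as before.
ds = ExampleDataset(filepath="data/01_raw/cars.csv", load_args={"sep": ","})

# Positional usage is rejected with a TypeError.
try:
    ExampleDataset("data/01_raw/cars.csv")
except TypeError as exc:
    positional_error = str(exc)
```

Code that constructs datasets positionally would need to switch to keyword arguments after such a change, which is presumably why it is recorded in the release notes.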

* chore: Drop support for python 3.8 on kedro-datasets (#442)

* Drop support for python 3.8 on kedro-datasets

---------

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Signed-off-by: Dmitry Sorokin <40151847+DimedS@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): add outputs to matplotlib doctests (#449)

* test(datasets): add outputs to matplotlib doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Update Makefile

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Reformat code example, line length is short enough

* Update kedro-datasets/kedro_datasets/matplotlib/matplotlib_writer.py

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): Fix more doctest issues  (#451)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): fix failing doctests in Windows CI (#457)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): fix accidental reference to NumPy (#450)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): don't pollute dev env in doctests (#452)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat: Add tools to heap event (#430)

* Add add-on data to heap event

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Move addons logic to _get_project_property

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Add condition for pyproject.toml

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Fix tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Fix tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* add tools to mock

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* lint

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Update tools test

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Add after_context_created tools test

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Update rename to tools

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update kedro-telemetry/tests/test_plugin.py

Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>

---------

Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Co-authored-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(datasets): install deps in single `pip install` (#454)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(datasets): Bump s3fs (#463)

* Use mocking for AWS responses

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Add change to release notes

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Update release notes

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Use pytest xfail instead of commenting out test

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

---------

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): make SQL dataset examples runnable (#455)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): correct pandas-gbq as py311 dependency (#460)

* update pandas-gbq dependency declaration

Signed-off-by: Onur Kuru <kuru.onur1@gmail.com>

* fix fmt

Signed-off-by: Onur Kuru <kuru.onur1@gmail.com>

---------

Signed-off-by: Onur Kuru <kuru.onur1@gmail.com>
Co-authored-by: Ahdra Merali <90615669+AhdraMeraliQB@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): Document `IncrementalDataset` (#468)

Document IncrementalDataset

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: Update datasets to be arguments keyword only (#466)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: Clean up code for old dataset syntax compatibility (#465)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: Update scikit-learn version (#469)

Update scikit-learn version

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): support versioning data partitions (#447)

* feat(datasets): support versioning data partitions

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Remove unused import

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): use keyword arguments when needed

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Apply suggestions from code review

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Update kedro-datasets/kedro_datasets/partitions/partitioned_dataset.py

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): Improve documentation index (#428)

Rework documentation index

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): update wrong docstring about `con` (#461)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(datasets): Release `2.0.0`  (#472)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(telemetry): Pin `PyYAML` (#474)

Pin PyYaml

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(telemetry): Release 0.3.1 (#475)

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): Fix broken links in README (#477)

Fix broken links in README

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): replace more "data_set" instances (#476)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): Fix doctests (#488)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): Fix delta + incremental dataset docstrings (#489)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(airflow): Post 0.19 cleanup (#478)

* bump version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Unbump version and clean test

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Split big test into smaller tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update conftest

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update conftest

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Fix coverage

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Try unpin airflow

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* remove datacatalog step

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Change node

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* update tasks test step

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Revert to older airflow and constraint pendulum

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update template

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update message in e2e step

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Final cleanup

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update kedro-airflow/pyproject.toml

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>

* Pin apache-airflow again

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(airflow): Release 0.8.0 (#491)

Bump version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix: telemetry metadata (#495)

---------

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix: Update tests on kedro-docker for 0.5.0 release. (#496)

* bump version to 0.5.0

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* bump version to 0.5.0

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update e2e tests to use new starters

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Lint

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update e2e tests to use new starters

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* fix test path for e2e tests

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* fix requirements path on dockerfiles

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Remove redundant test

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Alter test for custom GID and UID

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Update release notes

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Revert version bump to put it in a separate PR

Signed-off-by: lrcouto <laurarccouto@gmail.com>

---------

Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: L. R. Couto <57910428+lrcouto@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build: Release kedro-docker 0.5.0 (#497)

* bump version to 0.5.0

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* bump version to 0.5.0

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update e2e tests to use new starters

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Lint

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update e2e tests to use new starters

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* fix test path for e2e tests

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* fix requirements path on dockerfiles

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Remove redundant test

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Alter test for custom GID and UID

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Update release notes

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Revert version bump to put it in a separate PR

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Bump kedro-docker to 0.5.0

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Add release notes

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Update kedro-docker/RELEASE.md

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: L. R. Couto <57910428+lrcouto@users.noreply.github.com>

---------

Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: L. R. Couto <57910428+lrcouto@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): Update partitioned dataset docstring (#502)

Update partitioned dataset docstring

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* Fix GeotiffDataset import + casing

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Fix lint

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Relax pandas.HDFDataSet dependencies which are broken on Windows (#426)

* Relax pandas.HDFDataSet dependencies which are broken on Windows (#402)

Signed-off-by: Yolan Honoré-Rougé <yolan.honore.rouge@gmail.com>

* Update RELEASE.md

Signed-off-by: Yolan Honoré-Rougé <yolan.honore.rouge@gmail.com>

* Apply suggestions from code review

Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>

* Update kedro-datasets/setup.py

Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>

---------

Signed-off-by: Yolan Honoré-Rougé <yolan.honore.rouge@gmail.com>
Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix: airflow metadata (#498)

* Add example pipeline entry to metadata declaration

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Fix entry

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Make entries consistent

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Add tools to config

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* fix: telemetry metadata (#495)

---------

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Revert "Add tools to config"

This reverts commit 14732d772a3c2f4787063071a68fdf1512c93488.

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Quick fix

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Lint

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Remove outdated config key

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Use kedro new instead of cookiecutter

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

---------

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Co-authored-by: Dmitry Sorokin <40151847+DimedS@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(airflow): Bump `apache-airflow` version (#511)

* Bump apache airflow

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Change starter

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e test steps

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e test steps

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(datasets): Unpin dask (#522)

* Unpin dask

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update doctest

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update doctest

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update kedro-datasets/setup.py

Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): Add `MatlabDataset` to `kedro-datasets` (#515)

* Refork and commit kedro matlab datasets

Signed-off-by: samuelleeshemen <samuel_lee_sj@aiap.sg>

* Fix lint, add to docs

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Try fixing docstring

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Try fixing save

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Try fixing doctest

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Fix unit tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update release notes

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Not hardcode load mode

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: samuelleeshemen <samuel_lee_sj@aiap.sg>
Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Co-authored-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(airflow): Pin `Flask-Session` version (#521)

* Restrict pendulum version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update airflow init step

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Remove pendulum pin

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update create connections step

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Pin flask session

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Add comment

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat: `kedro-airflow` group in memory nodes (#241)

* feat: option to group in-memory nodes

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* fix: MemoryDataset

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* Update kedro-airflow/README.md

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/README.md

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/README.md

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/RELEASE.md

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/kedro_airflow/grouping.py

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/kedro_airflow/plugin.py

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/tests/test_node_grouping.py

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/tests/test_node_grouping.py

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/kedro_airflow/grouping.py

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/kedro_airflow/grouping.py

Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* fix: tests

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* Bump minimum kedro version

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* fixes

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

---------

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(datasets): Update pyproject.toml to pin Kedro 0.19 for kedro-datasets (#526)

Update pyproject.toml

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(airflow): include environment name in DAG filename (#492)

* feat: include environment name in DAG file

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* doc: add update to release notes

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

---------

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>
Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): Enable search-as-you type on Kedro-datasets docs (#532)

* done

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix lint

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

---------

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Debug and fix `kedro-datasets` nightly build failures (#541)

* pin deltalake

* Update kedro-datasets/setup.py

Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space>

* Update setup.py

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* sort order and compare

* Update setup.py

* lint

* pin deltalake

* add comment to pin

---------

Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space>
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): Dataset Preview Refactor (#504)

* test

* done

* change from _preview to preview

* fix lint and tests

* added docstrings

* rtd fix

* rtd fix

* fix rtd

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix rtd

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix rtd - pls

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* add nitpick ignore

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* test again

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* move tracking datasets to constant

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* remove comma

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* remove Newtype from json_dataset

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* pls work

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* confirm rtd works locally

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* juanlu's fix

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix tests

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* remove unnecessary stuff from conf.py

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fixes based on review

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* changes based on review

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix tests

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* add suffix Preview

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* change img return type to bytes

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix tests

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* update release note

* fix lint

---------

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>
Co-authored-by: ravi-kumar-pilla <ravi_kumar_pilla@mckinsey.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Drop pyarrow constraint when using snowpark (#538)

* Free pyarrow req

Signed-off-by: Felipe Monroy <felipe.m02@gmail.com>

* Free pyarrow req

Signed-off-by: Felipe Monroy <felipe.m02@gmail.com>

---------

Signed-off-by: Felipe Monroy <felipe.m02@gmail.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs: Update kedro-telemetry docs on which data is collected (#546)

* Update data being collected

---------

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Signed-off-by: Dmitry Sorokin <40151847+DimedS@users.noreply.github.com>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(docker): Trying to fix e2e tests (#548)

* Pin psutil

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Add no capture to test

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update pip version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* U…
merelcht added a commit to galenseilis/kedro-plugins that referenced this pull request Aug 27, 2024
* refactor(datasets): deprecate "DataSet" type names (#328)

* refactor(datasets): deprecate "DataSet" type names (api)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (biosequence)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (dask)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (databricks)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (email)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (geopandas)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (holoviews)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (json)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (matplotlib)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (networkx)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.csv_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.deltatable_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.excel_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.feather_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.gbq_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.generic_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.hdf_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.json_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.parquet_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.sql_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pandas.xml_dataset)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pickle)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (pillow)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (plotly)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (polars)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (redis)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (snowflake)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (spark)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (svmlight)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (tensorflow)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (text)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (tracking)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (video)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): deprecate "DataSet" type names (yaml)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): ignore TensorFlow coverage issues

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added basic code for geotiff

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* renamed to xarray

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* renamed to xarray

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added load and self args

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* only local files

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added empty test

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added test data

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added rioxarray requirements

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* reformat with black

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* rioxarray 0.14

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* rioxarray 0.15

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* rioxarray 0.12

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* rioxarray 0.9

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fixed dataset typo

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fixed docstring for sphinx

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* run black

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* sort imports

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* class docstring

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* black

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fixed pylint

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added release notes

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added yaml example

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* improve testing WIP

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* basic test success

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test reloaded

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test exists

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added version

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* basic test suite

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* run black

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added example and test it

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* deleted duplications

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fixed position of example

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* black

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* style: Introduce `ruff` for linting in all plugins. (#354)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* feat(datasets): create custom `DeprecationWarning` (#356)

* feat(datasets): create custom `DeprecationWarning`

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* feat(datasets): use the custom deprecation warning

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): show Kedro's deprecation warnings

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* fix(datasets): remove unused imports in test files

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): add note about DataSet deprecation (#357)

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): skip `tensorflow` tests on Windows (#363)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci: Pin `tables` version (#370)

* Pin tables version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Also fix kedro-airflow

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Revert trying to fix airflow

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(datasets): Release `1.7.1` (#378)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs: Update CONTRIBUTING.md and add one for `kedro-datasets` (#379)

Update CONTRIBUTING.md + add one for kedro-datasets

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(datasets): Run tensorflow tests separately from other dataset tests (#377)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat: Kedro-Airflow convert all pipelines option (#335)

* feat: kedro airflow convert --all option

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* docs: release docs

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

---------

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): blacken code in rst literal blocks (#362)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs: cloudpickle is an interesting extension of the pickle functionality (#361)

Signed-off-by: H. Felix Wittmann <hfwittmann@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Fix secret scan entropy error (#383)

Fix secret scan entropy error

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* style: Rename mentions of `DataSet` to `Dataset` in `kedro-airflow` and `kedro-telemetry` (#384)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): Migrated `PartitionedDataSet` and `IncrementalDataSet` from main repository to kedro-datasets (#253)

Signed-off-by: Peter Bludau <ptrbld.dev@gmail.com>
Co-authored-by: Merel Theisen <merel.theisen@quantumblack.com>

* fix: backwards compatibility for `kedro-airflow` (#381)

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* added metadata

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* after linting

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ignore ruff PLR0913

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Don't warn for SparkDataset on Databricks when using s3 (#341)

Signed-off-by: Alistair McKelvie <alistair.mckelvie@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: Hot fix for RTD due to bad pip version (#396)

fix RTD

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: Pin pip version temporarily (#398)

* Pin pip version temporarily

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Hive support failures

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Also pin pip on lint

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Temporary ignore databricks spark tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* perf(datasets): don't create connection until need (#281)

* perf(datasets): delay `Engine` creation until need

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore: don't check coverage in TYPE_CHECKING block

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* perf(datasets): don't connect in `__init__` method

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): fix tests to touch `create_engine`

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* perf(datasets): don't connect in `__init__` method

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* style(datasets): exec Ruff on sql_dataset.py :dog:

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Undo changes to `engines` values type (for Sphinx)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Patch Sphinx build by removing `Engine` references

* perf(datasets): don't connect in `__init__` method

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): don't require coverage for import

* chore(datasets): del unused `TYPE_CHECKING` import

* docs(datasets): document lazy connection in README

* perf(datasets): remove create in `SQLQueryDataset`

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): do not return the created conn

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
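
For illustration only, the idea behind deferring connection creation (#281) can be sketched as follows. This is not the code from that PR: `LazySQLDataset` is a hypothetical class, and the standard-library `sqlite3` module stands in for the real engine. The point is that `__init__` only stores configuration, and the connection is opened on first use.

```python
import sqlite3
from functools import cached_property


class LazySQLDataset:
    """Hypothetical dataset that defers opening its connection (illustrative only)."""

    def __init__(self, *, database: str, table: str) -> None:
        # Only configuration is stored here; no connection is opened yet.
        self._database = database
        self._table = table

    @cached_property
    def connection(self) -> sqlite3.Connection:
        # Opened on first access, then cached on the instance.
        return sqlite3.connect(self._database)

    def load(self) -> list[tuple]:
        # Table name comes from trusted configuration in this sketch.
        return self.connection.execute(f"SELECT * FROM {self._table}").fetchall()


dataset = LazySQLDataset(database=":memory:", table="t")
opened_at_init = "connection" in vars(dataset)  # False: nothing opened yet

dataset.connection.execute("CREATE TABLE t (x INTEGER)")
dataset.connection.execute("INSERT INTO t VALUES (1)")
rows = dataset.load()
```

With this shape, constructing many dataset objects (for example when a whole catalog is instantiated) does not open any connections until one is actually needed.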

* chore: Drop Python 3.7 support for kedro-plugins (#392)

* Remove references to Python 3.7

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Revert kedro-dataset changes

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Revert kedro-dataset changes

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Add information to release docs

Signed-off-by: lrcouto <laurarccouto@gmail.com>

---------

Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): support Polars lazy evaluation (#350)

* feat(datasets) add PolarsDataset to support Polars's Lazy API

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* Fix(datasets): rename PolarsDataSet to PolarsDataset

Add PolarsDataSet as an alias for PolarsDataset with
deprecation warning.

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* Fix(datasets): apply ruff linting rules

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* Fix(datasets): Correct pattern matching when Raising exceptions

Corrected PolarsDataSet to PolarsDataset in the pattern to match
in test_load_missing_file

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* fix(datasets): clean up PolarsDataset related code

Remove reference to PolarsDataSet as this is not required for new
dataset implementations.

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* feat(datasets): Rename Polars Datasets to better describe their intent

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* feat(datasets): clean up LazyPolarsDataset

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* fix(datasets): increase test coverage for PolarsDataset classes

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* docs(datasets): add renamed Polars datasets to docs

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* docs(datasets): Add new polars datasets to release notes

Signed-off-by: Matthias Roels <mroels2@its.jnj.com>

* fix(datasets): load_args not properly passed to LazyPolarsDataset.load

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

* docs(datasets): fix spelling error in release notes

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>

---------

Signed-off-by: Matthias Roels <matthias.roels21@gmail.com>
Signed-off-by: Matthias Roels <mroels2@its.jnj.com>
Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: Matthias Roels <mroels2@its.jnj.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(datasets): Release `1.8.0` (#406)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(airflow): Release 0.7.0 (#407)

* bump version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update release notes

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(telemetry): Release 0.3.0 (#408)

Bump version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(docker): Release 0.4.0 (#409)

Bump version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* style(airflow): blacken README.md of Kedro-Airflow (#418)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Fix missing jQuery (#414)

Fix missing jQuery

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Fix Lazy Polars dataset to use the new-style base class (#413)

* Fix Lazy Polars dataset to use the new-style base class

Fix gh-412

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Update release notes

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Revert "Update release notes"

This reverts commit 92ceea6d8fa412abf3d8abd28a2f0a22353867ed.

---------

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets):  lazily load `partitions` classes (#411)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): fix code blocks and `data_set` use (#417)

* chore(datasets):  lazily load `partitions` classes

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): run doctests to check examples run

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): keep running tests amidst failures

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): format ManagedTableDataset example

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): ignore breaking mods for doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* style(airflow): black code in Kedro-Airflow README

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): fix example syntax, and autoformat

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `kedro.extras.datasets` ref

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `>>> ` prefix for YAML code

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `kedro.extras.datasets` ref

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): replace `data_set`s with `dataset`s

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): undo changes for running doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* revert(datasets):  undo lazily load `partitions` classes

Refs: 3fdc5a8efa034fa9a18b7683a942415915b42fb5
Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* revert(airflow): undo black code in Kedro-Airflow README

Refs: dc3476ea36bac98e2adcc0b52a11b0f90001e31d

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix: TF model load failure when model is saved as a TensorFlow Saved Model format (#410)

* fixes TF model load failure when model is saved as a TensorFlow Saved Model format

When a model is saved in the TensorFlow SavedModel format (the "tf" default option in tf.save_model when using TF 2.x) via the catalog.xml file, loading that model in a subsequent node fails. The model files are not copied into the temporary folder, presumably because the _fs.get function "thinks" that the provided path is a file and not a folder. Adding a terminating "/" to the path fixes the issue.

Signed-off-by: Edouard59 <68538605+Edouard59@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>
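
As a rough sketch of the fix described in #410, a path can be normalised so that it always ends with a single "/" before it is handed to a recursive copy. The helper name `as_directory_path` is hypothetical and is not part of kedro-datasets. Whether a given filesystem implementation treats the two forms differently is taken from the commit message above, not verified here.

```python
def as_directory_path(path: str) -> str:
    """Return ``path`` with exactly one trailing "/" so it reads as a directory."""
    # Strip any existing trailing slashes first to avoid producing "//".
    return path.rstrip("/") + "/"


saved_model_dir = as_directory_path("data/06_models/my_model")
already_terminated = as_directory_path("data/06_models/my_model/")
```

Both calls yield the same directory-style path, so the caller does not need to know whether the configured path already ended with a slash.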

* chore: Drop support for Python 3.7 on kedro-datasets (#419)

* Drop support for Python 3.7 on kedro-datasets

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Remove redundant 3.8 markers

Signed-off-by: lrcouto <laurarccouto@gmail.com>

---------

Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: L. R. Couto <57910428+lrcouto@users.noreply.github.com>
Signed-off-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>

* test(datasets): run doctests to check examples run (#416)

* chore(datasets):  lazily load `partitions` classes

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): run doctests to check examples run

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): keep running tests amidst failures

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): format ManagedTableDataset example

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): ignore breaking mods for doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* style(airflow): black code in Kedro-Airflow README

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): fix example syntax, and autoformat

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `kedro.extras.datasets` ref

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `>>> ` prefix for YAML code

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): remove `kedro.extras.datasets` ref

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* docs(datasets): replace `data_set`s with `dataset`s

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* refactor(datasets): run doctests separately

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* separate dataset-doctests

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* chore(datasets): ignore non-passing tests to make CI pass

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): fix comment location

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): fix .py.py

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): don't measure coverage on doctest run

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* build(datasets): fix windows and snowflake stuff in Makefile

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Co-authored-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>
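
To illustrate what "running doctests to check examples run" (#416) means in practice, here is a minimal sketch using only the standard-library `doctest` module. The function `double` is a made-up example, and the actual CI setup in that PR may invoke doctests differently (for instance through pytest).

```python
import doctest


def double(values):
    """Double every item in a list.

    >>> double([1, 2, 3])
    [2, 4, 6]
    """
    return [2 * value for value in values]


# Parse the docstring into a DocTest and execute its examples.
parser = doctest.DocTestParser()
test = parser.get_doctest(double.__doc__, {"double": double}, "double", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)  # TestResults(failed=..., attempted=...)
```

If the docstring example drifted from the real behaviour, `results.failed` would be non-zero, which is how stale documentation examples get caught.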

* feat(datasets): Add support for `databricks-connect>=13.0` (#352)

Signed-off-by: Miguel Rodriguez Gutierrez <miguel7r@hotmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(telemetry): remove double execution by moving to after catalog created hook (#422)

* remove double execution by moving to after catalog created hook

Signed-off-by: Florian Roessler <roessler.fd@gmail.com>

* update release notes

Signed-off-by: Florian Roessler <roessler.fd@gmail.com>

* fix tests

Signed-off-by: Florian Roessler <roessler.fd@gmail.com>

* remove unused fixture

Signed-off-by: Florian Roessler <roessler.fd@gmail.com>

---------

Signed-off-by: Florian Roessler <roessler.fd@gmail.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs: Add python version support policy to plugin `README.md`s (#425)

* Add python version support policy to plugin readmes

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Temporarily pin connexion

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

---------

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(airflow): Use new docs link (#393)

Use new docs link

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* style: Add shared CSS and meganav to datasets docs (#400)

* Add shared CSS and meganav

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Add end of file

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Add new heap data source

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* adjust heap parameter

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Remove nav_version next to Kedro logo in top left; add Kedro logo

* Revise project name and author name

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Use full kedro icon and type for logo

* Add close btn to mobile nav

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Add css for mobile nav logo image

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Update close button for mobile nav

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Add open button to mobile nav

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Delete kedro-datasets/docs/source/kedro-horizontal-color-on-light.svg

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Update conf.py

Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>

* Update layout.html

Add links to subprojects

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* Remove svg from docs -- not needed??

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

* linter error fix

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>

---------

Signed-off-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Signed-off-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>
Co-authored-by: Tynan DeBold <thdebold@gmail.com>
Co-authored-by: vladimir-mck <106236933+vladimir-mck@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): Add Hugging Face datasets (#344)

* Add HuggingFace datasets

Co-authored-by: Danny Farah <danny_farah@mckinsey.com>
Co-authored-by: Kevin Koga <Kevin_Koga@mckinsey.com>
Co-authored-by: Mate Scharnitzky <Mate_Scharnitzky@mckinsey.com>
Co-authored-by: Tomer Shor <Tomer_Shor@mckinsey.com>
Co-authored-by: Pierre-Yves Mousset <Pierre-Yves_Mousset@mckinsey.com>
Co-authored-by: Bela Chupal <Bela_chuphal@mckinsey.com>
Co-authored-by: Khangjrakpam Arjun <Khangjrakpam_Arjun@mckinsey.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Apply suggestions from code review

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>

* Typo

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Fix docstring

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add docstring for HFTransformerPipelineDataset

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Use intersphinx for cross references in Hugging Face docstrings

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add docstring for HFDataset

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add missing test dependencies

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add tests for huggingface datasets

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Fix HFDataset.save

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Add test for HFDataset.list_datasets

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Use new name

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Consolidate imports

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

---------

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Co-authored-by: Danny Farah <danny_farah@mckinsey.com>
Co-authored-by: Kevin Koga <Kevin_Koga@mckinsey.com>
Co-authored-by: Mate Scharnitzky <Mate_Scharnitzky@mckinsey.com>
Co-authored-by: Tomer Shor <Tomer_Shor@mckinsey.com>
Co-authored-by: Pierre-Yves Mousset <Pierre-Yves_Mousset@mckinsey.com>
Co-authored-by: Bela Chupal <Bela_chuphal@mckinsey.com>
Co-authored-by: Khangjrakpam Arjun <Khangjrakpam_Arjun@mckinsey.com>
Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): fix `dask.ParquetDataset` doctests (#439)

* test(datasets): fix `dask.ParquetDataset` doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): use `tmp_path` fixture in doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): simplify by not passing the schema

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* test(datasets): ignore conftest for doctests cover

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Create MANIFEST.in

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* refactor: Remove `DataSet` aliases and mentions (#440)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* chore(datasets): replace "Pyspark" with "PySpark" (#423)

Consistently write "PySpark" rather than "Pyspark"

Also, fix list formatting

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): make `api.APIDataset` doctests run (#448)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): Fix `pandas.GenericDataset` doctest (#445)

Fix pandas.GenericDataset doctest

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): make datasets arguments keywords only (#358)

* feat(datasets): make `APIDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `BioSequenceDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `ParquetDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `EmailMessageDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `GeoJSONDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `HoloviewsWriter.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `JSONDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `MatplotlibWriter.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `GMLDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `GraphMLDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make NetworkX `JSONDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `PickleDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `ImageDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make plotly `JSONDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `PlotlyDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make polars `CSVDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make polars `GenericDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make redis `PickleDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SnowparkTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SVMLightDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `TensorFlowModelDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `TextDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `YAMLDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `ManagedTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `VideoDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `CSVDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `DeltaTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `ExcelDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `FeatherDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `GBQTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `GenericDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make pandas `JSONDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make pandas `ParquetDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SQLTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `XMLDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `HDFDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `DeltaTableDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SparkDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SparkHiveDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SparkJDBCDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `SparkStreamingDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `IncrementalDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* feat(datasets): make `LazyPolarsDataset.__init__` keyword only

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* docs(datasets): update doctests for HoloviewsWriter

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>

* Update release notes

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

---------

Signed-off-by: Felix Scherz <felixwscherz@gmail.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Co-authored-by: Felix Scherz <felixwscherz@gmail.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>
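
The keyword-only change in #358 relies on Python's bare `*` in a signature. A minimal sketch, assuming a hypothetical `ExampleDataset` class (not one of the real datasets listed above), shows the effect: positional calls are rejected with a `TypeError`, while keyword calls work as before.

```python
from typing import Any, Optional


class ExampleDataset:
    """Hypothetical dataset whose constructor accepts keyword arguments only."""

    def __init__(
        self, *, filepath: str, load_args: Optional[dict[str, Any]] = None
    ) -> None:
        # Everything after the bare "*" must be passed by name.
        self.filepath = filepath
        self.load_args = load_args or {}


dataset = ExampleDataset(filepath="data/example.csv", load_args={"sep": ","})

positional_error = None
try:
    ExampleDataset("data/example.csv")  # positional use is no longer allowed
except TypeError as error:
    positional_error = error
```

The design benefit is that arguments can later be reordered or extended without silently changing the meaning of existing positional calls.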

* chore: Drop support for python 3.8 on kedro-datasets (#442)

* Drop support for python 3.8 on kedro-datasets

---------

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Signed-off-by: Dmitry Sorokin <40151847+DimedS@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): add outputs to matplotlib doctests (#449)

* test(datasets): add outputs to matplotlib doctests

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Update Makefile

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Reformat code example, line length is short enough

* Update kedro-datasets/kedro_datasets/matplotlib/matplotlib_writer.py

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): Fix more doctest issues  (#451)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): fix failing doctests in Windows CI (#457)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): fix accidental reference to NumPy (#450)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): don't pollute dev env in doctests (#452)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat: Add tools to heap event (#430)

* Add add-on data to heap event

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Move addons logic to _get_project_property

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Add condition for pyproject.toml

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Fix tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Fix tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* add tools to mock

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* lint

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Update tools test

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Add after_context_created tools test

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Update rename to tools

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update kedro-telemetry/tests/test_plugin.py

Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>

---------

Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Co-authored-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(datasets): install deps in single `pip install` (#454)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(datasets): Bump s3fs (#463)

* Use mocking for AWS responses

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Add change to release notes

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Update release notes

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Use pytest xfail instead of commenting out test

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

---------

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* test(datasets): make SQL dataset examples runnable (#455)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): correct pandas-gbq as py311 dependency (#460)

* update pandas-gbq dependency declaration

Signed-off-by: Onur Kuru <kuru.onur1@gmail.com>

* fix fmt

Signed-off-by: Onur Kuru <kuru.onur1@gmail.com>

---------

Signed-off-by: Onur Kuru <kuru.onur1@gmail.com>
Co-authored-by: Ahdra Merali <90615669+AhdraMeraliQB@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): Document `IncrementalDataset` (#468)

Document IncrementalDataset

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: Update datasets to be arguments keyword only (#466)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: Clean up code for old dataset syntax compatibility (#465)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: Update scikit-learn version (#469)

Update scikit-learn version

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): support versioning data partitions (#447)

* feat(datasets): support versioning data partitions

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Remove unused import

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* chore(datasets): use keyword arguments when needed

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Apply suggestions from code review

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Update kedro-datasets/kedro_datasets/partitions/partitioned_dataset.py

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

---------

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): Improve documentation index (#428)

Rework documentation index

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): update wrong docstring about `con` (#461)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(datasets): Release `2.0.0`  (#472)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(telemetry): Pin `PyYAML` (#474)

Pin PyYaml

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(telemetry): Release 0.3.1 (#475)

Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(datasets): Fix broken links in README (#477)

Fix broken links in README

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): replace more "data_set" instances (#476)

Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): Fix doctests (#488)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): Fix delta + incremental dataset docstrings (#489)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(airflow): Post 0.19 cleanup (#478)

* bump version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Unbump version and clean test

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Split big test into smaller tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update conftest

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update conftest

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Fix coverage

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Try unpin airflow

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* remove datacatalog step

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Change node

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* update tasks test step

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Revert to older airflow and constraint pendulum

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update template

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update message in e2e step

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Final cleanup

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update kedro-airflow/pyproject.toml

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>

* Pin apache-airflow again

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build(airflow): Release 0.8.0 (#491)

Bump version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix: telemetry metadata (#495)

---------

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix: Update tests on kedro-docker for 0.5.0 release. (#496)

* bump version to 0.5.0

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* bump version to 0.5.0

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update e2e tests to use new starters

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Lint

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update e2e tests to use new starters

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* fix test path for e2e tests

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* fix requirements path on dockerfiles

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Remove redundant test

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Alter test for custom GID and UID

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Update release notes

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Revert version bump to put it in a separate PR

Signed-off-by: lrcouto <laurarccouto@gmail.com>

---------

Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: L. R. Couto <57910428+lrcouto@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* build: Release kedro-docker 0.5.0 (#497)

* bump version to 0.5.0

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* bump version to 0.5.0

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update e2e tests to use new starters

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Lint

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update e2e tests to use new starters

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* fix test path for e2e tests

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* fix requirements path on dockerfiles

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* update tests to fit with current log format

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Remove redundant test

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Alter test for custom GID and UID

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Update release notes

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Revert version bump to put it in a separate PR

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Bump kedro-docker to 0.5.0

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Add release notes

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Update kedro-docker/RELEASE.md

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: L. R. Couto <57910428+lrcouto@users.noreply.github.com>

---------

Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: L. R. Couto <57910428+lrcouto@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(datasets): Update partitioned dataset docstring (#502)

Update partitioned dataset docstring

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* Fix GeotiffDataset import + casing

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Fix lint

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Relax pandas.HDFDataSet dependencies which are broken on Windows (#426)

* Relax pandas.HDFDataSet dependencies which are broken on Windows (#402)

Signed-off-by: Yolan Honoré-Rougé <yolan.honore.rouge@gmail.com>

* Update RELEASE.md

Signed-off-by: Yolan Honoré-Rougé <yolan.honore.rouge@gmail.com>

* Apply suggestions from code review

Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>

* Update kedro-datasets/setup.py

Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>

---------

Signed-off-by: Yolan Honoré-Rougé <yolan.honore.rouge@gmail.com>
Signed-off-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix: airflow metadata (#498)

* Add example pipeline entry to metadata declaration

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Fix entry

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Make entries consistent

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Add tools to config

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* fix: telemetry metadata (#495)

---------

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Revert "Add tools to config"

This reverts commit 14732d772a3c2f4787063071a68fdf1512c93488.

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Quick fix

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Lint

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Remove outdated config key

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Use kedro new instead of cookiecutter

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

---------

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Co-authored-by: Dmitry Sorokin <40151847+DimedS@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore(airflow): Bump `apache-airflow` version (#511)

* Bump apache airflow

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Change starter

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e test steps

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update e2e test steps

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(datasets): Unpin dask (#522)

* Unpin dask

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update doctest

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update doctest

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update kedro-datasets/setup.py

Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): Add `MatlabDataset` to `kedro-datasets` (#515)

* Refork and commit kedro matlab datasets

Signed-off-by: samuelleeshemen <samuel_lee_sj@aiap.sg>

* Fix lint, add to docs

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Try fixing docstring

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Try fixing save

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Try fix doctest

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Fix unit tests

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update release notes

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Not hardcode load mode

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: samuelleeshemen <samuel_lee_sj@aiap.sg>
Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Co-authored-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(airflow): Pin `Flask-Session` version (#521)

* Restrict pendulum version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update airflow init step

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Remove pendulum pin

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update create connections step

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Pin flask session

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Add comment

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat: `kedro-airflow` group in memory nodes (#241)

* feat: option to group in-memory nodes

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* fix: MemoryDataset

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* Update kedro-airflow/README.md

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/README.md

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/README.md

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/RELEASE.md

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/kedro_airflow/grouping.py

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/kedro_airflow/plugin.py

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/tests/test_node_grouping.py

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/tests/test_node_grouping.py

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/kedro_airflow/grouping.py

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* Update kedro-airflow/kedro_airflow/grouping.py

Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>

* fix: tests

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* Bump minimum kedro version

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* fixes

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

---------

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>
Signed-off-by: Simon Brugman <sbrugman@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(datasets): Update pyproject.toml to pin Kedro 0.19 for kedro-datasets (#526)

Update pyproject.toml

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(airflow): include environment name in DAG filename (#492)

* feat: include environment name in DAG file

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

* doc: add update to release notes

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>

---------

Signed-off-by: Simon Brugman <sfbbrugman@gmail.com>
Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): Enable search-as-you type on Kedro-datasets docs (#532)

* done

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix lint

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

---------

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Debug and fix `kedro-datasets` nightly build failures (#541)

* pin deltalake

* Update kedro-datasets/setup.py

Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space>

* Update setup.py

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* sort order and compare

* Update setup.py

* lint

* pin deltalake

* add comment to pin

---------

Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space>
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* feat(datasets): Dataset Preview Refactor  (#504)

* test

* done

* change from _preview to preview

* fix lint and tests

* added docstrings

* rtd fix

* rtd fix

* fix rtd

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix rtd

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix rtd - pls

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* add nitpick ignore

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* test again

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* move tracking datasets to constant

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* remove comma

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* remove Newtype from json_dataset

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* pls work

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* confirm rtd works locally

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* juanlu's fix

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix tests

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* remove unnecessary stuff from conf.py

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fixes based on review

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* changes based on review

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix tests

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* add suffix Preview

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* change img return type to bytes

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* fix tests

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>

* update release note

* fix lint

---------

Signed-off-by: rashidakanchwala <rashida_kanchwala@mckinsey.com>
Co-authored-by: ravi-kumar-pilla <ravi_kumar_pilla@mckinsey.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix(datasets): Drop pyarrow constraint when using snowpark (#538)

* Free pyarrow req

Signed-off-by: Felipe Monroy <felipe.m02@gmail.com>

* Free pyarrow req

Signed-off-by: Felipe Monroy <felipe.m02@gmail.com>

---------

Signed-off-by: Felipe Monroy <felipe.m02@gmail.com>
Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs: Update kedro-telemetry docs on which data is collected (#546)

* Update data being collected

---------

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Signed-off-by: Dmitry Sorokin <40151847+DimedS@users.noreply.github.com>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* ci(docker): Trying to fix e2e tests (#548)

* Pin psutil

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Add no capture to test

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update pip version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update call

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update pip

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* pip ruamel

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* change pip v

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* change pip v

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* show stdout

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* use no cache dir

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* revert extra changes

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* pin pip

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* gitpod

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* pip inside dockerfile

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* pip pip inside dockerfile

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* chore: bump actions versions (#539)

* Unpin pip and bump actions versions

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* remove version

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Revert unpinning of pip

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

---------

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* docs(telemetry): Direct readers to Kedro documentation for further information on telemetry (#555)

* Direct readers to Kedro documentation for further information on telemetry

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

* Wording improvements

Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space>

* Amend README section

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>

---------

Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space>
Co-authored-by: Jo Stichbury <jo_stichbury@mckinsey.com>
Signed-off-by: tgoelles <thomas.goelles@gmail.com>

* fix: kedro-telemetry masking (#552)

* Fix masking

Signed-off-by: Dmitr…
Development

Successfully merging this pull request may close these issues.

Support Polars lazy evaluation
4 participants