[kedro-datasets ] Add Polars.CSVDataSet
#95
I'm figuring out how to fix the failing unit test.
Right now all unit tests are passing; it's just the DCO check that's failing. You can find instructions on how to solve it here: https://github.com/kedro-org/kedro-plugins/pull/95/checks?check_run_id=10658463390
Thanks, @merelcht! My e-mail sign-off was wrong; I've committed the fix now. Do you think it's valid to finish this PR with only CSVDataSet?
@merelcht, I've followed the instructions and rewrote my sign-off email, but the DCO check is still failing for the same reason. I don't know what's wrong now.
Hmm, odd... sometimes it's very hard to get it working 😅 I can manually approve it when the PR is ready to merge. So don't worry about it.
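For readers hitting the same DCO failure: a common reason the check keeps failing after a fix is that only the latest commit was amended, while earlier commits in the PR still carry a missing or mismatched sign-off. The snippet below is a simplified sketch of that kind of check, not the actual DCO bot implementation. The function names, the sample commits, and the rule "the sign-off e-mail must equal the author e-mail" are assumptions made for illustration.

```python
import re

# Matches a trailer such as "Signed-off-by: Jane Doe <jane@example.com>".
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: (?P<name>.+) <(?P<email>[^<>]+)>$", re.MULTILINE
)


def has_valid_signoff(message: str, author_email: str) -> bool:
    """Return True if any sign-off trailer carries the author's e-mail."""
    return any(
        match.group("email").strip().lower() == author_email.lower()
        for match in SIGNOFF_RE.finditer(message)
    )


def failing_commits(commits):
    """Return the ids of commits whose sign-off is missing or mismatched.

    `commits` is an iterable of (commit_id, author_email, message) tuples.
    """
    return [
        commit_id
        for commit_id, author_email, message in commits
        if not has_valid_signoff(message, author_email)
    ]


# Hypothetical PR history: one good commit, one with an outdated e-mail
# in the trailer, and one with no trailer at all.
commits = [
    ("a1", "dev@example.com", "Add dataset\n\nSigned-off-by: Dev <dev@example.com>"),
    ("b2", "dev@example.com", "Fix tests\n\nSigned-off-by: Dev <old@example.com>"),
    ("c3", "dev@example.com", "Update docs"),
]

print(failing_commits(commits))  # ['b2', 'c3']
```

Under these assumptions, every commit listed by `failing_commits` would need its message rewritten, not just the most recent one.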
Yes, I think that's fine! Any additional dataset is valuable 🙂 In the future you (or another contributor) can then build more Polars datasets.
Polars.CSVDataSet
Thanks for the addition, @wmoreiraa!
I've added some comments, mainly around updating the docstring. Could you also add this change to the release notes? 🙂
Signed-off-by: wmoreiraa <walber3@gmail.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: wmoreiraa <walber3@gmail.com>
* Bump relax pyarrow version to work the same way as Pandas We only use PyArrow for `pandas.ParquetDataSet` as such I suggest we keep our versions pinned to the same range as [Pandas does](https://github.com/pandas-dev/pandas/blob/96fc51f5ec678394373e2c779ccff37ddb966e75/pyproject.toml#L100) for the same reason. As such I suggest we remove the upper bound as we have users requesting later versions in [support channels](https://kedro-org.slack.com/archives/C03RKP2LW64/p1674040509133529) * Updated release notes Signed-off-by: wmoreiraa <walber3@gmail.com>
I think everything's fine now? @merelcht
Thanks for the updates, @wmoreiraa. I left a couple more questions, but most importantly you shouldn't bump the version of kedro-datasets in this PR. We'll do that when we do a release.
Signed-off-by: wmoreiraa <walber3@gmail.com>
Thanks for this contribution @wmoreiraa 😄 🎉
Add dataset headers Signed-off by: wmoreiraa <walber3@gmail.com> Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Just the first one! I've been impacted by layoffs, and now I'm job hunting full time and available, so I might as well write a LinkedIn post showcasing Polars + Kedro.
I'm so sorry to hear that! Good luck with the job hunting 🍀 Your contributions to Kedro are very much appreciated ❤️
Good luck @wmoreiraa
Thanks for the contribution @wmoreiraa! 🌟 Happy to approve with just the minor change noted.
I love this and want to see it in Kedro as soon as possible. My only question is if we shouldn't approach this on a file type by file type basis and should in fact approach this the same way we do At the time of writing the following
In terms of write targets the DataFrame class only supports a subset of these:
Is there any merit in abstracting this into a generic approach?
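To make the question concrete, here is a minimal sketch of what a generic approach could look like: instead of one dataset class per file type, the reader and writer are looked up by name from a `file_format` string. This is an illustration only, not the design that was later implemented. `resolve_reader`, `resolve_writer`, and the stand-in objects are hypothetical names; with the real library one would pass the `polars` module itself, whose `read_csv` / `read_parquet` functions and `DataFrame.write_csv` / `DataFrame.write_parquet` methods follow this naming pattern.

```python
from types import SimpleNamespace
from typing import Any, Callable


def resolve_reader(library: Any, file_format: str) -> Callable[..., Any]:
    """Look up `library.read_<file_format>` and fail clearly if it is missing."""
    reader = getattr(library, f"read_{file_format}", None)
    if not callable(reader):
        raise ValueError(f"Unsupported file format for loading: {file_format!r}")
    return reader


def resolve_writer(frame: Any, file_format: str) -> Callable[..., Any]:
    """Look up `frame.write_<file_format>`; writers may cover fewer formats."""
    writer = getattr(frame, f"write_{file_format}", None)
    if not callable(writer):
        raise ValueError(f"Unsupported file format for saving: {file_format!r}")
    return writer


# Stand-ins so the sketch runs without any third-party install.
# In practice `fake_library` would be the imported `polars` module and
# `fake_frame` a `polars.DataFrame`.
fake_library = SimpleNamespace(
    read_csv=lambda path, **kwargs: f"csv:{path}",
    read_parquet=lambda path, **kwargs: f"parquet:{path}",
)
fake_frame = SimpleNamespace(write_csv=lambda path, **kwargs: f"wrote:{path}")

load = resolve_reader(fake_library, "csv")
print(load("data.csv"))  # csv:data.csv

save = resolve_writer(fake_frame, "csv")
print(save("out.csv"))  # wrote:out.csv
```

Because readers and writers are resolved separately, a format that can be read but not written surfaces as a clear `ValueError` at save time, which matches the observation above that the write side supports only a subset of the read formats.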
Thank you for the fix, @AhdraMeraliQB!
I'd suggest getting this merged in first since it's ready now and then look at a generic approach later.
@merelcht @datajoely, I've finished the test cases for the generic approach. Should I open a second PR then?
As Merel said, let's get this one in and then do a follow-up?
Fine by me, I'll just wait and then fork again.
Thank you! I've just resolved some minor merge conflicts and will merge this PR as soon as the builds finish successfully 🙂
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: wmoreiraa <walber3@gmail.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com>
* [kedro-docker] Layers size optimization (#92) * [kedro-docker] Layers size optimization Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Adjust test requirements Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Skip coverage check on tests dir (some do not execute on Windows) Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Update .coveragerc with the setup Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Fix bandit so it does not scan kedro-datasets Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Fixed existence test Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Check why dir is not created Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Kedro starters are fixed now Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Increased no-output-timeout for long spark image build Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> * Spark image optimized Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Linting Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Switch to slim image always Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Trigger build Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Use textwrap.dedent for nicer indentation Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Revert "Use textwrap.dedent for nicer indentation" This reverts commit 3a1e3f8. Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Revert "Revert "Use textwrap.dedent for nicer indentation"" This reverts commit d322d35. 
Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> * Make tests read more lines (to skip all deprecation warnings) Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Release Kedro-Docker 0.3.1 (#94) * Add release notes for kedro-docker 0.3.1 Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update version in kedro_docker module Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Bump version and update release notes (#96) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Make the SQLQueryDataSet compatible with mssql. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add one test + update RELEASE.md. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add missing pyodbc for tests. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Mock connection as well. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add more dates parsing for mssql backend (thanks to fgaudindelrieu@idmog.com) Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Fix an error in docstring of MetricsDataSet (#98) Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Bump relax pyarrow version to work the same way as Pandas (#100) * Bump relax pyarrow version to work the same way as Pandas We only use PyArrow for `pandas.ParquetDataSet` as such I suggest we keep our versions pinned to the same range as [Pandas does](https://github.com/pandas-dev/pandas/blob/96fc51f5ec678394373e2c779ccff37ddb966e75/pyproject.toml#L100) for the same reason. 
As such I suggest we remove the upper bound as we have users requesting later versions in [support channels](https://kedro-org.slack.com/archives/C03RKP2LW64/p1674040509133529) * Updated release notes Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add missing type in catalog example. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add one more unit tests for adapt_mssql. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * [FIX] Add missing mocker from date test. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * [TEST] Add a wrong input test. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Add pyodbc dependency. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * [FIX] Remove dict() in tests. Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Change check to check on plugin name (#103) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Set coverage in pyproject.toml (#105) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Move coverage settings to pyproject.toml (#106) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Replace kedro.pipeline with modular_pipeline.pipeline factory (#99) * Add non-spark related test changes Replace kedro.pipeline.Pipeline with kedro.pipeline.modular_pipeline.pipeline factory. This is for symmetry with changes made to the main kedro library. 
Signed-off-by: Adam Farley <adamfrly@gmail.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Fix outdated links in Kedro Datasets (#111) * fix links * fix dill links Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Fix docs formatting and phrasing for some datasets (#107) * Fix docs formatting and phrasing for some datasets Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Manually fix files not resolved with patch command Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> * Apply fix from #98 Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> --------- Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Release `kedro-datasets` `version 1.0.2` (#112) * bump version and update release notes * fix pylint errors Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Bump pytest to 7.2 (#113) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Prefix Docker plugin name with "Kedro-" in usage message (#57) * Prefix Docker plugin name with "Kedro-" in usage message Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Keep Kedro-Docker plugin docstring from appearing in `kedro -h` (#56) * Keep Kedro-Docker plugin docstring from appearing in `kedro -h` Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * [kedro-datasets ] Add `Polars.CSVDataSet` (#95) Signed-off-by: wmoreiraa <walber3@gmail.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * Remove deprecated `test_requires` from `setup.py` in Kedro-Docker (#54) Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Yassine Alouini <yalouini@idmog.com> * [FIX] Fix ds to data_set. 
Signed-off-by: Yassine Alouini <yalouini@idmog.com> --------- Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Co-authored-by: Mariusz Strzelecki <szczeles@gmail.com> Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Co-authored-by: OKA Naoya <pn11@users.noreply.github.com> Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com> Co-authored-by: adamfrly <45516720+adamfrly@users.noreply.github.com> Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Co-authored-by: Walber Moreira <58264877+wmoreiraa@users.noreply.github.com>
Signed-off-by: wmoreiraa <walber3@gmail.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: Yassine Alouini <yalouini@idmog.com> --------- Signed-off-by: Mariusz Strzelecki <mariusz.strzelecki@getindata.com> Signed-off-by: Mariusz Strzelecki <szczeles@gmail.com> Signed-off-by: Yassine Alouini <yalouini@idmog.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Co-authored-by: Mariusz Strzelecki <szczeles@gmail.com> Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Co-authored-by: OKA Naoya <pn11@users.noreply.github.com> Co-authored-by: Joel <35801847+datajoely@users.noreply.github.com> Co-authored-by: adamfrly <45516720+adamfrly@users.noreply.github.com> Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Co-authored-by: Walber Moreira <58264877+wmoreiraa@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Signed-off-by: wmoreiraa <walber3@gmail.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com>
Most up to date effort: gh-170. |
Signed-off-by: wmoreiraa walber3@gmail.com
Description
Introduce python-polars to Kedro Datasets.
https://www.pola.rs/benchmarks.html
Development notes
TODO on this PR:
All of those using only the eager I/O.
Checklist
RELEASE.md file