-
Notifications
You must be signed in to change notification settings - Fork 904
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate code snippets in datasets documentation - Part 1 #1962
Conversation
@@ -28,17 +28,13 @@ class JSONDataSet(AbstractVersionedDataSet[Any, Any]): | |||
>>> json_dataset: | |||
>>> type: json.JSONDataSet | |||
>>> filepath: data/01_raw/location.json | |||
>>> load_args: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSON Dataset does not support the load_args
parameter: https://kedro.readthedocs.io/en/stable/kedro.extras.datasets.json.JSONDataSet.html#kedro.extras.datasets.json.JSONDataSet.__init__
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch!
* Update release version and release notes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Update missing release notes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * update vresion Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * update release notes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Fix code snippets Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Separate code blocks Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> * Lint Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: jstammers <jimmy.stammers@cgastrategy.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Update RELEASE.md * fix broken link * Update RELEASE.md Co-authored-by: Merel Theisen <49397448+MerelTheisenQB@users.noreply.github.com> Co-authored-by: Merel Theisen <49397448+MerelTheisenQB@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Deprecating `kedro test` and `kedro lint` Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Deprecate commands Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Make kedro looks prettier * Update Linting Signed-off-by: Nok <nok_lam_chan@mckinsey.com> Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Nok <nok_lam_chan@mckinsey.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Florian Gaudin-Delrieu <florian.gaudindelrieu@gmail.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
…related issues better (#1881) * Update message for VersionNotFoundError Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> * Add test for VersionNotFoundError for cloud protocols * Update test_data_catalog.py Update NoVersionFoundError test * minor linting update * update docs link + styling changes * Revert "update docs link + styling changes" This reverts commit 6088e00. * Update test with styling changes * Update RELEASE.md Signed-off-by: ankatiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Signed-off-by: ankatiyar <ankitakatiyar2401@gmail.com> Co-authored-by: Ahdra Merali <90615669+AhdraMeraliQB@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Add best_practices.md with introductory sections Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add pytest and pytest-cov sections Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add pytest-cov coverage report Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add sections on pytest-cov Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add automated_testing to index.rst Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Reformat third-party library names and clean grammar. Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add link to virtual environment docs Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add example of good test naming Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Improve link accessibility Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Improve pytest docs link accessibility Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add reminder link to virtual environment docs Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Fix formatting in link to coverage docs Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Remove reference to /src under 'Run your tests' Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Modify references to <project_name> to <package_name> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Fix sentence structure Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Fix broken databricks doc link Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Capitalised kedro-viz Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in> * capitalised Kedro viz Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in> * Updated set_up_experiment_tracking.md Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in> Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in> Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Carla Vieira <carlaprv@hotmail.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Change a file extention to match the previous article Signed-off-by: dinotuku <kuan.tung@epfl.ch> * Add a missing import Signed-off-by: dinotuku <kuan.tung@epfl.ch> * Change both preprocessed datasets to parquet files Signed-off-by: dinotuku <kuan.tung@epfl.ch> * Change data type to ParquetDataSet for parquet files Signed-off-by: dinotuku <kuan.tung@epfl.ch> * Add a note for installing seaborn if it is not installed Signed-off-by: dinotuku <kuan.tung@epfl.ch> Signed-off-by: dinotuku <kuan.tung@epfl.ch> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Add documentation for linting tools Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Revert changes to commands_reference.md Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update linting docs with suggestions Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Update linting doc Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Create dependabot.yml configuration file * Update dependabot.yml Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * add target-branch Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Update dependabot.yml Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * limit dependabot to just dependency folder Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Update test_requirements.txt Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Update MANIFEST.in Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * fix e2e Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Update continue_config.yml Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Update requirements.txt Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Update requirements.txt Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * fix link Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * revert Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * Delete requirements.txt Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Update dependabot.yml Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * pin jupyterlab_services to requirments Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> * lint Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Updates the requirements on [pip-tools](https://github.com/jazzband/pip-tools) to permit the latest version. - [Release notes](https://github.com/jazzband/pip-tools/releases) - [Changelog](https://github.com/jazzband/pip-tools/blob/master/CHANGELOG.md) - [Commits](jazzband/pip-tools@6.5.0...6.9.0) --- updated-dependencies: - dependency-name: pip-tools dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Updates the requirements on [toposort]() to permit the latest version. --- updated-dependencies: - dependency-name: toposort dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
…1953) Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* remove a redundant function call Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Remove redundant resolove_load_version & fix test Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Fix HoloviewWriter tests with more specific error message pattern & Lint Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Rename tests Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
37dc351
to
011b5bb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work on going through all these examples and testing them! 👍 ⭐
@@ -28,17 +28,13 @@ class JSONDataSet(AbstractVersionedDataSet[Any, Any]): | |||
>>> json_dataset: | |||
>>> type: json.JSONDataSet | |||
>>> filepath: data/01_raw/location.json | |||
>>> load_args: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch!
setup.py
Outdated
@@ -78,6 +78,7 @@ def _collect_requirements(requires): | |||
"pandas.XMLDataSet": [PANDAS, "lxml~=4.6"], | |||
"pandas.GenericDataSet": [PANDAS], | |||
} | |||
pickle_require = {"pickle.PickleDataSet": ["compress-pickle~=2.1.0"]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original comment said: "# Add "compress_pickle[lz4]" to requirements.txt" Do we need to specify [lz4] for this requirement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - this is what the pip list
showed after manually installing compress_pickle[lz4]
, I'll make the change to follow the dask example further up
@MerelTheisenQB
I took a look at the ref but couldn't see the preference; is it the name that they prefer? As the implementation is carbon copies 😬 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
The difference between Perhaps the question is whether our users make use of both the datasets or not. It's not an extreme maintenance burden so I'd just keep them both. |
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Description
This follows on from #2287. In this PR I have tested all the datasets that can be instantiated locally. The following datasets remain to be tested in part 2:
The main changes in this PR are as follows:
extras_require
insetup.py
Also warranting a flag 🚩: GraphMLDataSet and GMLDataSet appear to be identical in purpose and in implementation
Development notes
Both the YAML and Python examples have been tested under the following setup:
pip install kedro
pip install kedro[all]
For Python:
kedro jupyter lab
Code snippets were copied and pasted over
An additional cell was added to check
<dataset name>.exists()
returned TrueFor YAML:
kedro new -s spaceflights
The dataset was added to the data catalog in
conf/base/catalog.yml
and added to the pipeline arbitrarily.Checklist
RELEASE.md
file