After yesterday's release of kedro-datasets==1.5.0, our CI started failing during system tests, which do a kedro run for a pipeline with Spark (see the screenshot). As far as I can see, SparkDataSet is still defined with the same name as before. With kedro-datasets==1.4.2, the same tests ran smoothly. I also couldn't find anything specific in the release notes.
Steps to Reproduce
Run a pipeline where kedro-datasets[spark.SparkDataSet] is used.
Expected Result
Tell us what should happen.
The pipeline should run successfully till the end
Actual Result
Tell us what happens instead.

Your Environment
Include as many relevant details about the environment in which you experienced the bug:
Kedro version used (pip show kedro or kedro -V): 0.18.11
Kedro plugin and kedro plugin version used (pip show kedro-airflow): kedro-datasets==1.5.0
Python version used (python -V): 3.8
Operating system and version: ubuntu-2004:202201-02
ElenaMironovaQB changed the title from "<Title> kedro-datasets release 1.5.0 doesn't reflect SparkDataSet well" to "kedro-datasets release 1.5.0 doesn't reflect SparkDataSet well" on Aug 2, 2023
@sbrugman is pip install kedro-datasets[pandas.CSVDataSet] still possible? I think this is an undesired side effect. I did a quick search, and it seems that the standard pyproject.toml doesn't support pip install kedro-datasets[pandas.CSVDataSet] but only pip install kedro-datasets[pandas].
At this point I don't think we want to bring in a more advanced tool like poetry just for this.
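One way to check which extras an installed release actually declares is to read its package metadata. The snippet below is only a rough diagnostic sketch, not something from the kedro-datasets codebase: the helper names declared_extras and has_extra are made up for illustration, and the exact-match comparison is an assumption (installers may normalise extra names differently).

```python
from importlib.metadata import PackageNotFoundError, metadata


def declared_extras(dist_name):
    """Return the extras an installed distribution declares, or [] if it is not installed."""
    try:
        meta = metadata(dist_name)
    except PackageNotFoundError:
        return []
    # "Provides-Extra" is the core-metadata field listing optional extras.
    return meta.get_all("Provides-Extra") or []


def has_extra(extras, wanted):
    """Exact-match check (assumption: no name normalisation is applied here)."""
    return wanted in extras


if __name__ == "__main__":
    extras = declared_extras("kedro-datasets")
    for name in ("spark.SparkDataSet", "pandas.CSVDataSet", "pandas"):
        print(name, "declared" if has_extra(extras, name) else "not declared")
```

Running this in an environment with 1.4.2 and again with 1.5.0 should show whether the dotted extras are still present in the published metadata.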
Context
How has this bug affected you? What were you trying to accomplish?
Our system tests, which run Kedro on pipelines with Spark, stopped running.
More discussion on Slack: https://kedro-org.slack.com/archives/C03RKP2LW64/p1690896281915309