Reduce the data used in the spaceflights tutorial and starters
#3109
Comments
Now that we're reviewing the spaceflights data, I transferred #3110 to this repo
@merelcht You mean literally making the dataset smaller?
Yes 🙂 It's just an example project to demonstrate how Kedro works, so we don't care much about model accuracy. This ticket is to try and reduce the size of all input datasets:
This should also be addressed at I saw the
To check if my understanding is correct, should changes be made to these And the problem is related only to
Yes this should indeed be done in
I'd suggest doing the change first for just
Ideally all the datasets would be reduced in size if that's possible 🙂
Hi @laizaparizotto Is this a ticket you are actively working on? Shall I assign it to you and mark as "in progress"?
Hi @stichbury, I will be able to work on it this weekend. If not a problem, yes, you can assign that to me :D.
I left a comment in the PR asking if I should also open a PR in this repo?
Description
The spaceflights starter is used in demos and testing, and the pipeline takes a considerable amount of time to run. For example, in the Kedro bootcamp, demoing catalog.load("shuttles") takes about 15-20 seconds, which is awkward for demo purposes.
Context
See more details on the spaceflights project + tutorial in the docs: https://docs.kedro.org/en/stable/tutorial/spaceflights_tutorial.html
We're now creating all new starters based on spaceflights so all of these examples could benefit from a smaller dataset.
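As a rough illustration of what "a smaller dataset" could mean in practice, here is a minimal sketch that keeps the header and a reproducible random sample of rows from a CSV file. The function name downsample_csv, the row count, and the file paths are hypothetical and not taken from the Kedro codebase, and it assumes the input is a plain CSV; other formats would need a different reader.

```python
import csv
import random
from pathlib import Path


def downsample_csv(src: Path, dst: Path, n_rows: int, seed: int = 0) -> int:
    """Write the header plus at most n_rows randomly chosen rows of src to dst.

    Hypothetical helper for illustration only. Returns the number of data
    rows written (excluding the header).
    """
    with src.open(newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    # A fixed seed keeps the sample reproducible between runs.
    rng = random.Random(seed)
    # Sample indices and sort them so kept rows stay in their original order.
    keep = sorted(rng.sample(range(len(rows)), k=min(n_rows, len(rows))))

    with dst.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows[i] for i in keep)
    return len(keep)
```

A call such as downsample_csv(Path("big.csv"), Path("small.csv"), n_rows=1000) would write the reduced file alongside the original. If several datasets are joined on shared keys, sampling each file independently may leave few matching rows, so the sampling would probably need to be coordinated across files.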
Related to: #2008