Mechanism for programmatically updating datasets #188
@barkerje thank you for your message. You could have a look at using a
Also, could I suggest posting this kind of "how to" question on Stack Overflow (tagged with kedro)? Other users may also benefit from the answers and, arguably, SO is more suitable for Q&A.
Hi @barkerje, if you're happy with the reply above, please consider closing this issue. :)
Hi @barkerje! We hope that you got sufficient help from @DmitriiDeriabinQB on this issue. I'm going to close this issue. Let us know if you have any more thoughts by either commenting on this or creating a new issue.
Related to the proposal in #341
Description
I would like to have an iterative pipeline where a dataset, `train_data`, gets read in at the beginning of the pipeline and a new, updated version of the same `train_data` dataset is written at the end of the pipeline. I was trying to do this using a versioned dataset so I can track my progress after each iteration. However, when I try to implement this I get:

Context
I am implementing an active learning pipeline with the following (simplified) workflow:

1. Read in `train_data` and use it to train a model.
2. Apply the model to `infer_data` and get an annotator to provide new `infer_labels`.
3. Update `train_data` to include `infer_data` and `infer_labels`.

At present, I cannot figure out any way to do this in the kedro framework due to the restrictions placed on the pipelines. How would you suggest implementing something like this pipeline?