How does it proceed with this project? #1
@vicaire is leading this project.
Hi ynqa@, we plan to provide some tools built on top of Argo to schedule and trigger pipelines. Argo has an ambitious project to tackle this (https://github.com/argoproj/argo-events), and hopefully we can switch to what Argo provides once it has taken shape.
@vicaire Thanks for your reply. Does that mean this repo will include tools that wrap Argo's trigger/scheduler and are specialized for Kubeflow, rather than Argo itself?
Yes. The goal is to provide pipeline functionality that either doesn't make sense to add to Argo, or that won't be available in Argo for a while. We are still figuring out the details and will discuss further with the Argo team.
@vicaire I got it, thanks. If you decide what to build in this project, or if there is anything I can help with, please let me know here. I'm also going to follow the Argo project's progress.
That sounds great. Thanks!
Hi, I am also interested in the project. Could we add the scope of the project to the README? Personally, I am not sure I understand it. Do you want to implement a pipeline based on Argo to support ML workloads?
Hi gaocegege, we are still figuring out the details and scope. Stay tuned. Thanks!
From what I've explored so far, the project now supports scheduled workflows on top of Argo, and the main use case I've learned about is training models on a regular basis, as mentioned here. While @vicaire is working on the details, I'm curious about the regular-training use case: what's the concrete scenario we are targeting? I don't quite understand this :) For us, we have a use case to train a model, save the model somewhere (version controlled), then serve the model; preferably, the model can be validated and evaluated automatically, providing feedback for another round of training (not necessarily auto-started by a scheduled workflow). Is this a use case within our scope?
Model validation and evaluation are within the scope of Kubeflow and TFX, but I think the initial work in this repo is more narrowly scoped, i.e. the ScheduledWorkflow. @ddysher I think we as a community should figure out how to address use cases like the ones you mentioned. There are lots of issues about continuously deploying models, e.g. using GitOps (kubeflow/kubeflow#1118), and there are lots of pieces, e.g. Argo, Weave, Seldon, etc.
Could you please explain what ScheduledWorkflow is? I cannot find the proposal or anything else about it in the community, thanks. 😄
@gaocegege ScheduledWorkflow is the CRD in this repo. Definition
@ddysher You are correct that one of the main goals of this project is to run an Argo workflow on a regular basis with the help of the ScheduledWorkflow CRD.
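To make the discussion above more concrete, here is a minimal sketch of what a ScheduledWorkflow resource could look like. The field names, cron expression, and container image below are illustrative assumptions based on the thread, not copied from the repo's CRD definition; consult the actual definition linked above for the authoritative schema.

```yaml
# Hypothetical sketch: run an Argo workflow nightly via the ScheduledWorkflow CRD.
# Field names and the trainer image are assumptions, not the verified schema.
apiVersion: kubeflow.org/v1beta1
kind: ScheduledWorkflow
metadata:
  name: nightly-training
spec:
  enabled: true
  trigger:
    cronSchedule:
      cron: "0 0 2 * * *"   # every day at 02:00
  workflow:
    spec:                    # an ordinary Argo WorkflowSpec
      entrypoint: train
      templates:
      - name: train
        container:
          image: my-registry/trainer:latest   # placeholder image
          command: ["python", "train.py"]
```

The idea, as described in this thread, is that the controller watches these resources and submits the embedded Argo workflow on the cron schedule.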
@jlewi @IronPan thanks for the kind replies. I understand Argo workflows and how we leverage them for Kubeflow. My question above is why we want a ScheduledWorkflow for Kubeflow: what is the concrete use case? From what I understand, regular training won't be useful unless we have feedback from somewhere.
@ddysher Thanks for clarifying the question. Argo was initially created for Kubernetes-native CI/CD, and the pipeline project is a natural extension of that idea to ML model development. One of the main use cases for the ScheduledWorkflow CRD is continuous learning. It's not rare, especially in production environments, to see models continuously adapting to new incoming data and evolving on a regular basis (daily/weekly/monthly). Initiating the process manually would be repetitive and tedious work that naturally makes sense to automate. Ideally, the system would fully automate a workflow that does data ingestion, training, model analysis, and deployment.
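The ingest/train/analyze/deploy loop described above can be sketched in a few lines. Everything here is a hypothetical illustration with stubbed stages, not the Kubeflow Pipelines or Argo API; it only shows the shape of one scheduled run, including the "promote only if better" feedback step mentioned in the thread.

```python
# Minimal sketch of one scheduled continuous-learning run.
# All stage names and logic are illustrative stubs, not a real API.

def ingest():
    # Pull the latest batch of training data (stubbed).
    return [1, 2, 3, 4]

def train(data):
    # "Train" a trivial model: here, just the mean of the data.
    return sum(data) / len(data)

def analyze(candidate, baseline):
    # Promote the candidate only if it beats the currently served model.
    return candidate if baseline is None or candidate >= baseline else baseline

def deploy(model):
    # Deployment is stubbed as simply returning the served model.
    return model

def run_pipeline(current=None):
    # One scheduled run: ingest -> train -> analyze -> deploy.
    data = ingest()
    candidate = train(data)
    chosen = analyze(candidate, current)
    return deploy(chosen)

print(run_pipeline())  # 2.5
```

In the real system, a scheduler (such as the ScheduledWorkflow controller discussed above) would invoke a run like this on a daily/weekly/monthly cadence instead of a manual call.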
Hi @ddysher, We do not have an estimated time yet for a proposal but will update the thread as soon as we do. Thanks! |
This is an interesting project to me as well.
Thanks fisache@. We haven't yet established the exact scope of this project. For now, it only aims to provide a set of useful tools for running workflows. We will provide more information as we make progress. Thanks. |
I'd like to contribute to and help with this project. Are there any milestones or actual code for the pipelines?
It looks like there are two procedures:
Please reply if you don't mind.