Add dataproc component yaml files #956
/retest
Hi @hongye-sun, I am trying to understand how these component-description YAML files are used in the overall pipelines system.
@animeshsingh We want to use these YAML files to share components across pipelines. Basically, a pipeline author should be able to load a component from its YAML file. The descriptions in the YAML serve as documentation for the loaded component. Here is an example of how to use it in a notebook: https://github.com/kubeflow/pipelines/tree/master/components/gcp/bigquery/query. It's still at an early stage, and the YAML format is going to change in the future. E.g., it will be extended to support DAGs and other types of resources.
"It's still at an early stage, and the YAML format is going to change in the future. E.g., it will be extended to support DAGs and other types of resources." If we support DAGs here, wouldn't it start going into the same territory as Argo YAML?
True. We are likely to replace the implementation section in the YAML with an Argo spec, and we will keep the inputs and outputs metadata for documentation and type information. Ideally, the load-component API should be able to load any compiled pipeline YAML as a DAG component.
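To make the idea above concrete, here is a minimal sketch of how a loader could turn a component description into a documented, callable factory. It is not the Kubeflow Pipelines implementation: the `load_component` helper, the field names in `COMPONENT_SPEC`, the component name, and the image name are all hypothetical, and the spec is written as a plain Python dict standing in for what a parsed YAML file might contain.

```python
from typing import Any, Callable, Dict

# Hypothetical component description, shaped like what a component YAML
# file might parse into. All names and fields here are illustrative.
COMPONENT_SPEC: Dict[str, Any] = {
    "name": "example_submit_job",
    "description": "Submits an example job.",
    "inputs": [
        {"name": "project_id", "type": "String", "description": "Project ID."},
        {"name": "region", "type": "String", "description": "Target region."},
    ],
    "outputs": [
        {"name": "job_id", "type": "String", "description": "ID of the created job."},
    ],
    # The implementation section is opaque to the loader in this sketch.
    "implementation": {"container": {"image": "example/image:latest"}},
}


def load_component(spec: Dict[str, Any]) -> Callable[..., Dict[str, Any]]:
    """Build a factory function from a component description (hypothetical helper)."""
    input_names = [item["name"] for item in spec["inputs"]]
    output_names = [item["name"] for item in spec["outputs"]]

    def factory(**kwargs: Any) -> Dict[str, Any]:
        # Check the call against the declared inputs metadata.
        missing = [name for name in input_names if name not in kwargs]
        unknown = [name for name in kwargs if name not in input_names]
        if missing or unknown:
            raise TypeError(f"missing inputs: {missing}, unknown inputs: {unknown}")
        # Return a plain record describing the task; nothing is executed.
        return {
            "component": spec["name"],
            "arguments": dict(kwargs),
            "outputs": output_names,
        }

    # The descriptions in the spec become the docstring of the loaded component.
    doc_lines = [spec["description"], "", "Inputs:"]
    doc_lines += [
        f"  {item['name']} ({item['type']}): {item['description']}"
        for item in spec["inputs"]
    ]
    factory.__name__ = spec["name"]
    factory.__doc__ = "\n".join(doc_lines)
    return factory


example_op = load_component(COMPONENT_SPEC)
task = example_op(project_id="my-project", region="us-central1")
```

Because the factory depends only on the inputs and outputs metadata, the implementation section could in principle be swapped for another format without changing how pipeline authors call the component. Calling `help(example_op)` shows the descriptions taken from the spec.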
/lgtm
The component.yaml files are needed for efficient component sharing. Currently, many pipeline authors just copy and paste code between pipeline files, which is an error-prone anti-pattern. It's much easier to just write
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: hongye-sun
The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Thanks.
* Add dataproc component yaml files
* Update license to 2019
* Remove unused parameter