-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compile samples instead of hard code them in API server #76
Conversation
…into yangpa/samples
@@ -14,6 +14,15 @@ COPY . . | |||
RUN apk add --update gcc musl-dev | |||
RUN go build -o /bin/apiserver backend/src/apiserver/*.go | |||
|
|||
FROM python:3.5.0-slim as compiler | |||
|
|||
# This is hard coded to 0.0.26. Once kfp DSK release process is automated, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already in (#70). Give it a try?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the test need to update to adopt that. it's kind of late for the last minute change. I'll do it post beta.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
"file":"/samples/tfx/taxi-cab-classification-pipeline.py.tar.gz" | ||
}, | ||
{ | ||
"name":"[Sample] ML - CMLE - Resnet Training", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to remove this one --- CMLE is not ready for TPU yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
backend/Dockerfile
Outdated
RUN pip install https://storage.googleapis.com/ml-pipeline/release/0.0.26/kfp-0.0.26.tar.gz --upgrade | ||
WORKDIR /samples | ||
COPY ./samples . | ||
RUN find . -maxdepth 2 -name "*.py" -exec python3 {} \; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is more robust to call dsl-compile on these py files --- Not all py files come with self-compilation code. It is not required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
backend/src/apiserver/main.go
Outdated
@@ -132,16 +132,26 @@ func loadSamples(resourceManager *resource.ResourceManager) { | |||
} | |||
var configs []config | |||
if json.Unmarshal(configBytes, &configs) != nil { | |||
glog.Warningf("Failed to read sample configurations. Err: %v", err.Error()) | |||
return | |||
glog.Fatalf("Failed to read sample configurations. Err: %v", err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we please return errors in this method and have the top level decide whether to abort? (i.e. let's avoid functions with side effects as it makes code hard to reuse).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
backend/src/apiserver/main.go
Outdated
} | ||
resourceManager.CreatePipeline(config.Name, config.Description, sampleBytes) | ||
// Decompress if file is tarball |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to test whether it is a tarball or not?
Why not reusing the logic in the API that already turns a file (yaml or tarball) into a pipeline object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -14,6 +14,15 @@ COPY . . | |||
RUN apk add --update gcc musl-dev | |||
RUN go build -o /bin/apiserver backend/src/apiserver/*.go | |||
|
|||
FROM python:3.5.0-slim as compiler | |||
|
|||
# This is hard coded to 0.0.26. Once kfp DSK release process is automated, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
/lgtm |
backend/src/apiserver/main.go
Outdated
@@ -132,16 +138,26 @@ func loadSamples(resourceManager *resource.ResourceManager) { | |||
} | |||
var configs []config | |||
if json.Unmarshal(configBytes, &configs) != nil { | |||
glog.Warningf("Failed to read sample configurations. Err: %v", err.Error()) | |||
return | |||
return errors.New(fmt.Sprintf("Failed to parse sample configurations.")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not returning the original error message as part of the result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
test/presubmit-tests.sh
Outdated
@@ -72,7 +72,7 @@ if [ "$CLUSTER_TYPE" == "create-gke" ]; then | |||
echo "Delete cluster..." | |||
gcloud container clusters delete ${TEST_CLUSTER} --async | |||
} | |||
trap delete_cluster EXIT | |||
# trap delete_cluster EXIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we removing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is for debugging. revert
loadSamples(resourceManager) | ||
err:= loadSamples(resourceManager) | ||
if err!=nil{ | ||
glog.Fatalf("Failed to load samples. Err: %v", err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the server restarts?
Does this prevent the server from ever restarting unless the storage is deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the samples are baked into the image so it won't be a problem
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: IronPan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
@IronPan: you cannot LGTM your own PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
1 similar comment
@IronPan: you cannot LGTM your own PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm |
@IronPan: you cannot LGTM your own PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
1 similar comment
@IronPan: you cannot LGTM your own PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm |
* Moves docs from the /samples README to the wiki I've copied all the info from the googleprivate/samples README.md to the wiki. This PR removes most of the content from the README, adds a simple introduction, and points people to the wiki docs for further information. Relevant wiki files: https://github.com/kubeflow/pipelines/wiki/Deploy-the-Kubeflow-Pipelines-Service https://github.com/kubeflow/pipelines/wiki/Build-a-Pipeline https://github.com/kubeflow/pipelines/wiki/Build-Your-Own-Component * SDK/Tests/Components - Improve temporary file handling (#37) * Improve temporary file handling in python op tests * More small temp path fixes * Fix tfx name bug in the tfma sample test (#67) * fix tfx name bug * update release build for the data publish * SDK/DSL/Compiler - Fixed compilation of dsl.Condition (#28) * Fixed compilation of dsl.Conditional The compiler no longer produced intermediate steps. * Got rid of _create_new_groups * Changed the sub_group.type check * Fix tfx name bug in the tfma sample test (#67) * fix tfx name bug * update release build for the data publish * Build Python SDK in the releasing (#70) * publish python sdk in the cloud build * update cloudbuild * adjust output path * add dependency for buildapiserver * mlp -> kfp.dsl (#88) * Removed instruction to clone the repo. * Changed link to samples doc on wiki. * Add %%docker magic to jupyter kernel. It helps submitting a docker build job more easily with one cell. (#72) * fix miscellaneous List API issue (#90) * fix * Update job_store.go * Update run_store.go * Update presubmit-tests.sh * Update job_store.go * Moves docs from pipelines main README to wiki (#83) I've copied all the info from the kubeflow/pipelines README.md to the wiki. This PR removes most of the content from the README, adds a simple introduction, and points people to the wiki docs for further information. Relevant wiki files: https://github.com/kubeflow/pipelines/wiki https://github.com/kubeflow/pipelines/wiki/Deploy-the-Kubeflow-Pipelines-Service https://github.com/kubeflow/pipelines/wiki/Build-a-Pipeline * SDK/DSL/Compiler - Reverted fix of dsl.Condition until the UI is ready. (#94) * sort by run display name by default (#96) * CSS changes for nav menu and tables (#99) * debug tfma failure (#91) * debug tfma failure * tft version bug * minor fix * comment the test validation * Fix validation check for maximum size limit (#104) * Fixed the Minikube tests after moving to the new repo (#98) * Don't barf when experiment name is already used (#101) * don't barf when experiment name is already used * make mock backend use case insensitive names * ExperimentList tests, use immer.js (#86) * experiment list tests * use produce in more places * remove extra test * remove extra import * Use the experiment's resource reference in the listJobs request (#105) * use the experiment's resource reference in the listJobs request * fix test import * Add Ning and Alexey to OWNERS for components, samples and sample-test (#102) * Compile samples instead of hard code them in API server (#76) * compile samples * update logging * update description * update sample * add immediate value sample * revert * fail fast if the samples are failed to load * comment * address comments * comment out * update command * comments * Account for padding in metric progress fill (#107) * Account for padding in metric progress fill * small mock backend fix * move to css classes, add color * changes to breadcrumb style * increase width of summary card * tests * merge tests * First integration test for the ML Pipeline CLI (Pipeline List). (#81) * First integration test for the ML Pipeline CLI (Pipeline List). * Fixing an issue with an undefined variable * Adding the --debug flag to help with debugging. * Changing the namespace to Kubeflow. * Add tests for the NewExperiment page (#109) * Add tests for the NewExperiment page * Fix test name * Remove obsolete tests * Clean up * add xgboost: migrate from the old repo (#46) * migrate from the old repo * fix bug: accidentally override tfma test * add tfma test back * add tfma back * typo fix * fix small typo * if job fails, exit after logs are output * Remove CMLE sample for now since we are waiting for a service fix to support TPU. (#113) * image tag update for release (#114) * update image tag for new releases * add more * delete the accidentaly added sample * fix typo (#116) * Fix an issue that %%docker doesn't work. (#119) * updated favicon to monochrome color (#118) * Expanded row changes (#120) * updated favicon to monochrome color * simple CSS changes to expanded row * Removed mentions of ark7 in tests (#111) * Add basic sample tests (#79) * add sequential sample test * add condition basic sample * reuse script * add all the other basic tests * update sample test dockerfile to add run_basic_test file * write test output * typo bug
* Add ssh for minikube testing. * We use ssh to execute code on the VM used for minikube. * Log the file used to configure kubectl.
Include permission for get services
UPSTREAM: <carry>: Add the upstream code rebase document
Please ignore the sample names in the sample configs since they are still TBD.
This change is