Blueprint autodeployments are failing #668

Closed
jlewi opened this issue May 15, 2020 · 8 comments

@jlewi
Contributor

jlewi commented May 15, 2020

Autodeployments of blueprints are failing.

The failing runs are visible in the Tekton dashboard:
https://kf-ci-v1.endpoints.kubeflow-ci.cloud.goog/tekton/#/namespaces/auto-deploy/pipelineruns?labelSelector=tekton.dev%2Fpipeline%3Ddeploy-gcp-blueprint

The error is:

INFO|2020-05-15T17:30:12|/workspace/testing-repo/py/kubeflow/testing/util.py|72| kpt pkg get https://github.com/kubeflow/manifests.git@master ./upstream/manifests
INFO|2020-05-15T17:30:13|/workspace/testing-repo/py/kubeflow/testing/util.py|72| fetching package / from https://github.com/kubeflow/manifests to upstream/manifests
INFO|2020-05-15T17:30:20|/workspace/testing-repo/py/kubeflow/testing/util.py|72| Error: upstream/manifests/aws/aws-istio-authz-adaptor/overlays/application/application.yaml: yaml: line 25: did not find expected '-' indicator
INFO|2020-05-15T17:30:20|/workspace/testing-repo/py/kubeflow/testing/util.py|72| Makefile:36: recipe for target 'get-pkg' failed
INFO|2020-05-15T17:30:20|/workspace/testing-repo/py/kubeflow/testing/util.py|72| make: *** [get-pkg] Error 1
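
For triage, the failing file and line can be pulled out of a log like the one above. This is a minimal sketch that assumes the pipe-delimited layout seen in these lines (level, timestamp, source file, source line, message). The names `LogRecord`, `parse_record`, and `find_yaml_errors` are hypothetical and not part of kubeflow/testing.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class LogRecord(NamedTuple):
    level: str
    timestamp: str
    source: str
    lineno: int
    message: str


def parse_record(line: str) -> Optional[LogRecord]:
    """Split one 'LEVEL|timestamp|file|line| message' entry; None if it does not fit."""
    parts = line.split("|", 4)  # maxsplit keeps any '|' inside the message intact
    if len(parts) != 5 or not parts[3].strip().isdigit():
        return None
    level, timestamp, source, lineno, message = parts
    return LogRecord(level, timestamp, source, int(lineno), message.strip())


# Matches messages of the form "Error: <path>: yaml: line <n>: <reason>".
YAML_ERROR = re.compile(
    r"Error: (?P<path>\S+): yaml: line (?P<line>\d+): (?P<reason>.+)"
)


def find_yaml_errors(log_text: str) -> List[Tuple[str, int, str]]:
    """Return (path, yaml_line, reason) for every YAML parse error in the log."""
    errors = []
    for raw in log_text.splitlines():
        record = parse_record(raw)
        if record is None:
            continue
        match = YAML_ERROR.search(record.message)
        if match:
            errors.append((match["path"], int(match["line"]), match["reason"]))
    return errors
```

Feeding the log above to `find_yaml_errors` should yield a single entry that points at the `application.yaml` overlay and YAML line 25.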
@issue-label-bot

Issue-Label Bot is automatically applying the labels:

Label Probability
kind/bug 0.95
area/engprod 0.57


@jlewi
Contributor Author

jlewi commented May 16, 2020

The problem appears to be the application resource:
https://github.com/kubeflow/manifests/blob/d82342c88cb943635d6842db3153f9909181a067/aws/nvidia-device-plugin/overlays/application/application.yaml#L23

The YAML is invalid: the indentation of the owners field is wrong.

The validation I'm writing to fix kubeflow/manifests#1174 appears to be what is catching this.

@Jeffwan I will fix the YAML as part of kubeflow/manifests#1174, so no need to worry about this unless you need a fix sooner. I should have a PR this weekend or Monday.

jlewi pushed a commit to jlewi/manifests that referenced this issue May 16, 2020
* kubeflow#1174 lots of CRDs have status fields which
  causes ACM to complain that the CRD is invalid.

* This PR cleans up those CRDs and adds an appropriate validation test.

* Fix typos in AWS application specs (kubeflow/testing#668)
@Jeffwan
Member

Jeffwan commented May 17, 2020

Thanks @jlewi for the fix. I think I carelessly introduced some indentation issues in kubeflow/manifests#1162.

k8s-ci-robot pushed a commit to kubeflow/manifests that referenced this issue May 18, 2020
* CRDs should not have status field.

* #1174 lots of CRDs have status fields which
  causes ACM to complain that the CRD is invalid.

* This PR cleans up those CRDs and adds an appropriate validation test.

* Fix typos in AWS application specs (kubeflow/testing#668)

* Update tests.
@jlewi
Contributor Author

jlewi commented May 20, 2020

@Jeffwan not a problem

@jlewi
Contributor Author

jlewi commented May 20, 2020

Auto-deployed blueprints are still failing, now with an RBAC issue.

Error from server (Forbidden): error when creating ".build/gcp_config/iam.cnrm.cloud.google.com_v1beta1_iamserviceaccount_kf-vbp-0520-61c-vm.yaml": iamserviceaccounts.iam.cnrm.cloud.google.com is forbidden: User "kf-ci-v1-user@kubeflow-ci.iam.gserviceaccount.com" cannot create resource "iamserviceaccounts" in API group "iam.cnrm.cloud.google.com" in the namespace "kubeflow-ci-deployment": requires one of ["container.thirdPartyObjects.create"] permission(s)
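
The fields that matter for an RBAC fix (user, verb, resource, API group, namespace) can be extracted from a Forbidden message like the one above. This is a rough sketch that assumes the message follows the wording shown here. `parse_forbidden` is a hypothetical helper, not an existing Kubernetes or kubeflow/testing API.

```python
import re
from typing import Dict, Optional

# Pattern for the 'User "..." cannot <verb> resource "..." in API group "..."
# in the namespace "..."' portion of a Kubernetes Forbidden error.
FORBIDDEN = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" in the namespace "(?P<namespace>[^"]+)"'
)


def parse_forbidden(message: str) -> Optional[Dict[str, str]]:
    """Return the denied user/verb/resource/group/namespace, or None if no match."""
    match = FORBIDDEN.search(message)
    return match.groupdict() if match else None
```

Applied to the error above, this should report that `kf-ci-v1-user@kubeflow-ci.iam.gserviceaccount.com` was denied `create` on `iamserviceaccounts` in the `kubeflow-ci-deployment` namespace.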

@issue-label-bot

Issue-Label Bot is automatically applying the labels:

Label Probability
platform/gcp 0.68


@jlewi
Contributor Author

jlewi commented May 20, 2020

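The kubeflow-ci-deployment namespace is missing. The fix is to create it and bind the CI user to the cnrm-admin cluster role in that namespace: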

kubectl --context=kf-ci-deployment-management create namespace kubeflow-ci-deployment
kubectl --context=kf-ci-deployment-management -n kubeflow-ci-deployment create rolebinding kf-ci-v1-cnrm-admin --user=kf-ci-v1-user@kubeflow-ci.iam.gserviceaccount.com --clusterrole=cnrm-admin
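
One way to confirm that the role binding took effect is `kubectl auth can-i` with user impersonation. The sketch below only builds and runs that command. It assumes the caller is allowed to impersonate the CI user via `--as`, which may not hold on every cluster. `can_i_command` and `is_allowed` are hypothetical helper names.

```python
import subprocess
from typing import List


def can_i_command(context: str, namespace: str, user: str,
                  verb: str, resource: str) -> List[str]:
    """Build a 'kubectl auth can-i' invocation that impersonates `user`."""
    return [
        "kubectl", f"--context={context}", "-n", namespace,
        "auth", "can-i", verb, resource, f"--as={user}",
    ]


def is_allowed(context: str, namespace: str, user: str,
               verb: str, resource: str) -> bool:
    """Run the check; kubectl prints 'yes' or 'no' on stdout."""
    cmd = can_i_command(context, namespace, user, verb, resource)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout.strip() == "yes"


if __name__ == "__main__":
    # Values taken from the error message and the commands in this thread.
    print(is_allowed(
        context="kf-ci-deployment-management",
        namespace="kubeflow-ci-deployment",
        user="kf-ci-v1-user@kubeflow-ci.iam.gserviceaccount.com",
        verb="create",
        resource="iamserviceaccounts.iam.cnrm.cloud.google.com",
    ))
```

The resource is written in `resource.group` form so that kubectl resolves it against the `iam.cnrm.cloud.google.com` API group named in the error.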


jlewi closed this as completed May 20, 2020