Make ambassador a dependency #445
Can you check whether we need to change or remove the RBAC?
seldon-core/helm-charts/seldon-core/templates/rbac.yaml
Lines 103 to 169 in 4d26ff0
{{- if .Values.ambassador.enabled }}
{{- if .Values.single_namespace }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ambassador
rules:
- apiGroups: [""]
  resources:
  - services
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["create", "update", "patch", "get", "list", "watch"]
- apiGroups: [""]
  resources:
  - secrets
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ambassador
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: ambassador
subjects:
- kind: ServiceAccount
  name: {{ .Values.rbac.service_account.name }}
  namespace: {{ .Release.Namespace }}
{{- end }}
{{- if not .Values.single_namespace }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ambassador
rules:
- apiGroups: [""]
  resources:
  - services
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["create", "update", "patch", "get", "list", "watch"]
- apiGroups: [""]
  resources:
  - secrets
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ambassador
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ambassador
subjects:
- kind: ServiceAccount
  name: {{ .Values.rbac.service_account.name }}
  namespace: {{ .Release.Namespace }}
{{- end }}
Good catch, this would indeed be redundant, since https://github.com/helm/charts/blob/master/stable/ambassador/templates/rbac.yaml would be generated with rbac enabled. Mind you, it is cluster-wide and not configurable to be namespace-scoped (i.e. ClusterRole, not Role).
Same thing for Datawire's chart, FWIW: https://github.com/datawire/ambassador/blob/master/helm/ambassador/templates/rbac.yaml
Ahh, that is an issue. We need to allow DevOps to deploy with just Roles if namespaced and ClusterRoles if cluster-wide. This allows fine-grained permissions for users. Maybe this was the main blocker to using the Helm/Datawire chart?
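To make the requirement concrete, here is a minimal Python sketch of the selection logic that the quoted rbac.yaml template encodes: Role and RoleBinding when namespaced, ClusterRole and ClusterRoleBinding otherwise. The function name `ambassador_rbac` and its parameters are hypothetical and exist only for illustration; the rules are copied from the template above, and none of this is the chart's actual implementation.

```python
from typing import Any, Dict, List

API_VERSION = "rbac.authorization.k8s.io/v1"

# Rules copied from the rbac.yaml template quoted in this thread.
AMBASSADOR_RULES: List[Dict[str, Any]] = [
    {"apiGroups": [""], "resources": ["services"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": [""], "resources": ["configmaps"],
     "verbs": ["create", "update", "patch", "get", "list", "watch"]},
    {"apiGroups": [""], "resources": ["secrets"],
     "verbs": ["get", "list", "watch"]},
]


def ambassador_rbac(single_namespace: bool, service_account: str,
                    namespace: str) -> List[Dict[str, Any]]:
    """Hypothetical helper: build the two RBAC objects for the chosen scope."""
    # Namespaced installs get Role/RoleBinding; cluster-wide installs get
    # ClusterRole/ClusterRoleBinding. Everything else is identical.
    role_kind = "Role" if single_namespace else "ClusterRole"
    role = {
        "apiVersion": API_VERSION,
        "kind": role_kind,
        "metadata": {"name": "ambassador"},
        "rules": AMBASSADOR_RULES,
    }
    binding = {
        "apiVersion": API_VERSION,
        "kind": role_kind + "Binding",
        "metadata": {"name": "ambassador"},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": role_kind,
            "name": "ambassador",
        },
        "subjects": [{
            "kind": "ServiceAccount",
            "name": service_account,
            "namespace": namespace,
        }],
    }
    return [role, binding]
```

The point of the sketch is that only the `kind` fields differ between the two modes, which is why a single toggle in an upstream chart would be enough to cover both deployment styles.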
I suppose it is! We can take the Grafana chart as an example where they do this:
Yes - we should get a PR for the upstream change. Are you willing to do that and link to this PR?
Sure!
helm/charts#11354 created.
I've made the required changes pre-emptively, assuming the upstream PR will go through (removed ambassador rbac related objects, added
Alright, we are good to go with regards to
Something worth noting is that we still have the
bump
One knock-on effect of this is that the labels on ambassador become different. This is used in notebooks which match using "service=ambassador". Using the ambassador chart we get a standard
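As an illustration of why a label change breaks the notebooks, here is a small Python sketch of equality-based label matching. The helper names and the pod data are hypothetical; in particular, the `app=ambassador` label on the second pod is a made-up placeholder and not necessarily what the chart sets.

```python
from typing import Dict, List


def parse_selector(selector: str) -> Dict[str, str]:
    """Parse an equality-based selector such as 'service=ambassador,tier=x'.

    Assumes every comma-separated item has the form key=value.
    """
    pairs = (item.split("=", 1) for item in selector.split(",") if item)
    return {key.strip(): value.strip() for key, value in pairs}


def matches(labels: Dict[str, str], selector: str) -> bool:
    """True when every key=value pair in the selector is present in labels."""
    wanted = parse_selector(selector)
    return all(labels.get(key) == value for key, value in wanted.items())


def select_pods(pods: Dict[str, Dict[str, str]], selector: str) -> List[str]:
    """Return the names of pods whose labels satisfy the selector."""
    return [name for name, labels in pods.items() if matches(labels, selector)]


# Hypothetical pods: one labelled the way the notebooks expect, one
# labelled differently (placeholder label, for illustration only).
pods = {
    "ambassador-old": {"service": "ambassador"},
    "ambassador-new": {"app": "ambassador"},
}
print(select_pods(pods, "service=ambassador"))
```

With data like this, a selector of `service=ambassador` only finds the first pod, so any notebook or port-forward helper that relies on it stops finding ambassador once the label set changes.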
I am also noticing that when I run through the helm_examples notebook with this change, the gRPC call for the A/B test fails with
The REST call works. Both calls work when using the master branch.
Turns out the "StatusCode.UNIMPLEMENTED" is a known issue not related to this PR and happens intermittently on gRPC requests, especially the first one. After waiting a while and trying again it did work for the PR branch. I will raise that issue separately.
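The "wait a while and try again" workaround can be sketched as a generic retry helper in Python. This is an illustrative sketch only: `call_with_retries` and its parameters are hypothetical and are not the code used in the notebooks or tests. With a real gRPC client one would presumably pass `retry_on=(grpc.RpcError,)` rather than the broad default.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Invoke call(), retrying on the given exceptions with a growing delay."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Linear backoff: wait longer after each failed attempt.
            sleep(delay_seconds * attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the helper can be exercised without actually waiting.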
I'm working on getting the E2E tests to pass for these changes in a seldon-core-ambassador-pr445 branch based on this. I have currently hit a problem with the multi-armed bandit test, where the seldon-container-engine container in the 'mymab-abtest' pod never becomes ready:
I am not seeing this problem on master, and I have merged master into the seldon-core-ambassador-pr445 branch. I am not sure how it can be related to ambassador, though, as the eg-router that it is looking for really isn't there. The eg-router is in the graph when I do
But no service is created for the eg-router. I took the json from the cluster-manager log output:
Then I ran it through a json compare tool against epsilon_greedy.json and couldn't see any important differences. The next thing I'll try is to see whether a notebook that uses it works for me from the seldon-core-ambassador-pr445 branch.
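For readers who want to repeat the comparison without an external tool, a structural JSON diff can be sketched in a few lines of Python. The function `json_diff` is hypothetical, and the two documents below are made-up placeholders, not the actual cluster-manager output or the contents of epsilon_greedy.json.

```python
import json
from typing import Any, List


def json_diff(left: Any, right: Any, path: str = "$") -> List[str]:
    """Return human-readable differences between two parsed JSON values."""
    if isinstance(left, dict) and isinstance(right, dict):
        diffs: List[str] = []
        for key in sorted(set(left) | set(right)):
            child = f"{path}.{key}"
            if key not in left:
                diffs.append(f"{child}: only in right")
            elif key not in right:
                diffs.append(f"{child}: only in left")
            else:
                diffs.extend(json_diff(left[key], right[key], child))
        return diffs
    if isinstance(left, list) and isinstance(right, list):
        diffs = []
        if len(left) != len(right):
            diffs.append(f"{path}: list length {len(left)} != {len(right)}")
        # Compare the overlapping prefix element by element.
        for index, (l_item, r_item) in enumerate(zip(left, right)):
            diffs.extend(json_diff(l_item, r_item, f"{path}[{index}]"))
        return diffs
    if left != right:
        return [f"{path}: {left!r} != {right!r}"]
    return []


# Made-up documents for illustration; key order does not matter.
logged = json.loads('{"name": "x", "graph": {"children": [{"name": "a"}]}}')
expected = json.loads('{"graph": {"children": [{"name": "b"}]}, "name": "x"}')
for line in json_diff(logged, expected):
    print(line)
```

Because the comparison works on parsed values, differences in key ordering or whitespace are ignored and only real structural or value differences are reported.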
When stepping through the notebook the
I am now running the E2E tests again and they have got past this. I guess it was a non-deterministic failure. Perhaps it's working now because I've restarted minikube. So now I do have the E2E tests running, but there are some failures related to connections.
So the above gist shows 13 failures, and that is now down to 7. A lot of gRPC call failures were resolved, in part by adding resources to minikube and specifying resources on ambassador (this dealt with the pod restarts I was seeing), and also by adding retries to deal with #473. But now it seems there are REST-related failures, some even in the ksonnet tests and not helm. This is strange, and I don't see those when running particular tests in isolation. Also need to map the current use of
The main problem with those failures was that the labels became different for a helm vs ksonnet deployment, and the code to port-forward is common and matches on labels. So I have changed the ambassador labels for ksonnet too.
Now the only remaining problems seem to be related to enabling cluster-wide access for ambassador, as with the latest changes those particular tests are failing consistently. The problem seems to be that I'm setting the single namespace env var to false and it actually doesn't support that: it works by being set or not set and ignores the content of the value. Unfortunately I don't see a way to default this env var to enable single namespace in the seldon-core chart and turn it off in particular situations. I tried setting it to "" but that still counts as on. I also tried setting the
The way it was done in the datawire version of the chart would've enabled us to default to off and then turn on. But that isn't the official chart anymore. We could change our defaults to cluster-wide, or we could submit a change to the official chart and/or raise the issue with ambassador.
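The set-or-not-set behaviour described above can be modelled with a short Python sketch that contrasts a presence check with a value check. This is a model of the observed behaviour, not Ambassador's code; the variable name `AMBASSADOR_SINGLE_NAMESPACE` is an assumption (the comment above only says "the single namespace env var"), and both function names are hypothetical.

```python
from typing import Mapping

# Assumed variable name, used here for illustration only.
ENV_VAR = "AMBASSADOR_SINGLE_NAMESPACE"


def single_namespace_by_presence(env: Mapping[str, str]) -> bool:
    """Presence check: any value, even 'false' or '', turns the mode on."""
    return ENV_VAR in env


def single_namespace_by_value(env: Mapping[str, str]) -> bool:
    """Value check: only an explicit truthy string turns the mode on."""
    return env.get(ENV_VAR, "").strip().lower() in {"1", "true", "yes"}


for env in ({}, {ENV_VAR: ""}, {ENV_VAR: "false"}, {ENV_VAR: "true"}):
    print(env, single_namespace_by_presence(env), single_namespace_by_value(env))
```

Under the presence check, the only way to switch the mode off is to omit the variable entirely, which is why a chart that always renders the variable cannot offer an "off" setting through its value alone.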
I have created a branch with the above changes, changing the default for ambassador to cluster-wide as a workaround for the namespace var issue. The workaround can be reversed if the helm/charts PR gets merged. So I am closing this PR and replacing it with #480.
Tackles #258
Please review!
Thanks