-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IBM Cloud deployment improvements #447
Conversation
infrastructure/helm/quantumserverless/charts/gateway/templates/deployment.yaml
Outdated
Show resolved
Hide resolved
infrastructure/helm/quantumserverless/charts/repository/templates/deployment.yaml
Outdated
Show resolved
Hide resolved
@psschwei I don't know if you can help me debugging a problem that I found. Installing the helm I'm not being able to find |
Looks like the tag should be |
Totally right @psschwei , hours fighting yesterday and there was a |
Is this PR the same config we used to set up the cluster for the dev forum? |
Yes but with some on-real-time changes 😂 . Like for example we improved the |
I'm working on those points right now btw. |
@Tansito tests are failing because: #509 (comment) |
Yes, I was commenting w/ @psschwei and @IceKhan13 in the issue, thank you @pacomf ! |
@Tansito can you merge main here? tests are fixed now :) |
@@ -33,19 +33,19 @@ spec: | |||
{{- toYaml .Values.securityContext | nindent 12 }} | |||
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}" | |||
imagePullPolicy: {{ .Values.image.pullPolicy }} | |||
command: [ "/usr/src/app/entrypoint.sh", "gunicorn", "main.wsgi:application", "--bind", "0.0.0.0:8000", "--workers=3" ] | |||
args: [ "gunicorn", "main.wsgi:application", "--bind", "0.0.0.0:{{ .Values.service.port }}", "--workers=4" ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, that was a problem of not having migrations :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly, it seems that command
overrides the entrypoint
that you can have in your container (you always learn something new).
httpGet: | ||
path: /metrics | ||
port: http | ||
# livenessProbe: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only comment is this. Maybe we should uncomment this? :)
or /metrics
is not available? or do /api/v1/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are totally right. My two cents here from previous experiences is that the security team will ask us for two different end-points for liveness
and readiness
for a simple HTTP 200 and a DB access. For the DB access I think /api/v1
can work (I would need to confirm it) but for the HTTP 200 we will need a specific end-point. That's why I commented them. If you are agree @IceKhan13 I can create an issue for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense :) Yeah, let's add something like /health
or something like this in following PRs
this failed to install for me:
(looks like I broke something in ce747be -- seems like we need the grafana-agent CRDs or maybe even the entire operator...) not for this PR, but we may want to look into something like kustomize for some of the provider-specific overlays: https://jfrog.com/blog/power-up-helm-charts-using-kustomize-to-manage-kubernetes-deployments/ also, one minor nit: can we update the readme for IBM-specific install instructions? |
@psschwei I'm going to update this PR with
I'm always open for improvements. Before introduce more things in the infrastructure I would like to work too in the tests: #117 and #118
You are totally right. I will review it now. |
@psschwei I can confirm you. Regardless the warning in:
I could deploy it in IBM Cloud without a problem. |
Yeah, the warning has been there for a while (I think I opened an issue for it sometime back)... the errors were new for me (and also weird, because I didn't get them before...). On kustomize I'm thinking two things:
not any hurry on that, more of a day 2 / nice to have at this point. Agree testing would be better for now 😄 |
# Conflicts: # infrastructure/helm/quantumserverless/README.md # infrastructure/helm/quantumserverless/values.yaml
Someone this week recommended helmfile as a potentially better fit, so we have options 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
monumental work of setting this all up 👏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we go! 🚀
Configure providers in staging and prod
Summary
Testing our current configuration in IBM Cloud. Fix #423
Details and comments
Due to the differences between the deployment in local vs IBM Cloud I made more dynamic our Ingress configuration with the purpose to be able to support AWS too without the need to change the template if not only the values (as a normal user would do).
nginxconfig
in favor of adding those annotations iningress
.ray-ingress
to justingress
now that we are going to expose different applicationsingress
now is more dynamic so you can add the annotations and host configuration through the values file.gateway
+repository
are going to be exposed I changed theserviceType
of almost all the services toClusterIP
.machine_type
,workers_per_zone
...helm
integration interraform
due to now helm release requires values from the cluster after its creation (investigating if we can obtain them).ray-cluster
changed from a fixed worker to a dynamic set of workersResources