
Non-functional deployment with helm chart #562

Closed
maxime-sourdin opened this issue Sep 26, 2022 · 5 comments · Fixed by #2045

Comments

@maxime-sourdin

Hello,
I tried to deploy OnCall with the Helm chart (on a managed Kubernetes cluster, via ArgoCD), using the built-in MySQL.

I am running into problems because the database migration jobs are failing:

amqp.exceptions.AccessRefused: (0, 0): (403) ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfile.

When I tested with the provided docker-compose setup, I had the same problem.

Here is an extract from the values file:

base_url: example.com
image:
  repository: grafana/oncall
  tag: "v1.0.37"
  pullPolicy: IfNotPresent
service:
  enabled: false
  type: LoadBalancer
  port: 8080
  annotations: {}
engine:
  replicaCount: 1
  resources: {}
celery:
  replicaCount: 1
  resources: {}
oncall:
  slack:
    enabled: false
    command: ~
    clientId: ~
    clientSecret: ~
    apiToken: ~
    apiTokenCommon: ~
  telegram:
    enabled: false
    token: ~
    webhookUrl: ~
migrate:
  enabled: true
env: []
ingress:
  enabled: false
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/issuer: "letsencrypt-prod"
  tls: 
    - hosts:
        - "{{ .Values.base_url }}"
      secretName: certificate-tls
  extraPaths: []
ingress-nginx:
  enabled: false
cert-manager:
  enabled: false
  installCRDs: false
  webhook:
    timeoutSeconds: 30
    securePort: 10260
  podDnsPolicy: None
  podDnsConfig:
    nameservers:
      - 8.8.8.8
      - 1.1.1.1
mariadb:
  enabled: true
  persistence:
    enabled: true
    storageClass: "csi-ssd-disk-topology"  
  auth:
    database: oncall
  primary:
    persistence:
      enabled: true
      storageClass: "csi-ssd-disk-topology"  
    extraEnvVars:
    - name: MARIADB_COLLATE
      value: utf8mb4_unicode_ci
    - name: MARIADB_CHARACTER_SET
      value: utf8mb4
  secondary:
    persistence:
      enabled: true
      storageClass: "csi-ssd-disk-topology"  
    extraEnvVars:
    - name: MARIADB_COLLATE
      value: utf8mb4_unicode_ci
    - name: MARIADB_CHARACTER_SET
      value: utf8mb4
externalMysql:
  host:
  port:
  db_name:
  user:
  password:
rabbitmq:
  enabled: true
  persistence:
    enabled: true
    storageClass: "csi-ssd-disk-topology"  
externalRabbitmq:
  host:
  port:
  user:
  password:
  protocol:
  vhost:
redis:
  enabled: true
  architecture: standalone
  replica:
    count: 1
    persistence:
      enabled: true
      storageClass: "csi-ssd-disk-topology"
  master:
    count: 1
    persistence:
      enabled: true
      storageClass: "csi-ssd-disk-topology"  
externalRedis:
  host:
  password:
grafana:
  enabled: false
  grafana.ini:
    server:
      domain: example.com
      root_url: "%(protocol)s://%(domain)s/grafana"
      serve_from_sub_path: true
  persistence:
    enabled: false
  plugins:
    - grafana-oncall-app
nameOverride: ""
fullnameOverride: ""
serviceAccount:
  create: true
  annotations: {}
  name: ""
podAnnotations: {}
podSecurityContext: {}
  # fsGroup: 2000
securityContext: {}
init:
  securityContext: {}

What could be the cause of this problem?

@Matvey-Kuk
Contributor

Hi! AMQP makes me think it's about network connectivity between RabbitMQ and the container performing the migration.
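That said, ACCESS_REFUSED with the PLAIN mechanism is an authentication failure, so it can also be worth comparing the password in the generated RabbitMQ Secret with the credentials the migration container actually receives. A minimal sketch, assuming a release named "oncall" and an engine deployment named "oncall-engine" (both names are assumptions, adjust to your install):

# Password the Bitnami RabbitMQ subchart stored in its Secret
kubectl get secret oncall-rabbitmq -o jsonpath='{.data.rabbitmq-password}' | base64 -d; echo

# RabbitMQ-related settings the engine pods were actually started with
kubectl exec deploy/oncall-engine -- env | grep -i rabbit

If the RabbitMQ pod reuses a PersistentVolume from an earlier install, the password on disk can differ from the one in the current Secret, which produces the same ACCESS_REFUSED error.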

@maxime-sourdin
Author

> Hi! AMQP makes me think it's about network connectivity between RabbitMQ and the container performing the migration.

Hello,
I redeployed OnCall with new volumes; RabbitMQ is now OK.

2022-09-27 15:00:00.098389+00:00 [info] <0.4429.0> accepting AMQP connection <0.4429.0> (172.16.0.177:42506 -> 172.16.0.47:5672)
2022-09-27 15:00:00.101205+00:00 [info] <0.4429.0> connection <0.4429.0> (172.16.0.177:42506 -> 172.16.0.47:5672): user 'user' authenticated and granted access to vhost '/'

But now I'm getting these messages:

Operations to perform:
  Apply all migrations: admin, alerts, auth, auth_token, base, contenttypes, heartbeat, migration_tool, oss_installation, push_notifications, schedules, sessions, silk, slack, social_django, telegram, twilioapp, user_management
Running migrations:
  No migrations to apply.
  Your models in app(s): 'push_notifications', 'silk', 'social_django' have changes that are not yet reflected in a migration, and so won't be applied.
  Run 'manage.py makemigrations' to make new migrations, and then re-run 'manage.py migrate' to apply them.

The last time I tried to make these migrations, it got stuck.

@iskhakov
Contributor

iskhakov commented Oct 6, 2022

A message like this means that all the migrations have already been applied.
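If you want to double-check that from inside the cluster, Django's showmigrations command lists each migration with an [X] when it is applied. A minimal sketch, assuming the engine deployment is named "oncall-engine" and manage.py is on the container's working path:

kubectl exec deploy/oncall-engine -- python manage.py showmigrations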

@maxime-sourdin
Author

Hello @iskhakov,
so the problem is not the migration; I'll change the title of the issue then.

I don't see any error other than the migration message (which, as you said, isn't actually a problem), but the oncall-engine pod keeps restarting.

I also just noticed this error:
2022-10-06 07:54:44 lock engine: pthread robust mutexes
2022-10-06 07:54:44 thunder lock: disabled (you can enable it with --thunder-lock)
2022-10-06 07:54:44 Listen queue size is greater than the system max net.core.somaxconn (128).

@maxime-sourdin changed the title from "Non-functional database migrations during deployment with helm chart" to "Non-functional deployment with helm chart" on Oct 6, 2022
@MadEngineX

MadEngineX commented Nov 15, 2022

> Hello @iskhakov, so the problem is not the migration; I'll change the title of the issue then.
>
> I don't see any error other than the migration message (which, as you said, isn't actually a problem), but the oncall-engine pod keeps restarting.
>
> I also just noticed this error:
> 2022-10-06 07:54:44 lock engine: pthread robust mutexes
> 2022-10-06 07:54:44 thunder lock: disabled (you can enable it with --thunder-lock)
> 2022-10-06 07:54:44 Listen queue size is greater than the system max net.core.somaxconn (128).

I faced the same issue. I found that somebody had removed the net.core.somaxconn property for docker-compose in #84. I removed the UWSGI_LISTEN env var from the k8s Deployment, and the engine starts.
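If patching the Deployment by hand is awkward (for example because ArgoCD keeps re-syncing it), a workaround in the same spirit is to cap the listen queue through the chart's env: list instead of removing the variable. A minimal sketch, assuming the chart passes env: entries through to the engine container:

env:
  - name: UWSGI_LISTEN
    value: "128"  # keep at or below net.core.somaxconn (128 by default)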
