This repository has been archived by the owner on May 16, 2023. It is now read-only.

Out of memory error both on #1777

Open

AppElent opened this issue Feb 19, 2023 · 0 comments

Chart version:

Kubernetes version: 1.23.12

Kubernetes provider: AKS

Helm Version: 3.11

helm get release output

e.g. helm get elasticsearch (replace elasticsearch with the name of your helm release)

Be careful to obfuscate every secret (credentials, tokens, public IPs, ...) that could be visible in the output before copy-pasting.

If you find secrets in plain text in the helm get release output, you should use Kubernetes Secrets to manage them in a secure way (see the Security Example).
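As an illustration of that pattern (the Secret name below is hypothetical, not taken from the chart's Security Example), a credential can be injected from a Kubernetes Secret via `extraEnvs` instead of appearing in plain text in the values file:

```yaml
# Illustrative values fragment -- "es-credentials" is a hypothetical Secret
# that would be created separately with kubectl or another tool.
extraEnvs:
  - name: ELASTIC_PASSWORD
    valueFrom:
      secretKeyRef:
        name: es-credentials
        key: password
```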

Output of helm get release

NAME: elastic
LAST DEPLOYED: Sun Feb 19 09:22:30 2023
NAMESPACE: elastic
STATUS: deployed
REVISION: 1
USER-SUPPLIED VALUES:
antiAffinity: soft
antiAffinityTopologyKey: kubernetes.io/hostname
clusterHealthCheckParams: wait_for_status=green&timeout=1s
clusterName: elasticsearch
createCert: true
enableServiceLinks: true
envFrom: []
esConfig: {}
esJavaOpts: -Xms128m -Xmx128m
esJvmOptions: {}
esMajorVersion: ""
extraContainers: []
extraEnvs: []
extraInitContainers: []
extraVolumeMounts: []
extraVolumes: []
fullnameOverride: ""
healthNameOverride: ""
hostAliases: []
httpPort: 9200
image: docker.elastic.co/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 8.5.1
ingress:
  annotations: {}
  className: nginx
  enabled: true
  hosts:
  - host: elastic.appelent.com
    paths:
    - path: /
      pathtype: ImplementationSpecific
  tls: []
initResources:
  limits:
    cpu: 800m
    memory: 250Mi
  requests:
    cpu: 100m
    memory: 250Mi
keystore: []
labels: {}
lifecycle: {}
masterService: ""
maxUnavailable: 1
minimumMasterNodes: 1
nameOverride: ""
networkHost: 0.0.0.0
networkPolicy:
  http:
    enabled: false
  transport:
    enabled: false
nodeAffinity: {}
nodeGroup: master
nodeSelector: {}
persistence:
  annotations: {}
  enabled: true
  labels:
    enabled: false
podAnnotations: {}
podManagementPolicy: Parallel
podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000
podSecurityPolicy:
  create: false
  name: ""
  spec:
    fsGroup:
      rule: RunAsAny
    privileged: true
    runAsUser:
      rule: RunAsAny
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    volumes:
    - secret
    - configMap
    - persistentVolumeClaim
    - emptyDir
priorityClassName: ""
protocol: https
rbac:
  automountToken: true
  create: false
  serviceAccountAnnotations: {}
  serviceAccountName: ""
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 5
replicas: 1
resources:
  limits:
    cpu: 800m
    memory: 250Mi
  requests:
    cpu: 100m
    memory: 250Mi
roles:
- master
- data
- data_content
- data_hot
- data_warm
- data_cold
- ingest
- ml
- remote_cluster_client
- transform
schedulerName: ""
secret:
  enabled: true
  password: ""
secretMounts: []
securityContext:
  capabilities:
    drop:
    - ALL
  runAsNonRoot: true
  runAsUser: 1000
service:
  annotations: {}
  enabled: true
  externalTrafficPolicy: ""
  httpPortName: http
  labels: {}
  labelsHeadless: {}
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  nodePort: ""
  publishNotReadyAddresses: false
  transportPortName: transport
  type: ClusterIP
sysctlInitContainer:
  enabled: true
sysctlVmMaxMapCount: 262144
terminationGracePeriod: 120
tests:
  enabled: true
tolerations: []
transportPort: 9300
updateStrategy: RollingUpdate
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
  storageClassName: managed-csi

COMPUTED VALUES:
antiAffinity: soft
antiAffinityTopologyKey: kubernetes.io/hostname
clusterHealthCheckParams: wait_for_status=green&timeout=1s
clusterName: elasticsearch
createCert: true
enableServiceLinks: true
envFrom: []
esConfig: {}
esJavaOpts: -Xms128m -Xmx128m
esJvmOptions: {}
esMajorVersion: ""
extraContainers: []
extraEnvs: []
extraInitContainers: []
extraVolumeMounts: []
extraVolumes: []
fullnameOverride: ""
healthNameOverride: ""
hostAliases: []
httpPort: 9200
image: docker.elastic.co/elasticsearch/elasticsearch
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 8.5.1
ingress:
  annotations: {}
  className: nginx
  enabled: true
  hosts:
  - host: elastic.appelent.com
    paths:
    - path: /
      pathtype: ImplementationSpecific
  tls: []
initResources:
  limits:
    cpu: 800m
    memory: 250Mi
  requests:
    cpu: 100m
    memory: 250Mi
keystore: []
labels: {}
lifecycle: {}
masterService: ""
maxUnavailable: 1
minimumMasterNodes: 1
nameOverride: ""
networkHost: 0.0.0.0
networkPolicy:
  http:
    enabled: false
  transport:
    enabled: false
nodeAffinity: {}
nodeGroup: master
nodeSelector: {}
persistence:
  annotations: {}
  enabled: true
  labels:
    enabled: false
podAnnotations: {}
podManagementPolicy: Parallel
podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000
podSecurityPolicy:
  create: false
  name: ""
  spec:
    fsGroup:
      rule: RunAsAny
    privileged: true
    runAsUser:
      rule: RunAsAny
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    volumes:
    - secret
    - configMap
    - persistentVolumeClaim
    - emptyDir
priorityClassName: ""
protocol: https
rbac:
  automountToken: true
  create: false
  serviceAccountAnnotations: {}
  serviceAccountName: ""
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 3
  timeoutSeconds: 5
replicas: 1
resources:
  limits:
    cpu: 800m
    memory: 250Mi
  requests:
    cpu: 100m
    memory: 250Mi
roles:
- master
- data
- data_content
- data_hot
- data_warm
- data_cold
- ingest
- ml
- remote_cluster_client
- transform
schedulerName: ""
secret:
  enabled: true
  password: ""
secretMounts: []
securityContext:
  capabilities:
    drop:
    - ALL
  runAsNonRoot: true
  runAsUser: 1000
service:
  annotations: {}
  enabled: true
  externalTrafficPolicy: ""
  httpPortName: http
  labels: {}
  labelsHeadless: {}
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  nodePort: ""
  publishNotReadyAddresses: false
  transportPortName: transport
  type: ClusterIP
sysctlInitContainer:
  enabled: true
sysctlVmMaxMapCount: 262144
terminationGracePeriod: 120
tests:
  enabled: true
tolerations: []
transportPort: 9300
updateStrategy: RollingUpdate
volumeClaimTemplate:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
  storageClassName: managed-csi

HOOKS:
---
# Source: elasticsearch/templates/test/test-elasticsearch-health.yaml
apiVersion: v1
kind: Pod
metadata:
  name: "elastic-alhvw-test"
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  securityContext:
    fsGroup: 1000
    runAsUser: 1000
  containers:
  - name: "elastic-xrbtk-test"
    env:
      - name: ELASTIC_PASSWORD
        valueFrom:
          secretKeyRef:
            name: elasticsearch-master-credentials
            key: password
    image: "docker.elastic.co/elasticsearch/elasticsearch:8.5.1"
    imagePullPolicy: "IfNotPresent"
    command:
      - "sh"
      - "-c"
      - |
        #!/usr/bin/env bash -e
        curl -XGET --fail --cacert /usr/share/elasticsearch/config/certs/tls.crt -u "elastic:${ELASTIC_PASSWORD}" https://'elasticsearch-master:9200/_cluster/health?wait_for_status=green&timeout=1s'
    volumeMounts:
      - name: elasticsearch-certs
        mountPath: /usr/share/elasticsearch/config/certs
        readOnly: true
  restartPolicy: Never
  volumes:
    - name: elasticsearch-certs
      secret:
        secretName: elasticsearch-master-certs
MANIFEST:

---
# Source: elasticsearch/templates/poddisruptionbudget.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: "elasticsearch-master-pdb"
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: "elasticsearch-master"

---
# Source: elasticsearch/templates/secret-cert.yaml
apiVersion: v1
kind: Secret
type: kubernetes.io/tls
metadata:
  name: elasticsearch-master-certs
  labels:
    app: elasticsearch-master
    chart: "elasticsearch"
    heritage: Helm
    release: elastic
data:
  tls.crt:
  tls.key:
  ca.crt:

---
# Source: elasticsearch/templates/secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: elasticsearch-master-credentials
  labels:
    heritage: "Helm"
    release: "elastic"
    chart: "elasticsearch"
    app: "elasticsearch-master"
type: Opaque
data:
  username: ZWxhc3RpYw==
  password: ""

---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Helm"
    release: "elastic"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    {}
spec:
  type: ClusterIP
  selector:
    release: "elastic"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  publishNotReadyAddresses: false
  ports:
  - name: http
    protocol: TCP
    port: 9200
  - name: transport
    protocol: TCP
    port: 9300

---
# Source: elasticsearch/templates/service.yaml
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch-master-headless
  labels:
    heritage: "Helm"
    release: "elastic"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  clusterIP: None # This is needed for statefulset hostnames like elasticsearch-0 to resolve
  # Create endpoints also if the related pod isn't ready
  publishNotReadyAddresses: true
  selector:
    app: "elasticsearch-master"
  ports:
  - name: http
    port: 9200
  - name: transport
    port: 9300

---
# Source: elasticsearch/templates/statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-master
  labels:
    heritage: "Helm"
    release: "elastic"
    chart: "elasticsearch"
    app: "elasticsearch-master"
  annotations:
    esMajorVersion: "8"
spec:
  serviceName: elasticsearch-master-headless
  selector:
    matchLabels:
      app: "elasticsearch-master"
  replicas: 1
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-master
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 30Gi
      storageClassName: managed-csi
  template:
    metadata:
      name: "elasticsearch-master"
      labels:
        release: "elastic"
        chart: "elasticsearch"
        app: "elasticsearch-master"
      annotations:
    spec:
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      automountServiceAccountToken: true
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - "elasticsearch-master"
      terminationGracePeriodSeconds: 120
      volumes:
        - name: elasticsearch-certs
          secret:
            secretName: elasticsearch-master-certs
      enableServiceLinks: true
      initContainers:
      - name: configure-sysctl
        securityContext:
          runAsUser: 0
          privileged: true
        image: "docker.elastic.co/elasticsearch/elasticsearch:8.5.1"
        imagePullPolicy: "IfNotPresent"
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        resources:
          limits:
            cpu: 800m
            memory: 250Mi
          requests:
            cpu: 100m
            memory: 250Mi
      containers:
      - name: "elasticsearch"
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
        image: "docker.elastic.co/elasticsearch/elasticsearch:8.5.1"
        imagePullPolicy: "IfNotPresent"
        readinessProbe:
          exec:
            command:
              - bash
              - -c
              - |
                set -e

                # Exit if ELASTIC_PASSWORD in unset
                if [ -z "${ELASTIC_PASSWORD}" ]; then
                  echo "ELASTIC_PASSWORD variable is missing, exiting"
                  exit 1
                fi

                # If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
                # Once it has started only check that the node itself is responding
                START_FILE=/tmp/.es_start_file

                # Disable nss cache to avoid filling dentry cache when calling curl
                # This is required with Elasticsearch Docker using nss < 3.52
                export NSS_SDB_USE_CACHE=no

                http () {
                  local path="${1}"
                  local args="${2}"
                  set -- -XGET -s

                  if [ "$args" != "" ]; then
                    set -- "$@" $args
                  fi

                  set -- "$@" -u "elastic:${ELASTIC_PASSWORD}"

                  curl --output /dev/null -k "$@" "https://127.0.0.1:9200${path}"
                }

                if [ -f "${START_FILE}" ]; then
                  echo 'Elasticsearch is already running, lets check the node is healthy'
                  HTTP_CODE=$(http "/" "-w %{http_code}")
                  RC=$?
                  if [[ ${RC} -ne 0 ]]; then
                    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} https://127.0.0.1:9200/ failed with RC ${RC}"
                    exit ${RC}
                  fi
                  # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
                  if [[ ${HTTP_CODE} == "200" ]]; then
                    exit 0
                  elif [[ ${HTTP_CODE} == "503" && "8" == "6" ]]; then
                    exit 0
                  else
                    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} https://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
                    exit 1
                  fi

                else
                  echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )'
                  if http "/_cluster/health?wait_for_status=green&timeout=1s" "--fail" ; then
                    touch ${START_FILE}
                    exit 0
                  else
                    echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
                    exit 1
                  fi
                fi
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 5
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          limits:
            cpu: 800m
            memory: 250Mi
          requests:
            cpu: 100m
            memory: 250Mi
        env:
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: cluster.initial_master_nodes
            value: "elasticsearch-master-0,"
          - name: node.roles
            value: "master,data,data_content,data_hot,data_warm,data_cold,ingest,ml,remote_cluster_client,transform,"
          - name: discovery.seed_hosts
            value: "elasticsearch-master-headless"
          - name: cluster.name
            value: "elasticsearch"
          - name: network.host
            value: "0.0.0.0"
          - name: ELASTIC_PASSWORD
            valueFrom:
              secretKeyRef:
                name: elasticsearch-master-credentials
                key: password
          - name: ES_JAVA_OPTS
            value: "-Xms128m -Xmx128m"
          - name: xpack.security.enabled
            value: "true"
          - name: xpack.security.transport.ssl.enabled
            value: "true"
          - name: xpack.security.http.ssl.enabled
            value: "true"
          - name: xpack.security.transport.ssl.verification_mode
            value: "certificate"
          - name: xpack.security.transport.ssl.key
            value: "/usr/share/elasticsearch/config/certs/tls.key"
          - name: xpack.security.transport.ssl.certificate
            value: "/usr/share/elasticsearch/config/certs/tls.crt"
        volumeMounts:
          - name: elasticsearch-certs
            mountPath: /usr/share/elasticsearch/config/certs
            readOnly: true

---
# Source: elasticsearch/templates/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: elasticsearch-master
  labels:
    app: elasticsearch
    release: elastic
    heritage: Helm
spec:
  ingressClassName: "nginx"
  rules:
  - host: elastic.appelent.com
    http:
      paths:
      - path: /
        pathType: ImplementationSpecific
        backend:
          service:
            name: elasticsearch-master
            port:
              number: 9200
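One thing worth noting about this manifest: the chart is serving HTTPS on port 9200 (`protocol: https` in the values), while an Ingress backend is proxied as plain HTTP by default. With the community ingress-nginx controller (an assumption here, based on the `nginx` class name), that would typically also require the backend-protocol annotation, sketched below:

```yaml
# Illustrative fragment: tell ingress-nginx to proxy to the backend over TLS
# instead of plain HTTP. Only applies to the community ingress-nginx controller.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
```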

NOTES:

  1. Watch all cluster members come up.
    $ kubectl get pods --namespace=elastic -l app=elasticsearch-master -w
  2. Retrieve elastic user's password.
    $ kubectl get secrets --namespace=elastic elasticsearch-master-credentials -ojsonpath='{.data.password}' | base64 -d
  3. Test cluster health using Helm test.
    $ helm --namespace=elastic test elastic

Describe the bug:
I have a small personal setup and I want to try Elasticsearch, but I keep getting out-of-memory errors. Is there a minimal setup that works? My cluster shows this:

PS D:\Dev\appelent-monorepo\aks\monitoring> kubectl get pods --namespace=elastic -l app=elasticsearch-master -w
NAME                     READY   STATUS     RESTARTS   AGE
elasticsearch-master-0   0/1     Init:0/1   0          6s
elasticsearch-master-0   0/1     Init:0/1   0          58s
elasticsearch-master-0   0/1     PodInitializing   0          64s
elasticsearch-master-0   0/1     Running           0          65s
elasticsearch-master-0   0/1     OOMKilled         0          81s
elasticsearch-master-0   0/1     Running           1 (2s ago)   82s
elasticsearch-master-0   0/1     OOMKilled         1 (17s ago)   97s
elasticsearch-master-0   0/1     CrashLoopBackOff   1 (14s ago)   111s
elasticsearch-master-0   0/1     Running            2 (14s ago)   111s
elasticsearch-master-0   0/1     OOMKilled          2 (30s ago)   2m7s
elasticsearch-master-0   0/1     CrashLoopBackOff   2 (11s ago)   2m18s
elasticsearch-master-0   0/1     Running            3 (26s ago)   2m33s
elasticsearch-master-0   0/1     OOMKilled          3 (42s ago)   2m49s
elasticsearch-master-0   0/1     CrashLoopBackOff   3 (11s ago)   3m
elasticsearch-master-0   0/1     Running            4 (51s ago)   3m40s
elasticsearch-master-0   0/1     OOMKilled          4 (66s ago)   3m55s
elasticsearch-master-0   0/1     CrashLoopBackOff   4 (15s ago)   4m9s
elasticsearch-master-0   0/1     Running            5 (82s ago)   5m16s
elasticsearch-master-0   0/1     OOMKilled          5 (97s ago)   5m31s
elasticsearch-master-0   0/1     CrashLoopBackOff   5 (15s ago)   5m46s
elasticsearch-master-0   0/1     Running            6 (2m43s ago)   8m14s
elasticsearch-master-0   0/1     OOMKilled          6 (2m59s ago)   8m30s
elasticsearch-master-0   0/1     CrashLoopBackOff   6 (11s ago)     8m41s

I tried different settings; I am running with just 1 replica.
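For what it's worth, the supplied values give the container a 250Mi memory limit while `ES_JAVA_OPTS` pins the heap at 128Mi, leaving only ~122Mi for everything off-heap; Elasticsearch 8.x with security enabled typically needs considerably more than that, which matches the repeated OOMKilled status above. A values override along these lines (the numbers are illustrative, not a recommendation from the chart) keeps the heap at roughly half the container limit:

```yaml
# Illustrative sizing only: heap ~50% of the container memory limit,
# requests equal to limits so the pod gets the Guaranteed QoS class.
esJavaOpts: "-Xms512m -Xmx512m"
resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: "1"
    memory: 1Gi
```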

Steps to reproduce:

Expected behavior:

Provide logs and/or server output (if relevant):

Be careful to obfuscate every secret (credentials, tokens, public IPs, ...) that could be visible in the output before copy-pasting.

Any additional context:
