lws controller cannot reconcile the pod to the right status #391

Closed
wangyuan249 opened this issue Feb 17, 2025 · 10 comments · May be fixed by #394
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@wangyuan249

wangyuan249 commented Feb 17, 2025

What happened:
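The LWS controller cannot reconcile the pods to the right state. After applying the manifest below, it keeps creating worker StatefulSets for ever more deeply nested pod names (sglang-0, sglang-0-0, sglang-0-0-0, and so on), as shown in the events further down.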

kubectl apply -f deepseek-lws.yaml
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: sglang
  namespace: sg-system
  labels:
    app: sglang
spec:
  replicas: 1
  startupPolicy: LeaderCreated
  rolloutStrategy:
    type: RollingUpdate
    rollingUpdateConfiguration:
      maxSurge: 0
      maxUnavailable: 2
  leaderWorkerTemplate:
    size: 2
    restartPolicy: RecreateGroupOnPodRestart
    leaderTemplate:
      metadata:
        labels:
          role: leader
      spec:
        containers:
          - name: sglang-head
            image: ccr.ccs.tencentyun.com/kason/sglang:v0.4.2.post4-cu125
            imagePullPolicy: IfNotPresent
            workingDir: /sgl-workspace
            command: ["sh", "-c"]
            args:
            - >
              cd /sgl-workspace && python3 -m sglang.launch_server
              --model-path /root/.cache/modelscope/DeepSeek-R1
              --served-model-name deepseek-r1
              --tp 16
              --dist-init-addr $LWS_LEADER_ADDRESS:20000
              --nnodes $LWS_GROUP_SIZE
              --node-rank 0
              --trust-remote-code
              --context-length 131072
              --enable-metrics
              --host 0.0.0.0
              --port 8000
            env:
              - name: GLOO_SOCKET_IFNAME
                value: eth0
              - name: NCCL_IB_HCA
                value: "mlx5_0,mlx5_1,mlx5_4,mlx5_5"
              - name: NCCL_P2P_LEVEL
                value: "NVL"
              - name: NCCL_IB_GID_INDEX
                value: "0"
              - name: NCCL_IB_CUDA_SUPPORT
                value: "1"
              - name: NCCL_IB_DISABLE
                value: "0"
              - name: NCCL_SOCKET_IFNAME
                value: "eth0"
                #value: "ibs13,ibs11,ibs15,ibs17"
              - name: NCCL_DEBUG
                value: "INFO"
              - name: NCCL_NET_GDR_LEVEL
                value: "2"
              - name: POD_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              - name: SGLANG_USE_MODELSCOPE
                value: "true"
            ports:
            - containerPort: 8000
              name: http
              protocol: TCP
            - containerPort: 20000
              name: distributed
              protocol: TCP
            resources:
              limits:
                cpu: "128"
                memory: "1Ti"
                nvidia.com/gpu: "8"
                rdma/ib: "4"
              requests:
                cpu: "128"
                memory: "1Ti"
                nvidia.com/gpu: "8"
                rdma/ib: "4"
            securityContext:
              capabilities:
                add:
                - IPC_LOCK
                - SYS_PTRACE
            volumeMounts:
              - mountPath: /root/.cache/modelscope
                name: modelscope-cache
              - mountPath: /dev/shm
                name: shm-volume
              - name: localtime
                mountPath: /etc/localtime
                readOnly: true
            readinessProbe:
              tcpSocket:
                port: 8000
              initialDelaySeconds: 120
              periodSeconds: 30
        volumes:
          - name: modelscope-cache
            hostPath:
              path: /file_CPU_01/modelServing
          - name: shm-volume
            emptyDir:
              sizeLimit: 512Gi
              medium: Memory
          - name: localtime
            hostPath:
              path: /etc/localtime
              type: File
        schedulerName: volcano
    workerTemplate:
      metadata:
        name: sglang-worker
      spec:
        containers:
          - name: sglang-worker
            image: ccr.ccs.tencentyun.com/kason/sglang:v0.4.2.post4-cu125
            imagePullPolicy: IfNotPresent
            workingDir: /sgl-workspace
            command: ["sh", "-c"]
            args:
            - >
              cd /sgl-workspace && python3 -m sglang.launch_server
              --model-path /home/bmm-system/data/ckpt/deepseek/DeepSeek-R1
              --served-model-name deepseek-r1
              --tp 16
              --dist-init-addr $LWS_LEADER_ADDRESS:20000
              --nnodes $LWS_GROUP_SIZE
              --node-rank $LWS_WORKER_INDEX
              --trust-remote-code
              --context-length 131072
              --enable-metrics
              --host 0.0.0.0
              --port 8000
            env:
              - name: GLOO_SOCKET_IFNAME
                value: eth0
              - name: NCCL_IB_HCA
                value: "mlx5_0,mlx5_1,mlx5_4,mlx5_5"
              - name: NCCL_P2P_LEVEL
                value: "NVL"
              - name: NCCL_IB_GID_INDEX
                value: "0"
              - name: NCCL_IB_CUDA_SUPPORT
                value: "1"
              - name: NCCL_IB_DISABLE
                value: "0"
              - name: NCCL_SOCKET_IFNAME
                value: "eth0"
                #value: "ibs13,ibs11,ibs15,ibs17"
              - name: NCCL_DEBUG
                value: "INFO"
              - name: NCCL_NET_GDR_LEVEL
                value: "2"
              - name: SGLANG_USE_MODELSCOPE
                value: "true"
              - name: LWS_WORKER_INDEX
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.labels['leaderworkerset.sigs.k8s.io/worker-index']
            ports:
            - containerPort: 8000
              name: http
              protocol: TCP
            - containerPort: 20000
              name: distributed
              protocol: TCP
            resources:
              limits:
                cpu: "128"
                memory: "1Ti"
                nvidia.com/gpu: "8"
                rdma/ib: "4"
              requests:
                cpu: "128"
                memory: "1Ti"
                nvidia.com/gpu: "8"
                rdma/ib: "4"
            securityContext:
              capabilities:
                add:
                - IPC_LOCK
                - SYS_PTRACE
            volumeMounts:
            - mountPath: /root/.cache/modelscope
              name: modelscope-cache
            - mountPath: /dev/shm
              name: shm-volume
            - name: localtime
              mountPath: /etc/localtime
              readOnly: true
        nodeSelector:
          glm.ai/app: infer
        volumes:
        - name: modelscope-cache
          hostPath:
            path: /file_CPU_01/modelServing
        - name: shm-volume
          emptyDir:
            sizeLimit: 512Gi
            medium: Memory
        - name: localtime
          hostPath:
            path: /etc/localtime
            type: File
        schedulerName: volcano
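
As a sanity check on the manifest above, the --tp value should line up with the GPUs available in one leader/worker group. The sketch below assumes that tensor parallelism spans every GPU in the group and that node ranks run from 0 (the leader) to size - 1; the values are copied by hand from the manifest, and expected_tp is a hypothetical helper, not part of LWS or sglang.

```python
def expected_tp(group_size: int, gpus_per_pod: int) -> int:
    """Total GPUs in one leader/worker group (assumed to equal --tp)."""
    return group_size * gpus_per_pod


# Values copied by hand from the manifest above.
group_size = 2     # leaderWorkerTemplate.size
gpus_per_pod = 8   # resources.limits["nvidia.com/gpu"]
tp = 16            # --tp passed to sglang.launch_server

# The leader hard-codes --node-rank 0; workers read LWS_WORKER_INDEX.
# Assumption: worker indices within a group run from 1 to size - 1.
node_ranks = list(range(group_size))

if expected_tp(group_size, gpus_per_pod) != tp:
    raise SystemExit("--tp does not match size * GPUs per pod")
print("node ranks:", node_ranks)
```

With the values from this manifest the check passes, so the GPU arithmetic is consistent with the group size.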


Status:
  Conditions:
    Last Transition Time:  2025-02-17T12:31:55Z
    Message:               Replicas are progressing
    Reason:                GroupsProgressing
    Status:                True
    Type:                  Progressing
  Hpa Pod Selector:        leaderworkerset.sigs.k8s.io/name=sglang,leaderworkerset.sigs.k8s.io/worker-index=0
  Replicas:                1
  Updated Replicas:        24
Events:
  Type    Reason             Age                 From             Message
  ----    ------             ----                ----             -------
  Normal  CreatingRevision   36s                 leaderworkerset  Creating revision with key 7c7756fcbb for a newly created LeaderWorkerSet
  Normal  GroupsProgressing  36s                 leaderworkerset  Created leader statefulset sglang
  Normal  GroupsProgressing  36s (x2 over 36s)   leaderworkerset  Replicas are progressing, with 0 groups ready of total 1 groups
  Normal  GroupsProgressing  36s                 leaderworkerset  Created worker statefulset for leader pod sglang-0
  Normal  GroupsProgressing  36s                 leaderworkerset  Created worker statefulset for leader pod sglang-0-0
  Normal  GroupsProgressing  36s                 leaderworkerset  Created worker statefulset for leader pod sglang-0-0-0
  Normal  GroupsProgressing  36s                 leaderworkerset  Created worker statefulset for leader pod sglang-0-0-0-0
  Normal  GroupsProgressing  36s                 leaderworkerset  Created worker statefulset for leader pod sglang-0-0-0-0-0
  Normal  GroupsProgressing  36s                 leaderworkerset  Created worker statefulset for leader pod sglang-0-0-0-0-0-0
  Normal  GroupsProgressing  36s                 leaderworkerset  Created worker statefulset for leader pod sglang-0-0-0-0-0-0-0
  Normal  GroupsUpdating     36s                 leaderworkerset  Rolling Upgrade is in progress, with 0 groups ready of total 1 groups
  Normal  GroupsProgressing  33s (x13 over 36s)  leaderworkerset  (combined from similar events): Created worker statefulset for leader pod sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0
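
The events above show worker StatefulSets being created for pods whose names keep gaining a -0 suffix. Assuming the usual LWS naming, where a leader pod is <lws-name>-<group-index> and a worker pod is <lws-name>-<group-index>-<worker-index>, any pod with more than two numeric suffixes is unexpected. The following is a minimal sketch for flagging such names; suffix_depth and is_runaway are hypothetical helpers, not part of LWS.

```python
import re


def suffix_depth(pod_name: str, lws_name: str) -> int:
    """Count the numeric '-N' segments that follow the LWS name.

    Returns 0 when the pod name does not follow the
    '<lws_name>-N[-N...]' pattern at all.
    """
    if not pod_name.startswith(lws_name):
        return 0
    rest = pod_name[len(lws_name):]
    if re.fullmatch(r"(?:-\d+)+", rest) is None:
        return 0
    return rest.count("-")


def is_runaway(pod_name: str, lws_name: str, max_depth: int = 2) -> bool:
    """True when the name is nested deeper than leader (1) or worker (2)."""
    return suffix_depth(pod_name, lws_name) > max_depth


# Names taken from the events above.
for name in ["sglang-0", "sglang-0-0", "sglang-0-0-0", "sglang-0-0-0-0"]:
    print(name, suffix_depth(name, "sglang"), is_runaway(name, "sglang"))
```

Under that naming assumption, only sglang-0 (leader) and sglang-0-0 (worker, if indices start there) would be legitimate; everything deeper points at worker pods being treated as leaders.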

What you expected to happen:


How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
  • LWS version (use git describe --tags --dirty --always):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
@wangyuan249 wangyuan249 added the kind/bug Categorizes issue or PR as related to a bug. label Feb 17, 2025
@ardaguclu
Contributor

Would kubectl get lws/sglang -oyaml highlight the issue?

@wangyuan249
Author

I tried it, but kubectl get lws/sglang -oyaml showed nothing useful.

Here are some logs from the lws-controller pods:

2025-02-17T08:15:32Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"default"}, "namespace": "default", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "ccb77735-c278-4697-abcc-ebc7fa36828d"}
2025-02-17T08:15:32Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"default"}, "namespace": "default", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "ef783140-59c4-459f-a50a-6695db067dbc"}
2025-02-17T08:15:34Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"default"}, "namespace": "default", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "0ff24cfb-e94d-4ea3-badc-46366aa505c4"}
2025-02-17T08:15:36Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"default"}, "namespace": "default", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "c8986930-9709-4cb7-98d1-a3484859f15e"}
2025-02-17T08:15:41Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"default"}, "namespace": "default", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "82edabe1-fdd7-4493-92f7-c15e94181c0f"}
2025-02-17T08:15:52Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"default"}, "namespace": "default", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "11dc6474-efb6-411a-9150-7d313d1b4daf"}
2025-02-17T08:16:12Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"default"}, "namespace": "default", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "66ee5c3b-ff85-4fa3-8dd3-a1c8a718d32a"}
2025-02-17T08:16:53Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"default"}, "namespace": "default", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "4f25fe8f-10a9-4944-a998-b9b2043506e3"}
2025-02-17T08:18:15Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"default"}, "namespace": "default", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "17f6bf13-e81d-4896-9dc8-1e9e82667ebb"}
2025-02-17T08:22:48Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "5257a663-bbef-4e58-bade-507ff5ff5da9"}
2025-02-17T08:22:48Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "340fac61-99d5-49d4-968f-8b968d578c8f"}
2025-02-17T08:22:48Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "78420896-5d19-459c-802c-dc45768686c3"}
2025/02/17 08:22:48 http: TLS handshake error from 10.76.233.2:24727: EOF
2025-02-17T08:22:48Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "de191f2b-39fa-4bae-b66d-24b43105f8e7"}
2025-02-17T08:22:48Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "47feb47c-985b-4b26-b630-5665e0f84599"}
2025-02-17T08:22:48Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "ad2094cd-4643-44da-9900-3be2904a8fd1"}
2025-02-17T08:22:48Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "820ddd95-050e-46a2-8ec5-f6b952c1a1df"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "1640b651-ba29-4438-8869-04d16b41901d"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "7ff0b7f4-b89e-42a8-8314-98655ed81f6a"}
2025/02/17 08:22:49 http: TLS handshake error from 10.76.233.2:57343: EOF
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "ccb5f990-3ff1-49a4-9a47-7d0b1608c74a"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "4a6f6060-3115-413e-bdb4-3ead4a637a26"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "18e96bd8-e3be-4f44-a135-feb997ca0915"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "945a1b5b-2922-46eb-9373-686150d355de"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "92ff6473-4524-4046-8b17-1bc5346281c4"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "2cf54b1f-1486-45da-9e17-8ea8562580c0"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "8cce681b-eac6-4134-92ed-597f119f9e85"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "455e7616-8064-4783-bf02-9553394f8a08"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "a3f2d469-01ea-44a8-b6e8-e540f33f40f7"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "fe0cb48c-2506-4bad-8164-08ec8dbd2fb0"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "040084c9-539e-49a5-9dd6-e63d1a8120f2"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "f7185df7-c087-4b08-96d9-1ce4bad1a8b8"}
2025-02-17T08:22:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "b70a5ae7-56b4-4f31-bbf6-10a684b96083"}
2025-02-17T08:22:50Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "af3138de-86f0-414f-b329-11f40b13c3d7"}
2025-02-17T08:22:50Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "aebb944b-e179-4d43-a191-7ed30b69555e"}
2025-02-17T08:22:50Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "75a54d4d-6a7f-4c20-b651-16accb964763"}
2025-02-17T08:22:50Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "bc8a3f40-7dab-4376-a97c-6b7bd2a19842"}
2025-02-17T08:22:50Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "3c606d9b-60e4-4de0-ab9c-0bf349bc944b"}
2025-02-17T08:22:50Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "19b6c164-494b-41d2-9367-c88477f00764"}
2025-02-17T08:22:50Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "64f88675-5bba-42fa-b8eb-5f1185449a57"}
2025-02-17T08:22:50Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "75ad5550-d9ab-4890-bbe7-16dfe66e011a"}
2025-02-17T08:22:51Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "5544945e-ab4f-4633-b407-f508af05e271"}
2025-02-17T08:22:51Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "a45fb2b1-ab52-49be-b8eb-47fd7029602a"}
2025-02-17T08:22:51Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "b5b676a2-a8fa-4eb6-8540-7ed1d4d8773a"}
2025-02-17T08:22:51Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "4de1069e-0056-4c1f-92cd-30e776b8b574"}
2025-02-17T08:22:51Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "96c8ec89-a72e-42cd-b908-c62b80f231ea"}
2025-02-17T08:22:51Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "84dd3c3d-dabe-439a-a7e1-3230d645fc55"}
2025-02-17T08:22:51Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "804ec37c-1ebc-4f8e-a05e-6cfb009602ff"}
2025-02-17T08:22:51Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "857f3cb6-3aaa-4f34-8f9a-e03e96d18125"}
2025-02-17T08:22:52Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "001a9318-20d5-49c5-9155-6b34b38e89b6"}
2025-02-17T08:22:52Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "d98de902-2da9-41f3-8701-62f2c6d1cb00"}
2025-02-17T08:22:52Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "a1895855-3928-4429-b1b3-e6145ddbcb35"}
2025-02-17T08:22:52Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "9246194a-914c-4bcd-be06-5121cba505fc"}
2025-02-17T08:22:52Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "8b1a5f97-dee4-439c-a51a-819ae3b14307"}
2025-02-17T08:22:52Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "09b0642e-4ee1-4081-82ac-3ddaf43d935d"}
2025-02-17T08:22:52Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "6e2895f9-56a5-4e90-ac38-b24f112c9bbe"}
2025-02-17T08:22:52Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "dd6ad57d-a127-4249-ad00-d27978c508dd"}
2025-02-17T08:22:53Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "538375db-383e-4a35-bde0-cca15512d59e"}
2025-02-17T08:22:53Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:volcano-system:volcano-controllers", "requestID": "3cb503f8-68bc-471e-b0b7-174c2e6e7ce2"}
2025-02-17T08:22:53Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "543455f7-6ad8-4f09-ab4d-60704f6bc5bb"}
2025-02-17T08:22:53Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "19749b3b-fc39-43f3-b333-36a750f3b95c"}
2025-02-17T08:22:53Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "5418d531-0c51-4a0f-852e-a3ce1691256a"}
2025-02-17T08:22:53Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "aa8bb16a-d209-403a-ba81-bf1ad550d54c"}
2025-02-17T08:22:53Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "bf920707-edff-4270-b3a8-e163153f7112"}
2025-02-17T08:22:53Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "ccab9d08-99d0-4144-96a8-ba032ce29b13"}
2025-02-17T08:22:54Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "51771bfa-cbfa-4ff4-b0c8-5fc85e6ecb55"}
2025-02-17T08:22:54Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "2a077c9e-f72a-411a-bd9e-71153eb504fb"}
2025-02-17T08:22:54Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "4f4810e1-31a8-4049-ad2a-16b6afff6232"}
2025-02-17T08:22:56Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "a6685c4d-b0b5-4bad-9d6e-35b5c9168689"}
2025-02-17T08:22:58Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "c2bdcdf8-2e69-4d93-adb8-c68c7f886106"}
2025-02-17T08:23:03Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "d5e16e69-eb7f-4721-87ac-17e131b355b9"}
2025-02-17T08:23:14Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "4e232246-7981-422c-9d0f-ca8028142ef1"}
2025-02-17T08:23:34Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "29eda014-c206-46c6-bb06-02daac18712a"}
2025-02-17T08:24:15Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "c6e4333a-ed1c-45e7-9b55-e3d070d4d71f"}
2025-02-17T08:25:37Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "9a010d10-e840-4b68-b171-96b687346810"}
2025-02-17T08:33:49Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "caa3528d-62e9-41dd-8eb8-2b88fff612cc"}
2025-02-17T09:01:24Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "952c5f96-3102-453d-99e0-6883156e8cb5"}
2025-02-17T09:34:44Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "e7e8877e-cef2-4c73-9626-bb54207a226b"}
2025-02-17T10:08:04Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "c78ac555-0d9e-44a3-8e9c-4de5abd16511"}
2025-02-17T10:41:24Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "9eae8993-f169-4911-95c1-ea239c8caabd"}
2025-02-17T11:14:44Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "f3df9b92-9878-4348-9fe1-22799c5c616a"}
2025-02-17T11:48:05Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "3a4ccd22-34f8-4493-92fc-d2a0cf53a4cf"}
2025-02-17T12:21:25Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "de05a37c-6631-4cc5-aa07-706fb12643e3"}
2025-02-17T12:31:55Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "be181899-223e-4445-adbb-1126a61b9827"}
2025-02-17T12:31:55Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "db74bc77-f444-440c-a4d0-fad809f038a4"}
2025-02-17T12:31:55Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "c99348c3-9c72-471c-8d36-f907bfe94dd3"}
2025-02-17T12:31:55Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "52ec9031-afed-41cb-b2d4-6d25ba89623a"}
2025-02-17T12:31:55Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "d36789e5-a03a-446f-a536-94912196f155"}
2025/02/17 12:31:55 http: TLS handshake error from 10.76.233.2:16136: EOF
2025-02-17T12:31:55Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "87be7b2c-d5c8-426e-baf2-b60889d1b413"}
2025-02-17T12:31:55Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "d10dd482-f5c9-4acd-a0b0-ba8806fc6204"}
2025-02-17T12:31:55Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "0d0f72fd-5889-4159-9e0d-07c0ed8c12aa"}
2025-02-17T12:31:56Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "ed912b91-caa8-4828-b97c-4221843fc715"}
2025-02-17T12:31:56Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "dae7b74f-246e-4148-9eaf-f32ee1550cb9"}
2025-02-17T12:31:56Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "bfc2ddc0-d6c0-4561-8edd-999a8fa8970b"}
2025-02-17T12:31:56Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "62a1cbc8-4b41-40ba-937c-dcc6e504327e"}
2025-02-17T12:31:57Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "b9220f67-a5e1-4b31-8c7c-0ceb47b1742a"}
2025-02-17T12:31:57Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "76081f6f-9683-462c-8258-9b9af49aa6f9"}
2025-02-17T12:31:57Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "d86300ff-bdb2-45b3-9968-75931b58bed2"}
2025-02-17T12:31:57Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "a9cad2d7-1b3a-4a88-b89c-c998a7570e0e"}
2025-02-17T12:31:58Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "1366d26d-db12-45f5-9cef-00ba65baaff4"}
2025-02-17T12:31:58Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "779a9934-c423-4429-a568-f236d374e9df"}
2025-02-17T12:31:58Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "e0c49f4d-a895-46d3-b8d5-ef28c9d8c6c9"}
2025-02-17T12:31:58Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "2fb25b46-efe6-4f08-bf0c-7735d307fe12"}
2025-02-17T12:31:58Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "4c8627ed-fa3a-4428-b675-6e94f2f0a350"}
2025-02-17T12:31:59Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "7f4fcb42-867a-42c7-aa50-7b3842bd70f5"}
2025-02-17T12:31:59Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "d769d5d6-b7b2-421f-80a3-b6491b606634"}
2025-02-17T12:31:59Z    LEVEL(-2)       admission       Validating Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "b7ed6924-076a-4ec0-bc00-c3d98e601886"}
2025-02-17T12:37:27Z    LEVEL(-2)       admission       Defaulting Pod  {"webhookGroup": "", "webhookKind": "Pod", "Pod": {"name":"sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0","namespace":"sg-system"}, "namespace": "sg-system", "name": "sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0", "resource": {"group":"","version":"v1","resource":"pods"}, "user": "system:serviceaccount:kube-system:statefulset-controller", "requestID": "635ab2f3-3f32-4587-ab5c-e34252630939"}
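
The pod names in the log above grow by one index suffix at a time (sglang-0, sglang-0-0, and so on). A minimal Python sketch for summarizing that growth from a saved copy of the log is below. The function names and regular expressions are assumptions for illustration, based only on the line format shown above; they are not part of LWS.

```python
import re
from collections import Counter

# Patterns assume the admission log format shown above.
POD_NAME = re.compile(r'"Pod": \{"name":"([^"]+)"')
USER = re.compile(r'"user": "([^"]+)"')
INDEX_SUFFIXES = re.compile(r'((?:-\d+)+)$')


def nesting_depth(pod_name):
    """Count trailing numeric suffixes, e.g. "sglang-0-0" has depth 2."""
    match = INDEX_SUFFIXES.search(pod_name)
    return match.group(1).count("-") if match else 0


def summarize(lines):
    """Return (deepest pod name depth, request count per service account)."""
    max_depth = 0
    per_user = Counter()
    for line in lines:
        pod = POD_NAME.search(line)
        user = USER.search(line)
        if not pod or not user:
            continue  # skip lines without a pod payload, e.g. TLS errors
        max_depth = max(max_depth, nesting_depth(pod.group(1)))
        per_user[user.group(1)] += 1
    return max_depth, per_user
```

For example, calling summarize(open("lws-controller.log")) on a saved copy of the log (the filename is hypothetical) returns the deepest name depth seen and how many admission requests each service account made.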

@yankay
Member

yankay commented Feb 17, 2025

Hi @wangyuan249,
Could you please share the output of kubectl get statefulset -o yaml and upload the file produced by kubectl cluster-info dump --all-namespaces > cluster-dump.json?
It would be a great help :-)

@wangyuan249
Author

wangyuan249 commented Feb 17, 2025

The full StatefulSet list is too long (more than 2000 lines), so I have pasted only three of the StatefulSets.

cluster-dump.json is also too long (more than 10000 lines), so it is not convenient to provide.
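
When the full YAML is too long to paste, one line per StatefulSet showing its owner may be enough to see the ownership chain. The sketch below is a hedged illustration, not an LWS tool: it assumes the list was saved with kubectl get statefulset -n sg-system -o json > sts.json (the filename is hypothetical), and the helper names are made up for this example.

```python
import json


def owner_rows(statefulset_list):
    """Return (name, owner kind, owner name, replicas) per StatefulSet."""
    rows = []
    for item in statefulset_list.get("items", []):
        meta = item.get("metadata", {})
        # Use the first owner reference; an empty dict if there is none.
        owner = (meta.get("ownerReferences") or [{}])[0]
        rows.append((
            meta.get("name", ""),
            owner.get("kind", ""),
            owner.get("name", ""),
            item.get("spec", {}).get("replicas"),
        ))
    return rows


def owner_rows_from_file(path):
    """Load `kubectl get statefulset -o json` output saved at `path`."""
    with open(path) as handle:
        return owner_rows(json.load(handle))
```

Printing the rows from owner_rows_from_file("sts.json") gives a compact table that fits in a comment even when there are many StatefulSets.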

Image

kubectl get statefulset sglang sglang-0 sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0 -n sg-system -oyaml
apiVersion: v1
items:
- apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    annotations:
      leaderworkerset.sigs.k8s.io/replicas: "1"
    creationTimestamp: "2025-02-17T12:31:55Z"
    generation: 1
    labels:
      leaderworkerset.sigs.k8s.io/name: sglang
      leaderworkerset.sigs.k8s.io/template-revision-hash: 7c7756fcbb
    name: sglang
    namespace: sg-system
    ownerReferences:
    - apiVersion: leaderworkerset.x-k8s.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: LeaderWorkerSet
      name: sglang
      uid: dbfecf50-ec6d-4fb2-8ce3-27b237e36fcf
    resourceVersion: "119298183"
    uid: 1229116c-a37a-4e40-ae96-1cceb16e867c
  spec:
    podManagementPolicy: Parallel
    replicas: 1
    revisionHistoryLimit: 10
    selector:
      matchLabels:
        leaderworkerset.sigs.k8s.io/name: sglang
        leaderworkerset.sigs.k8s.io/worker-index: "0"
    serviceName: sglang
    template:
      metadata:
        annotations:
          leaderworkerset.sigs.k8s.io/size: "2"
        creationTimestamp: null
        labels:
          leaderworkerset.sigs.k8s.io/name: sglang
          leaderworkerset.sigs.k8s.io/template-revision-hash: 7c7756fcbb
          leaderworkerset.sigs.k8s.io/worker-index: "0"
          role: leader
      spec:
        containers:
        - args:
          - |
            cd /sgl-workspace && python3 -m sglang.launch_server --model-path /root/.cache/modelscope/DeepSeek-R1 --served-model-name deepseek-r1 --tp 16 --dist-init-addr $LWS_LEADER_ADDRESS:20000 --nnodes $LWS_GROUP_SIZE --node-rank 0 --trust-remote-code --context-length 131072 --enable-metrics --host 0.0.0.0 --port 8000
          command:
          - sh
          - -c
          env:
          - name: GLOO_SOCKET_IFNAME
            value: eth0
          - name: NCCL_IB_HCA
            value: mlx5_0,mlx5_1,mlx5_4,mlx5_5
          - name: NCCL_P2P_LEVEL
            value: NVL
          - name: NCCL_IB_GID_INDEX
            value: "0"
          - name: NCCL_IB_CUDA_SUPPORT
            value: "1"
          - name: NCCL_IB_DISABLE
            value: "0"
          - name: NCCL_SOCKET_IFNAME
            value: eth0
          - name: NCCL_DEBUG
            value: INFO
          - name: NCCL_NET_GDR_LEVEL
            value: "2"
          - name: POD_NAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.name
          - name: SGLANG_USE_MODELSCOPE
            value: "true"
          image: ccr.ccs.tencentyun.com/kason/sglang:v0.4.2.post4-cu125
          imagePullPolicy: IfNotPresent
          name: sglang-head
          ports:
          - containerPort: 8000
            name: http
            protocol: TCP
          - containerPort: 20000
            name: distributed
            protocol: TCP
          readinessProbe:
            failureThreshold: 3
            initialDelaySeconds: 120
            periodSeconds: 30
            successThreshold: 1
            tcpSocket:
              port: 8000
            timeoutSeconds: 1
          resources:
            limits:
              cpu: "128"
              memory: 1Ti
              nvidia.com/gpu: "8"
              rdma/ib: "4"
            requests:
              cpu: "128"
              memory: 1Ti
              nvidia.com/gpu: "8"
              rdma/ib: "4"
          securityContext:
            capabilities:
              add:
              - IPC_LOCK
              - SYS_PTRACE
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /root/.cache/modelscope
            name: modelscope-cache
          - mountPath: /dev/shm
            name: shm-volume
          - mountPath: /etc/localtime
            name: localtime
            readOnly: true
          workingDir: /sgl-workspace
        dnsPolicy: ClusterFirst
        restartPolicy: Always
        schedulerName: volcano
        securityContext: {}
        terminationGracePeriodSeconds: 30
        volumes:
        - hostPath:
            path: /file_CPU_01/modelServing
            type: ""
          name: modelscope-cache
        - emptyDir:
            medium: Memory
            sizeLimit: 512Gi
          name: shm-volume
        - hostPath:
            path: /etc/localtime
            type: File
          name: localtime
    updateStrategy:
      rollingUpdate:
        partition: 0
      type: RollingUpdate
  status:
    availableReplicas: 0
    collisionCount: 0
    currentReplicas: 1
    currentRevision: sglang-6d6bb86845
    observedGeneration: 1
    replicas: 1
    updateRevision: sglang-6d6bb86845
    updatedReplicas: 1
- apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    creationTimestamp: "2025-02-17T12:31:55Z"
    generation: 1
    labels:
      leaderworkerset.sigs.k8s.io/group-index: "0"
      leaderworkerset.sigs.k8s.io/group-key: cbcccc5e821876cecb113b1b78184186660801f9
      leaderworkerset.sigs.k8s.io/name: sglang
      leaderworkerset.sigs.k8s.io/template-revision-hash: 7c7756fcbb
    name: sglang-0
    namespace: sg-system
    ownerReferences:
    - apiVersion: v1
      blockOwnerDeletion: true
      controller: true
      kind: Pod
      name: sglang-0
      uid: 2f3612cb-e275-4447-9163-f25c1b367414
    resourceVersion: "119297814"
    uid: 2e017e12-b2d3-40ba-a3f7-c9e0e83a21d2
  spec:
    podManagementPolicy: Parallel
    replicas: 1
    revisionHistoryLimit: 10
    selector:
      matchLabels:
        leaderworkerset.sigs.k8s.io/group-index: "0"
        leaderworkerset.sigs.k8s.io/group-key: cbcccc5e821876cecb113b1b78184186660801f9
        leaderworkerset.sigs.k8s.io/name: sglang
    serviceName: sglang
    template:
      metadata:
        annotations:
          leaderworkerset.sigs.k8s.io/leader-name: sglang-0
          leaderworkerset.sigs.k8s.io/size: "2"
        creationTimestamp: null
        labels:
          leaderworkerset.sigs.k8s.io/group-index: "0"
          leaderworkerset.sigs.k8s.io/group-key: cbcccc5e821876cecb113b1b78184186660801f9
          leaderworkerset.sigs.k8s.io/name: sglang
          leaderworkerset.sigs.k8s.io/template-revision-hash: 7c7756fcbb
        name: sglang-worker
      spec:
        containers:
        - args:
          - |
            cd /sgl-workspace && python3 -m sglang.launch_server --model-path /home/bmm-system/data/ckpt/deepseek/DeepSeek-R1 --served-model-name deepseek-r1 --tp 16 --dist-init-addr $LWS_LEADER_ADDRESS:20000 --nnodes $LWS_GROUP_SIZE --node-rank $LWS_WORKER_INDEX --trust-remote-code --context-length 131072 --enable-metrics --host 0.0.0.0 --port 8000
          command:
          - sh
          - -c
          env:
          - name: GLOO_SOCKET_IFNAME
            value: eth0
          - name: NCCL_IB_HCA
            value: mlx5_0,mlx5_1,mlx5_4,mlx5_5
          - name: NCCL_P2P_LEVEL
            value: NVL
          - name: NCCL_IB_GID_INDEX
            value: "0"
          - name: NCCL_IB_CUDA_SUPPORT
            value: "1"
          - name: NCCL_IB_DISABLE
            value: "0"
          - name: NCCL_SOCKET_IFNAME
            value: eth0
          - name: NCCL_DEBUG
            value: INFO
          - name: NCCL_NET_GDR_LEVEL
            value: "2"
          - name: SGLANG_USE_MODELSCOPE
            value: "true"
          - name: LWS_WORKER_INDEX
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.labels['leaderworkerset.sigs.k8s.io/worker-index']
          image: ccr.ccs.tencentyun.com/kason/sglang:v0.4.2.post4-cu125
          imagePullPolicy: IfNotPresent
          name: sglang-worker
          ports:
          - containerPort: 8000
            name: http
            protocol: TCP
          - containerPort: 20000
            name: distributed
            protocol: TCP
          resources:
            limits:
              cpu: "128"
              memory: 1Ti
              nvidia.com/gpu: "8"
              rdma/ib: "4"
            requests:
              cpu: "128"
              memory: 1Ti
              nvidia.com/gpu: "8"
              rdma/ib: "4"
          securityContext:
            capabilities:
              add:
              - IPC_LOCK
              - SYS_PTRACE
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /root/.cache/modelscope
            name: modelscope-cache
          - mountPath: /dev/shm
            name: shm-volume
          - mountPath: /etc/localtime
            name: localtime
            readOnly: true
          workingDir: /sgl-workspace
        dnsPolicy: ClusterFirst
        nodeSelector:
          glm.ai/app: infer
        restartPolicy: Always
        schedulerName: volcano
        securityContext: {}
        terminationGracePeriodSeconds: 30
        volumes:
        - hostPath:
            path: /file_CPU_01/modelServing
            type: ""
          name: modelscope-cache
        - emptyDir:
            medium: Memory
            sizeLimit: 512Gi
          name: shm-volume
        - hostPath:
            path: /etc/localtime
            type: File
          name: localtime
    updateStrategy:
      rollingUpdate:
        partition: 0
      type: RollingUpdate
  status:
    availableReplicas: 0
    collisionCount: 0
    currentReplicas: 1
    currentRevision: sglang-0-777d858dd9
    observedGeneration: 1
    replicas: 1
    updateRevision: sglang-0-777d858dd9
    updatedReplicas: 1
- apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    creationTimestamp: "2025-02-17T12:31:59Z"
    generation: 1
    labels:
      leaderworkerset.sigs.k8s.io/group-index: "0"
      leaderworkerset.sigs.k8s.io/group-key: cbcccc5e821876cecb113b1b78184186660801f9
      leaderworkerset.sigs.k8s.io/name: sglang
      leaderworkerset.sigs.k8s.io/template-revision-hash: 7c7756fcbb
    name: sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0
    namespace: sg-system
    ownerReferences:
    - apiVersion: v1
      blockOwnerDeletion: true
      controller: true
      kind: Pod
      name: sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0
      uid: 2ee0fe39-d072-4b75-ace0-805571552534
    resourceVersion: "119298274"
    uid: 7e1c95ba-d3da-4838-8daf-2c8be98fd3aa
  spec:
    podManagementPolicy: Parallel
    replicas: 1
    revisionHistoryLimit: 10
    selector:
      matchLabels:
        leaderworkerset.sigs.k8s.io/group-index: "0"
        leaderworkerset.sigs.k8s.io/group-key: cbcccc5e821876cecb113b1b78184186660801f9
        leaderworkerset.sigs.k8s.io/name: sglang
    serviceName: sglang
    template:
      metadata:
        annotations:
          leaderworkerset.sigs.k8s.io/leader-name: sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0
          leaderworkerset.sigs.k8s.io/size: "2"
        creationTimestamp: null
        labels:
          leaderworkerset.sigs.k8s.io/group-index: "0"
          leaderworkerset.sigs.k8s.io/group-key: cbcccc5e821876cecb113b1b78184186660801f9
          leaderworkerset.sigs.k8s.io/name: sglang
          leaderworkerset.sigs.k8s.io/template-revision-hash: 7c7756fcbb
        name: sglang-worker
      spec:
        containers:
        - args:
          - |
            cd /sgl-workspace && python3 -m sglang.launch_server --model-path /home/bmm-system/data/ckpt/deepseek/DeepSeek-R1 --served-model-name deepseek-r1 --tp 16 --dist-init-addr $LWS_LEADER_ADDRESS:20000 --nnodes $LWS_GROUP_SIZE --node-rank $LWS_WORKER_INDEX --trust-remote-code --context-length 131072 --enable-metrics --host 0.0.0.0 --port 8000
          command:
          - sh
          - -c
          env:
          - name: GLOO_SOCKET_IFNAME
            value: eth0
          - name: NCCL_IB_HCA
            value: mlx5_0,mlx5_1,mlx5_4,mlx5_5
          - name: NCCL_P2P_LEVEL
            value: NVL
          - name: NCCL_IB_GID_INDEX
            value: "0"
          - name: NCCL_IB_CUDA_SUPPORT
            value: "1"
          - name: NCCL_IB_DISABLE
            value: "0"
          - name: NCCL_SOCKET_IFNAME
            value: eth0
          - name: NCCL_DEBUG
            value: INFO
          - name: NCCL_NET_GDR_LEVEL
            value: "2"
          - name: SGLANG_USE_MODELSCOPE
            value: "true"
          - name: LWS_WORKER_INDEX
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.labels['leaderworkerset.sigs.k8s.io/worker-index']
          image: ccr.ccs.tencentyun.com/kason/sglang:v0.4.2.post4-cu125
          imagePullPolicy: IfNotPresent
          name: sglang-worker
          ports:
          - containerPort: 8000
            name: http
            protocol: TCP
          - containerPort: 20000
            name: distributed
            protocol: TCP
          resources:
            limits:
              cpu: "128"
              memory: 1Ti
              nvidia.com/gpu: "8"
              rdma/ib: "4"
            requests:
              cpu: "128"
              memory: 1Ti
              nvidia.com/gpu: "8"
              rdma/ib: "4"
          securityContext:
            capabilities:
              add:
              - IPC_LOCK
              - SYS_PTRACE
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /root/.cache/modelscope
            name: modelscope-cache
          - mountPath: /dev/shm
            name: shm-volume
          - mountPath: /etc/localtime
            name: localtime
            readOnly: true
          workingDir: /sgl-workspace
        dnsPolicy: ClusterFirst
        nodeSelector:
          glm.ai/app: infer
        restartPolicy: Always
        schedulerName: volcano
        securityContext: {}
        terminationGracePeriodSeconds: 30
        volumes:
        - hostPath:
            path: /file_CPU_01/modelServing
            type: ""
          name: modelscope-cache
        - emptyDir:
            medium: Memory
            sizeLimit: 512Gi
          name: shm-volume
        - hostPath:
            path: /etc/localtime
            type: File
          name: localtime
    updateStrategy:
      rollingUpdate:
        partition: 0
      type: RollingUpdate
  status:
    availableReplicas: 0
    collisionCount: 0
    currentRevision: sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-54646b5968
    observedGeneration: 1
    replicas: 0
    updateRevision: sglang-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-0-54646b5968
kind: List
metadata:
  resourceVersion: ""

@yankay
Member

yankay commented Feb 17, 2025

Thanks a lot. I think I've identified the problem: the ordinals field is not present in the StatefulSet spec. Start ordinals are enabled by default only from Kubernetes 1.27 onward, as detailed here: https://kubernetes.io/blog/2023/04/28/statefulset-start-ordinal/.

So the feature may be disabled in your cluster. Could you post the output of kubectl version?
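
One way to double-check this diagnosis is to look for the `ordinals` key in the worker StatefulSet. The following is only a minimal sketch: the helper name `has_ordinals` is hypothetical, it assumes the StatefulSet YAML is piped in on stdin, and it does a rough text match rather than a real YAML parse.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl or lws): succeeds if the manifest
# read from stdin contains an "ordinals:" key.
# This is a crude text match, not a YAML parser.
has_ordinals() {
  grep -Eq '^[[:space:]]*ordinals:'
}

# Example against the cluster from this issue (needs kubectl access):
#   kubectl get statefulset sglang-0 -n sg-system -o yaml | has_ordinals \
#     && echo "ordinals present" || echo "ordinals missing"
```

If the helper reports the key as missing for a worker StatefulSet, that is consistent with the API server dropping the field because start ordinals are not enabled.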

@wangyuan249
Author

kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.26.2
WARNING: version difference between client (1.28) and server (1.26) exceeds the supported minor version skew of +/-1

@wangyuan249
Author

Oh, I got it, our cluster is 1.26...

@yankay
Member

yankay commented Feb 17, 2025

See https://github.com/kubernetes-sigs/lws/blob/main/docs/setup/install.md#before-you-begin: lws requires Kubernetes >= 1.27.

@kerthcet
Contributor

Please close this if your issue is fixed. Thanks!

@kerthcet
Contributor

Oh, I got it, our cluster is 1.26...

You can still use this feature on v1.26 if you manually enable the feature gate.

For any cluster older than 1.27, you need to enable the feature gate for Start Ordinal (StatefulSetStartOrdinal).
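
As a rough sketch of that rule, the shell helper below maps a server version to what is needed for start ordinals. The function name `start_ordinal_advice` is hypothetical. `StatefulSetStartOrdinal` and `--feature-gates` are the upstream Kubernetes gate and flag names; the gate would typically need to be set on both kube-apiserver and kube-controller-manager, and how you pass flags to them depends on how your cluster is deployed. The thresholds assume the gate was introduced as alpha in 1.26 and turned on by default in 1.27.

```shell
#!/bin/sh
# Hypothetical helper: given a server version such as "v1.26.2", print what
# is needed for StatefulSet start ordinals to work.
start_ordinal_advice() {
  ver=${1#v}          # strip the leading "v"   -> "1.26.2"
  major=${ver%%.*}    # text before first dot   -> "1"
  rest=${ver#*.}      # text after first dot    -> "26.2"
  minor=${rest%%.*}   # minor version           -> "26"
  if [ "$major" -gt 1 ] || [ "$minor" -ge 27 ]; then
    echo "enabled by default"
  elif [ "$minor" -eq 26 ]; then
    echo "set --feature-gates=StatefulSetStartOrdinal=true"
  else
    echo "not available, upgrade the cluster"
  fi
}

# Server version reported earlier in this thread.
start_ordinal_advice v1.26.2
```

For the v1.26.2 server in this issue, the helper prints the feature-gate line; upgrading the cluster to 1.27 or later avoids the manual step.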
