Triggered reboot on worker hangs indefinitely

I have the Container Linux Update Operator (CLUO) running on my Kubernetes cluster, with CoreOS on both controller and worker nodes.

Reboots triggered on the controllers complete successfully, but reboots on the worker nodes hang indefinitely. For example, here is the update-agent log from a worker:

```
I0406 16:33:58.958771       1 main.go:45] /bin/update-agent running
I0406 16:33:58.958962       1 agent.go:84] Setting info labels
I0406 16:33:58.988317       1 agent.go:98] Setting annotations map[string]string{"container-linux-update.v1.coreos.com/reboot-in-progress":"false", "container-linux-update.v1.coreos.com/reboot-needed":"false"}
I0406 16:34:46.834224       1 agent.go:110] Marking node as schedulable
I0406 16:34:46.858979       1 agent.go:120] Waiting for ok-to-reboot from controller...
I0406 16:34:46.859150       1 agent.go:246] Beginning to watch update_engine status
I0406 16:34:46.860756       1 agent.go:198] Updating status
I0406 16:34:46.860780       1 agent.go:210] Indicating a reboot is needed
I0406 16:35:51.649523       1 agent.go:134] Setting annotations map[string]string{"container-linux-update.v1.coreos.com/reboot-in-progress":"true"}
I0406 16:35:51.682890       1 agent.go:146] Marking node as unschedulable
I0406 16:35:51.701994       1 agent.go:151] Getting pod list for deletion
I0406 16:35:51.761232       1 agent.go:160] Deleting 4 pods
. . .
<all pods deleted>
. . .
I0406 16:36:32.164977       1 agent.go:184] Node drained, rebooting
```

Once the drain completes, the node is cordoned and the agent logs that it is rebooting, but the reboot itself never occurs.
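To make it easier to see where the agent stops, here is a minimal sketch that parses the agent output and reports the last step it logged. It assumes the lines follow the klog-style layout shown in the log above; the names `parse_agent_log` and `last_step` are hypothetical helpers of my own, not part of CLUO.

```python
import re

# Assumed layout, taken from the log above:
# I0406 16:36:32.164977       1 agent.go:184] Node drained, rebooting
LINE_RE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<pid>\d+) (?P<file>[\w.]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_agent_log(text):
    """Return one dict per line that matches the assumed log layout."""
    entries = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:  # skip elided or free-form lines such as ". . ."
            entries.append(match.groupdict())
    return entries


def last_step(text):
    """Return the last parsed entry, or None if nothing matched."""
    entries = parse_agent_log(text)
    return entries[-1] if entries else None


# Excerpt copied from the log above.
SAMPLE = """\
I0406 16:35:51.682890       1 agent.go:146] Marking node as unschedulable
I0406 16:35:51.701994       1 agent.go:151] Getting pod list for deletion
I0406 16:35:51.761232       1 agent.go:160] Deleting 4 pods
. . .
I0406 16:36:32.164977       1 agent.go:184] Node drained, rebooting
"""

if __name__ == "__main__":
    step = last_step(SAMPLE)
    print(step["time"], step["file"], step["line"], step["msg"])
```

On the excerpt above, the last parsed entry is `agent.go:184` with the message `Node drained, rebooting`, so the drain finished and the agent stalls at or after the point where it requests the reboot.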

Where should I check first to help debug this?

Thanks!

Triggered reboot on worker hangs indefinitely #177
