Description
Problem
registry-facade runs as a DaemonSet on each node, and the container runtime (e.g. containerd) pulls images through this service. The service is exposed directly on the node through a hostPort, precisely because it needs to be reachable from outside of Kubernetes.
Restarting the registry-facade service, e.g. during a deployment, interrupts the service. Image pulls that are in progress break because the service goes down, and new image pulls fail because no service is available at that moment.
Possible Solutions
Graceful socket handover: when a new registry-facade starts, it checks whether an instance is already running. If so, it requests a handover from the old instance. The new facade takes over the listening socket and puts the old one into a "draining mode", in which it accepts no new connections and only finishes the image pulls it is already serving. We'd need to allow for a generous termination grace period so that those pulls can complete before the old instance exits.
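
To make the handover idea more concrete, here is a minimal sketch of one way it could work on Linux: the old instance passes its listening socket to the new one as a file descriptor over a Unix domain socket (SCM_RIGHTS). This is an assumption about the mechanism, not something registry-facade does today. The names `sendListener`, `recvListener`, `handoverDemo` and the socket path are hypothetical, and for brevity both roles run in a single process. In a real DaemonSet the old and new pods would presumably need to share the handover socket, for example through a hostPath volume.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"os"
	"path/filepath"
	"syscall"
)

// sendListener passes a duplicate of l's socket over conn as SCM_RIGHTS data.
func sendListener(conn *net.UnixConn, l *net.TCPListener) error {
	f, err := l.File() // dup of the listening socket
	if err != nil {
		return err
	}
	defer f.Close() // the kernel keeps the in-flight copy alive
	_, _, err = conn.WriteMsgUnix([]byte{0}, syscall.UnixRights(int(f.Fd())), nil)
	return err
}

// recvListener reads one descriptor from conn and wraps it in a net.Listener.
func recvListener(conn *net.UnixConn) (net.Listener, error) {
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit descriptor
	_, oobn, _, _, err := conn.ReadMsgUnix(make([]byte, 1), oob)
	if err != nil {
		return nil, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil || len(msgs) != 1 {
		return nil, fmt.Errorf("expected one control message (err: %v)", err)
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil || len(fds) != 1 {
		return nil, fmt.Errorf("expected one descriptor (err: %v)", err)
	}
	f := os.NewFile(uintptr(fds[0]), "handover")
	defer f.Close() // net.FileListener works on its own duplicate
	return net.FileListener(f)
}

// handoverDemo plays both roles in one process (hypothetical setup) and
// returns what a client receives after the old listener has been closed.
func handoverDemo() (string, error) {
	name := filepath.Join(os.TempDir(), fmt.Sprintf("handover-%d.sock", os.Getpid()))
	ctrlAddr := &net.UnixAddr{Name: name, Net: "unix"}
	// Old instance: owns the service socket and listens for handover requests.
	old, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	ctrl, err := net.ListenUnix("unix", ctrlAddr)
	if err != nil {
		return "", err
	}
	defer ctrl.Close()
	// New instance: requests the handover.
	newSide, err := net.DialUnix("unix", nil, ctrlAddr)
	if err != nil {
		return "", err
	}
	defer newSide.Close()
	// Old instance: passes the socket, then stops accepting (draining mode).
	oldSide, err := ctrl.AcceptUnix()
	if err != nil {
		return "", err
	}
	defer oldSide.Close()
	if err := sendListener(oldSide, old.(*net.TCPListener)); err != nil {
		return "", err
	}
	old.Close()
	// New instance: adopts the socket and serves a client on the same address.
	l, err := recvListener(newSide)
	if err != nil {
		return "", err
	}
	defer l.Close()
	client, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return "", err
	}
	defer client.Close()
	conn, err := l.Accept()
	if err != nil {
		return "", err
	}
	io.WriteString(conn, "served by new instance")
	conn.Close()
	b, err := io.ReadAll(client)
	return string(b), err
}

func main() {
	fmt.Println(handoverDemo())
}
```

Because the listening socket itself is never closed, only the old instance's reference to it, connections that arrive during the switch wait in the kernel's accept queue instead of being refused. The sketch does not cover draining: the old instance would still have to finish its in-flight pulls within the termination grace period before exiting.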