update-core. Fix update deadlock (#578)
When an agent receives signal USR1 it stops processing new tasks. During
the execution of update-core, exclude the cluster agent from the
graceful restart (or reload) request, so it can still process its
self-submitted "update-module" tasks.
DavidePrincipi authored Feb 16, 2024
1 parent 2130ae1 commit 83f114b
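
The deadlock described in the commit message can be reproduced in miniature. The following is a minimal sketch, not the real agent: `process_tasks`, `accepting`, and `completed` are hypothetical names, and the worker signals itself only to imitate the old behaviour, where the reload request also reached the agent that was running update-core.

```shell
#!/bin/bash
# Hypothetical stand-in for an agent; illustration only.

accepting=1      # cleared by the USR1 handler
completed=()     # names of the tasks that were processed

# After USR1 the worker stops taking new tasks.
trap 'accepting=0' USR1

# Read one task name per line from stdin and process tasks while the
# worker is still accepting them.
process_tasks() {
    local task
    while [[ "${accepting}" -eq 1 ]] && read -r task; do
        if [[ "${task}" == "update-core" ]]; then
            # Imitate the old behaviour: the reload signal also reaches
            # the worker that is running update-core (BASHPID is the PID
            # of the current bash process, bash 4 or later).
            kill -USR1 "${BASHPID}"
        fi
        completed+=("${task}")
    done
}

# update-core is followed by a self-submitted update-module task.
process_tasks <<'EOF'
update-core
update-module
EOF

echo "processed: ${completed[*]}"
```

In this sketch only `update-core` ends up in `completed`: the queued `update-module` task is never picked up, which is the situation the commit avoids by leaving the cluster agent out of the reload request.
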
Showing 2 changed files with 45 additions and 2 deletions.
@@ -0,0 +1,14 @@
+#!/bin/bash
+
+#
+# Copyright (C) 2024 Nethesis S.r.l.
+# SPDX-License-Identifier: GPL-3.0-or-later
+#
+
+exec 1>&2
+set -e
+
+# Ask the cluster agent to restart. To avoid update deadlocks, this must
+# be the last step of the cluster/update-core action. The node agent
+# already sent the USR1 signal to other agents, now it's our turn:
+pkill -USR1 -f -- 'agent --agentid=cluster'
@@ -8,5 +8,34 @@
 exec 1>&2
 set -e
 
-# Reload agents gracefully
-killall -q -s USR1 -r '^agent$'
+function reload_leader_agents ()
+{
+    # Send reload signal to all agents, except the cluster agent to avoid
+    # a deadlock in the update-core action!
+    agent_cluster_pid=$(pgrep -f -- 'agent --agentid=cluster')
+    for agent_pid in $(pgrep -x agent) ; do
+        if [[ "${agent_pid}" != "${agent_cluster_pid}" ]]; then
+            kill -USR1 "${agent_pid}"
+        fi
+    done
+}
+
+function reload_worker_agents ()
+{
+    # Send reload signal to all agents. The cluster agent in worker nodes
+    # can be reloaded safely.
+    pkill -USR1 -x agent
+}
+
+# connect to local redis replica with full read-only access
+export REDIS_USER="default"
+export REDIS_PASSWORD="default"
+export REDIS_ADDRESS="127.0.0.1:6379"
+leader_id=$(redis-exec hget cluster/environment NODE_ID)
+
+if [[ "${NODE_ID}" != "${leader_id}" ]] ; then
+    reload_worker_agents
+else
+    # Leader node from here.
+    reload_leader_agents
+fi
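
A possible hardening of the PID exclusion, offered as a sketch and not part of this commit. The comparison in `reload_leader_agents` assumes that `pgrep -f` prints exactly one PID. If it printed several, the string comparison would never match and the cluster agent would not be excluded. If it printed none, `pgrep` would exit with status 1 and `set -e` would abort the script at the assignment. The helper name `pids_except` is hypothetical.

```shell
#!/bin/bash

# Print every PID from the first argument that does not appear in the
# second argument. Both arguments are whitespace-separated PID lists,
# such as the output of pgrep; either may be empty.
pids_except() {
    local candidates="$1" excluded="$2"
    local pid other skip
    for pid in ${candidates}; do
        skip=0
        for other in ${excluded}; do
            if [[ "${pid}" == "${other}" ]]; then
                skip=1
                break
            fi
        done
        if [[ "${skip}" -eq 0 ]]; then
            echo "${pid}"
        fi
    done
}

# Hypothetical usage inside reload_leader_agents; "|| true" keeps
# "set -e" from aborting the script when pgrep matches nothing.
#   cluster_pids=$(pgrep -f -- 'agent --agentid=cluster' || true)
#   all_pids=$(pgrep -x agent || true)
#   for pid in $(pids_except "${all_pids}" "${cluster_pids}"); do
#       kill -USR1 "${pid}"
#   done
```

Keeping the filtering in a pure function makes it testable without sending signals to real processes.
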
