From a40aca43b98f7fe707c4a154a0010fe5518507a0 Mon Sep 17 00:00:00 2001
From: Lior Avramov <73036155+liorghub@users.noreply.github.com>
Date: Thu, 28 Jul 2022 02:18:36 +0300
Subject: [PATCH] [memory_checker] Do not check memory usage of containers if docker daemon is not running (#11476)

Fix in Monit memory_checker plugin. Skip fetching running containers if docker engine is down (can happen in deinit).
This PR fixes issue #11472.

Signed-off-by: liora liora@nvidia.com

Why I did it
In the case where Monit runs during deinit flow, memory_checker plugin is fetching the running containers without checking if Docker service is still running. I added this check.

How I did it
Use systemctl is-active to check if Docker engine is still running.

How to verify it
Use systemctl to stop docker engine and reload Monit, no errors in log and relevant print appears in log.

Which release branch to backport (provide reason below if selected)
The fix is required in 202205 and 202012 since the PR that introduced the issue was cherry picked to those branches (#11129).
---
 files/image_config/monit/memory_checker | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/files/image_config/monit/memory_checker b/files/image_config/monit/memory_checker
index dfe270e79524..a93bc30b3fe4 100755
--- a/files/image_config/monit/memory_checker
+++ b/files/image_config/monit/memory_checker
@@ -96,6 +96,19 @@ def check_memory_usage(container_name, threshold_value):
         sys.exit(4)
 
 
+def is_service_active(service_name):
+    """Test if service is running.
+
+    Args:
+        service_name: A string contains the service name
+
+    Returns:
+        True if service is running, False otherwise
+    """
+    status = subprocess.run("systemctl is-active --quiet {}".format(service_name), shell=True, check=False)
+    return status.returncode == 0
+
+
 def get_running_container_names():
     """Retrieves names of running containers by talking to the docker daemon.
 
@@ -128,6 +141,12 @@ def main():
     parser.add_argument("threshold_value", type=int, help="threshold value in bytes")
     args = parser.parse_args()
 
+    if not is_service_active("docker"):
+        syslog.syslog(syslog.LOG_INFO,
+                      "[memory_checker] Exits without checking memory usage of container '{}' since docker daemon is not running!"
+                      .format(args.container_name))
+        sys.exit(0)
+
     running_container_names = get_running_container_names()
     if args.container_name in running_container_names:
         check_memory_usage(args.container_name, args.threshold_value)
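
The verification step in the description stops the Docker engine on a live system. The exit-code logic of the new helper can also be checked without touching Docker. What follows is a minimal sketch that is not part of the patch. It assumes the helper only depends on the exit code of `systemctl is-active --quiet`. `fake_systemctl` is a hypothetical name introduced for illustration, and the helper here passes an argument list instead of `shell=True`, which is a variation on the call in the patch.

```python
import subprocess
from unittest import mock


def is_service_active(service_name):
    """Return True if systemd reports the unit as active, False otherwise."""
    # Variation on the patch: an argument list is used instead of shell=True.
    status = subprocess.run(
        ["systemctl", "is-active", "--quiet", service_name], check=False
    )
    return status.returncode == 0


def fake_systemctl(returncode):
    """Hypothetical helper: a stand-in for subprocess.run with a fixed exit code."""
    result = subprocess.CompletedProcess(args=[], returncode=returncode)
    return mock.Mock(return_value=result)


# Exit code 0 means the unit is active.
with mock.patch("subprocess.run", fake_systemctl(0)):
    docker_up = is_service_active("docker")

# Any non-zero exit code is treated as "not running"; 3 is just an example value.
with mock.patch("subprocess.run", fake_systemctl(3)):
    docker_down = is_service_active("docker")

print(docker_up, docker_down)
```

With the patch applied, the second case corresponds to the path where `main()` logs the informational message and exits with status 0 instead of querying the Docker daemon.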