Cherry-pick #18564 to 7.8: [Autodiscover] Check if runner is already running before starting again #18689

ChrsMark · 2020-05-21T11:35:44Z

Cherry-pick of PR #18564 to 7.8 branch. Original message:

What does this PR do?

This PR fixes runner reload so that it does not start a new runner when a runner for the same configuration is already running. This can happen in Autodiscover when a container is queued for termination and a new container with the very same configuration has started. The reload then receives 2 identical configurations. The first one is skipped, but the second one creates a new runner while the previous one is still running. This is the tricky if/else block that causes the problem when there are 2 identical configurations:

beats/libbeat/cfgfile/list.go

Line 73 in e990740

if _, ok := stopList[hash]; ok {

For more information check the related Discuss topic: https://discuss.elastic.co/t/multiple-monitoring-cycles-after-recreating-docker-image/231565/9

Why is it important?

When autodiscover catches a new start event, it tries to start a new runner without checking whether a runner is already running. The new runner then overwrites the old one in the list of runners without the old one being stopped. The result is 2 running runners, one of which is orphaned and untracked.
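
The double start described above can be reproduced with a small model of the reload bookkeeping. This is a minimal sketch, not the code in libbeat/cfgfile/list.go: `planReload`, its parameters, and the use of plain strings as configuration hashes are hypothetical names chosen for illustration. The `checkRunning` flag only switches between the behaviour described as buggy and one possible form of the "is it already running" check.

```go
package main

import (
	"fmt"
	"sort"
)

// planReload is a hypothetical, simplified model of the reload bookkeeping.
// running holds the hashes of runners that are currently running, configs
// holds the hashes of the desired configurations (duplicates are possible).
func planReload(running map[string]bool, configs []string, checkRunning bool) (start, stop []string) {
	// Every running runner is a stop candidate until a config claims it.
	stopList := make(map[string]bool, len(running))
	for hash := range running {
		stopList[hash] = true
	}
	startList := make(map[string]bool)

	for _, hash := range configs {
		if checkRunning && running[hash] {
			// Assumed fix: a runner for this hash already exists,
			// so keep it and never schedule a second start.
			delete(stopList, hash)
			continue
		}
		if stopList[hash] {
			// First occurrence of a known hash: keep the runner.
			delete(stopList, hash)
		} else {
			// Without the check above, a second identical config
			// ends up here and is scheduled to start again.
			startList[hash] = true
		}
	}

	for hash := range startList {
		start = append(start, hash)
	}
	for hash := range stopList {
		stop = append(stop, hash)
	}
	sort.Strings(start)
	sort.Strings(stop)
	return start, stop
}

func main() {
	running := map[string]bool{"cfg-A": true}
	configs := []string{"cfg-A", "cfg-A"} // two identical configurations
	for _, check := range []bool{false, true} {
		start, stop := planReload(running, configs, check)
		fmt.Printf("checkRunning=%v start=%d stop=%d\n", check, len(start), len(stop))
	}
}
```

With `checkRunning` set to false the duplicate configuration lands in the start list even though its runner is still running. With it set to true both lists stay empty.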

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Enable autodiscover:

metricbeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.image: prometheus
          config:
            - module: prometheus
              metricsets: ["collector"]
              hosts: "${data.host}:${data.port}"

Start Metricbeat: ./metricbeat -e -d "module,autodiscover"
Start a container that matches the template using this docker-compose project: https://github.com/ChrsMark/docker-prometheus-playground
Edit the Prometheus service by adding a new label on it:

  prometheus:
    labels:
    - "some=Some"

Restart the service with docker-compose up -d
Verify that no new runner starts:
There is no: 2020-05-15T08:25:01.563Z DEBUG [module] module/wrapper.go:127 Starting Wrapper[name=prometheus, len(metricSetWrappers)=1]
And there is:

2020-05-21T07:47:11.487Z	DEBUG	[autodiscover]	cfgfile/list.go:62	Starting reload procedure, current runners: 1
2020-05-21T07:47:11.487Z	DEBUG	[autodiscover]	cfgfile/list.go:80	Start list: 0, Stop list: 0

Related issues

Discuss: https://discuss.elastic.co/t/multiple-monitoring-cycles-after-recreating-docker-image/231565/9

This might solve #12011 too.

elasticmachine · 2020-05-21T11:37:42Z

Pinging @elastic/integrations (Team:Integrations)

elasticmachine · 2020-05-21T11:37:43Z

Pinging @elastic/integrations-platforms (Team:Platforms)

elasticmachine · 2020-05-21T11:40:38Z

💚 Build Succeeded

Build stats

Build Cause: [Pull request #18689 updated]
Start Time: 2020-05-21T11:37:46.727+0000
Duration: 74 min 16 sec

Test stats 🧪

Test	Results
Failed	0
Passed	6612
Skipped	1053
Total	7665

Steps errors

Name: Report to Codecov
- Description:
    curl -sSLo codecov https://codecov.io/bash
    for i in auditbeat filebeat heartbeat libbeat metricbeat packetbeat winlogbeat journalbeat
    do
      FILE="${i}/build/coverage/full.cov"
      if [ -f "${FILE}" ]; then
        bash codecov -f "${FILE}"
      fi
    done
- Duration: 2 min 22 sec
- Start Time: 2020-05-21T12:14:44.622+0000
- log

[Autodiscover] Check if runner is already running before starting again (elastic#18564) (cherry picked from commit b0f7ae7)

c06cf5f

ChrsMark added backport review labels May 21, 2020

botelastic bot added the needs_team label May 21, 2020

Update CHANGELOG.next.asciidoc

fed8578

ChrsMark added Team:Integrations Team:Platforms labels May 21, 2020

botelastic bot removed the needs_team label May 21, 2020

kvch approved these changes May 21, 2020

ChrsMark merged commit e4c0fed into elastic:7.8 May 21, 2020
