Improve docs of leaderelection configuration #20601

Merged: 13 commits into elastic:master on Sep 2, 2020

Conversation

ChrsMark
Member

Follow-up of #20512.

Signed-off-by: chrismark <chrismarkou92@gmail.com>
@botelastic bot added the needs_team label (Indicates that the issue/PR needs a Team:* label) on Aug 14, 2020
@ChrsMark self-assigned this on Aug 14, 2020
@ChrsMark added the Team:Platforms label (Label for the Integrations - Platforms team) on Aug 14, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@botelastic bot removed the needs_team label on Aug 14, 2020
@masci

masci commented Aug 14, 2020

Following up #20512 (comment)

I would say that the current approach is smoother for every case. I don't think we should remove the Deployment completely for now, since in many cases it can be really helpful when it comes to scaling, where someone may want to collect cluster-wide metrics from big clusters.

I'm not sure we should chase complex use cases; we should focus on onboarding instead.

With the current approach, users don't benefit from the simplified configuration provided by leader election out of the box: on the contrary, they have to learn the specifics of the feature through the comments and change the manifest in several places to simplify the setup, which is a bit of a paradox.

If I'm running a big, complex cluster that requires fine-tuning observability, chances are I won't even use the manifests as they are in the repo, because the default resources provided to the standalone pods won't fit every possible case.

To sum up, I think with the current setup we're not providing a turnkey solution to anybody, while with leader election we have the chance to build a good user story to simplify onboarding.
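For context, here is a minimal sketch of the leader-election-based autodiscover configuration the thread refers to, as it would appear in the DaemonSet's ConfigMap. The `unique: true` setting of the kubernetes autodiscover provider is what enables leader election; the kube-state-metrics host and the list of metricsets are illustrative assumptions rather than values taken from this PR.

```yaml
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      scope: cluster
      node: ${NODE_NAME}
      # Only the instance currently holding the leader lease renders this
      # template, so cluster-scope metrics are collected exactly once.
      unique: true
      templates:
        - config:
            - module: kubernetes
              hosts: ["kube-state-metrics:8080"]   # illustrative host
              period: 10s
              add_metadata: true
              metricsets:
                - state_node
                - state_deployment
                - state_pod
                - state_container
```

With a block like this inside the DaemonSet configuration, a separate Deployment stops being necessary for the common case, which is what the discussion below converges on.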

@elasticmachine
Collaborator

elasticmachine commented Aug 14, 2020

💚 Build Succeeded


Build stats

  • Build Cause: [Pull request #20601 updated]

  • Start Time: 2020-09-02T08:51:51.369+0000

  • Duration: 61 min 20 sec

Test stats 🧪

  • Failed: 0
  • Passed: 2732
  • Skipped: 722
  • Total: 3454

Steps errors


  • Name: Install Go 1.14.7

    • Description: .ci/scripts/install-go.sh

    • Duration: 1 min 32 sec

    • Start Time: 2020-09-02T09:15:37.801+0000


  • Name: Install docker-compose 1.21.0

    • Description: .ci/scripts/install-docker-compose.sh

    • Duration: 1 min 33 sec

    • Start Time: 2020-09-02T09:15:47.840+0000


@ChrsMark
Member Author

@exekias @jsoriano what do you think about @masci's proposal of completely removing the old proposed manifests that include the Deployment and keeping only the new approach, with the leader election configuration within the DaemonSet?

I'm OK with this; however, I'm only concerned about users that already use the Deployment manifests (breaking change?).

@jsoriano
Member

@exekias @jsoriano what do you think about @masci's proposal of completely removing the old proposed manifests that include the Deployment and keeping only the new approach, with the leader election configuration within the DaemonSet?

Maybe we have to check how big a deployment needs to be to require a dedicated pod. There are two main reasons why this may be needed:

  • Collection of cluster-level metrics requires significantly more resources, and configuring these resources in the DaemonSet will make most of the pods reserve much more resources than they need.
  • Performance problems with leader election caused by having too many candidates.

If we are more or less sure that for most of the cases this is not going to be a problem, then I am ok with keeping the new approach only.

I'm OK with this; however, I'm only concerned about users that already use the Deployment manifests (breaking change?).

We can keep a dummy deployment with the same name and zero replicas to replace the previous one, avoiding a breaking change.
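As a rough illustration of that placeholder idea, such a dummy Deployment could look like the sketch below; the name, namespace, labels, and image tag are assumptions and not taken from the actual manifests.

```yaml
# Hypothetical placeholder Deployment: keeping the old name means that
# applying the new manifests scales the previous Deployment down to zero
# instead of leaving an outdated copy of Metricbeat running.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metricbeat          # assumed to match the old Deployment's name
  namespace: kube-system    # assumed namespace of the existing manifests
spec:
  replicas: 0               # no pods are scheduled; the object only exists for upgrades
  selector:
    matchLabels:
      k8s-app: metricbeat
  template:
    metadata:
      labels:
        k8s-app: metricbeat
    spec:
      containers:
        - name: metricbeat
          image: docker.elastic.co/beats/metricbeat:7.10.0   # illustrative tag
```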

@ChrsMark
Member Author

ChrsMark commented Sep 1, 2020

@exekias @jsoriano what do you think about @masci's proposal of completely removing the old proposed manifests that include the Deployment and keeping only the new approach, with the leader election configuration within the DaemonSet?

Maybe we have to check how big a deployment needs to be to require a dedicated pod. There are two main reasons why this may be needed:

  • Collection of cluster-level metrics requires significantly more resources, and configuring these resources in the DaemonSet will make most of the pods reserve much more resources than they need.
  • Performance problems with leader election caused by having too many candidates.

If we are more or less sure that for most of the cases this is not going to be a problem, then I am ok with keeping the new approach only.

My feeling is that this kind of performance issue will hit users with really big clusters, and if they have to fine-tune their setup to resolve such issues, it shouldn't be a big deal for them to also evaluate the singleton Pod solution using a Deployment. For most of our users, keeping only the new approach from now on would improve their onboarding experience.

I'm OK with this; however, I'm only concerned about users that already use the Deployment manifests (breaking change?).

We can keep a dummy deployment with the same name and zero replicas to replace the previous one, avoiding a breaking change.

I think that having a dummy Deployment spec inside our manifests would make things more complicated for users that are not so familiar with the whole setup. On the other hand, I'm not sure this can be considered a breaking change, since the docs are linked to the versions of the manifests, and only old users that want to adopt the new approach might end up with "orphan" Deployment Pods on their clusters; however, this is not so unusual and it's not that bad.

Given this, I'm leaning towards keeping only the new approach.
@masci @jsoriano @exekias let me know what you think.

@exekias
Contributor

exekias commented Sep 1, 2020

It sounds like the general case would benefit from going only with the DaemonSet, and then we can document how to switch to a cluster-wide Deployment for big clusters.

This is not really a breaking change, as it won't break existing deployments; we are just changing how the new ones will happen.
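The documented escape hatch for big clusters mentioned here could look roughly like this: a dedicated single-replica Deployment that runs only the cluster-scope configuration, sized independently of the per-node DaemonSet pods. All names, the image tag, and the resource requests below are illustrative assumptions, not values from the actual manifests.

```yaml
# Illustrative single-replica Deployment for cluster-wide metrics, for clusters
# where leader election inside the DaemonSet is not a good fit.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metricbeat-cluster       # hypothetical name
  namespace: kube-system
spec:
  replicas: 1                    # exactly one pod collects cluster-scope metrics
  selector:
    matchLabels:
      k8s-app: metricbeat-cluster
  template:
    metadata:
      labels:
        k8s-app: metricbeat-cluster
    spec:
      serviceAccountName: metricbeat
      containers:
        - name: metricbeat
          image: docker.elastic.co/beats/metricbeat:7.10.0   # illustrative tag
          args: ["-c", "/etc/metricbeat.yml", "-e"]
          resources:
            requests:
              cpu: 200m          # sized independently of the DaemonSet pods
              memory: 400Mi
```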

Signed-off-by: chrismark <chrismarkou92@gmail.com>
@jsoriano
Member

jsoriano commented Sep 1, 2020

Ok, should we mention something in the docs in case someone upgrades by using the new manifests?

"orphan" Deployment Pods on their clusters; however, this is not so unusual and it's not that bad.

Well, these orphan deployments will be collecting duplicated metrics, with an old version.

@ChrsMark
Member Author

ChrsMark commented Sep 1, 2020

Ok, should we mention something in the docs in case someone upgrades by using the new manifests?

The new docs will only point to the new manifests and will only refer to the new approach. And wouldn't it be overkill to mention this in the old versions' docs?

"orphan" Deployment Pods on their clusters; however, this is not so unusual and it's not that bad.

Well, these orphan deployments will be collecting duplicated metrics, with an old version.

@jsoriano
Member

jsoriano commented Sep 1, 2020

The new docs will only point to the new manifests and will only refer to the new approach. And wouldn't it be overkill to mention this in the old versions' docs?

I meant mentioning only something like "if you deployed these manifests before 7.10, you may need to manually remove the deployment". But we are not saying anything about upgrades so far, so maybe it is OK not to say anything and let users just change the versions in the manifests they are using.

3 review comments on metricbeat/docs/running-on-kubernetes.asciidoc (outdated, resolved)
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Signed-off-by: chrismark <chrismarkou92@gmail.com>
@jsoriano added the needs_backport (PR is waiting to be backported to other branches) and v7.10.0 labels on Sep 1, 2020
@ChrsMark requested a review from masci on September 1, 2020, 14:22

@masci left a comment


Left a few wording suggestions for the docs but this LGTM!

9 review comments on metricbeat/docs/running-on-kubernetes.asciidoc (outdated, resolved)
@masci linked an issue on Sep 2, 2020 that may be closed by this pull request
ChrsMark and others added 3 commits September 2, 2020 11:36
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
ChrsMark and others added 6 commits September 2, 2020 11:37
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
@ChrsMark merged commit 6d7213f into elastic:master on Sep 2, 2020
ChrsMark added a commit to ChrsMark/beats that referenced this pull request Sep 2, 2020
@ChrsMark removed the needs_backport label on Sep 2, 2020
ChrsMark added a commit that referenced this pull request Sep 2, 2020
v1v added a commit to v1v/beats that referenced this pull request Sep 2, 2020
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request Oct 14, 2020
Labels
Team:Platforms (Label for the Integrations - Platforms team), v7.10.0

Successfully merging this pull request may close these issues:
Implement kubernetes agent cluster scope leader election

5 participants