Add complete k8s metadata through composable provider #27691

Merged (17 commits) on Sep 20, 2021

@ChrsMark ChrsMark commented Sep 1, 2021

What does this PR do?

Add all k8s metadata via the provider.

Why is it important?

To support full metadata access like in Beats.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

TODOs

  • Verify that short-lived pods are handled properly.

How to test this PR locally

  1. Enable a static autodiscovery rule in Agent's config:
providers.kubernetes:
  scope: node
  kube_config: /Users/chrismark/.kube/config
  node: "kind-control-plane"
  cleanup_timeout: 360s
  resources:
    pod:
      enabled: true

inputs:
- name: redis
  type: redis/metrics
  use_output: default
  meta:
    package:
      name: redis
      version: 0.3.6
  data_stream:
    namespace: default
  streams:
    - data_stream:
        dataset: redis.info
        type: metrics
      metricsets:
        - info
      hosts:
        - '${kubernetes.pod.ip}:${kubernetes.container.port}'
      idle_timeout: 20s
      maxconn: 10
      network: tcp
      period: 10s
      condition: ${kubernetes.labels.app} == 'redis' AND ${kubernetes.container.port_name} == 'web'
  2. Deploy a Redis pod that triggers the condition:
apiVersion: v1
kind: Pod
metadata:
  name: redis
  labels:
    role: main
    app: redis
  annotations:
    # this annotation has no meaning here
    co.elastic.metrics.redis.XYZ/module: redis
spec:
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet
  containers:
    - name: redis
      image: redis:latest
      ports:
        - name: web
          containerPort: 6379
          protocol: TCP
  3. Run the Agent's inspect command to verify that an input is created: ./elastic-agent -c ./elastic-agent.yml inspect output -o default
  4. Verify that the input is created and that metadata is added properly via the processors, for example:
metricbeat:
  modules:
  - hosts:
    - 172.18.0.3:6379
    idle_timeout: 20s
    index: metrics-redis.info-default
    maxconn: 10
    meta:
      package:
        name: redis
        version: 0.3.6
    metricsets:
    - info
    module: redis
    name: redis
    network: tcp
    period: 10s
    processors:
    - add_fields:
        fields:
          id: cda93b8385201eaf725532e908e4a4c55895273d2a2cb57700b3cf48732b2de3
          image:
            name: redis:latest
          runtime: containerd
        target: container
    - add_fields:
        fields:
          container:
            name: redis
          labels:
            app: redis
            role: main
          namespace: default
          node:
            name: kind-control-plane
          pod:
            ip: 172.18.0.3
            name: redis
            uid: 365e8d21-a81d-44e2-acbf-95deb8a57730
        target: kubernetes
    - add_fields:
        fields:
          cluster:
            name: kind-kind
            url: https://127.0.0.1:50740
        target: orchestrator
    - add_fields:
        fields:
          dataset: redis.info
          namespace: default
          type: metrics
        target: data_stream
    - add_fields:
        fields:
          dataset: redis.info
        target: event
    - add_fields:
        fields:
          id: 32806c0a-4f04-499c-9427-e6e24e5f6035
          snapshot: false
          version: 8.0.0
        target: elastic_agent
    - add_fields:
        fields:
          id: 32806c0a-4f04-499c-9427-e6e24e5f6035
        target: agent
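
The `hosts` entry in the input is a template, and `172.18.0.3:6379` in the inspected output is its resolved form. A minimal sketch of that substitution, assuming the provider exposes its data as a nested map. `resolve` and `lookup` are hypothetical helpers written for illustration, not the Agent's actual implementation:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// varRef matches references such as ${kubernetes.pod.ip}.
var varRef = regexp.MustCompile(`\$\{[A-Za-z0-9_.]+\}`)

// lookup walks a nested map along a dotted path (hypothetical helper).
func lookup(data map[string]interface{}, path string) (interface{}, bool) {
	var cur interface{} = data
	for _, key := range strings.Split(path, ".") {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		if cur, ok = m[key]; !ok {
			return nil, false
		}
	}
	return cur, true
}

// resolve replaces every ${path} in tmpl and fails if any path is missing.
func resolve(tmpl string, data map[string]interface{}) (string, error) {
	var missing []string
	out := varRef.ReplaceAllStringFunc(tmpl, func(ref string) string {
		path := ref[2 : len(ref)-1] // strip "${" and "}"
		val, ok := lookup(data, path)
		if !ok {
			missing = append(missing, path)
			return ref
		}
		return fmt.Sprint(val)
	})
	if len(missing) > 0 {
		return "", fmt.Errorf("unresolved variables: %s", strings.Join(missing, ", "))
	}
	return out, nil
}

func main() {
	// Values taken from the inspected output above.
	data := map[string]interface{}{
		"kubernetes": map[string]interface{}{
			"pod":       map[string]interface{}{"ip": "172.18.0.3"},
			"container": map[string]interface{}{"port": 6379},
		},
	}
	host, err := resolve("${kubernetes.pod.ip}:${kubernetes.container.port}", data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(host) // prints 172.18.0.3:6379
}
```

Failing on a missing variable, rather than leaving the reference in place, is a choice made for this sketch: an input with a half-resolved host would be of no use.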

Using annotations (note: annotations are not exposed as meta fields by default, but they are exposed to the variable resolution mechanism)

Specify a condition based on an annotation, such as condition: ${kubernetes.annotations.level} == 'production' AND ${kubernetes.container.port_name} == 'web', and verify that the input is created again. (Note: adjust the Redis Pod manifest accordingly.)
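
As an illustration of what such a condition checks, here is a minimal sketch of an evaluator. It handles only equality clauses joined by AND, which is far less than the Agent's real expression language. `evalCondition` is a hypothetical name, and treating a missing variable as "no match" is an assumption of this sketch:

```go
package main

import (
	"fmt"
	"strings"
)

// evalCondition evaluates clauses of the form ${path} == 'literal' joined by AND.
// vars maps flattened variable paths to their values.
func evalCondition(cond string, vars map[string]string) (bool, error) {
	for _, clause := range strings.Split(cond, " AND ") {
		parts := strings.SplitN(clause, "==", 2)
		if len(parts) != 2 {
			return false, fmt.Errorf("unsupported clause %q", clause)
		}
		ref := strings.TrimSpace(parts[0])
		lit := strings.TrimSpace(parts[1])
		if !strings.HasPrefix(ref, "${") || !strings.HasSuffix(ref, "}") {
			return false, fmt.Errorf("left side of %q is not a variable", clause)
		}
		if len(lit) < 2 || lit[0] != '\'' || lit[len(lit)-1] != '\'' {
			return false, fmt.Errorf("right side of %q is not a quoted literal", clause)
		}
		path := ref[2 : len(ref)-1]
		want := lit[1 : len(lit)-1]
		// Assumption: an unknown variable makes the clause false.
		if got, ok := vars[path]; !ok || got != want {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	// Annotations take part in variable resolution even though they are
	// not emitted as meta fields by default.
	vars := map[string]string{
		"kubernetes.annotations.level":   "production",
		"kubernetes.container.port_name": "web",
	}
	cond := "${kubernetes.annotations.level} == 'production' AND ${kubernetes.container.port_name} == 'web'"
	fmt.Println(evalCondition(cond, vars))
}
```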

Short-lived pods

Define a log input with a condition based on the container's name:

      - name: container-log
        type: logfile
        use_output: default
        meta:
          package:
            name: log
            version: 0.4.6
        data_stream:
          namespace: default
        streams:
          - data_stream:
              dataset: generic
            symlinks: true
            condition: ${kubernetes.container.name} == 'hello'
            paths:
              - /var/log/containers/*${kubernetes.container.id}.log

Deploy elastic-agent and verify that it collects logs for the following short-lived pod:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: mytarge
  labels:
    app: mytarget
spec:
  schedule: "*/1 * * * *"
  failedJobsHistoryLimit: 10
  successfulJobsHistoryLimit: 20
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure

Short-lived init containers are covered

---
apiVersion: v1
kind: Pod
metadata:
  name: mytarget
  labels:
    app: test
spec:
  initContainers:
    - name: test-init
      image: ubuntu:latest
      command:
        - bash
        - -c
        - |
          #!/bin/bash
          echo "$(date): started the INIT process"
          sleep 10
          echo "$(date): sleeping 10 INIT seconds"
  containers:
    - name: test
      image: ubuntu:latest
      command:
        - bash
        - -c
        - |
          #!/bin/bash
          echo "$(date): started the process"

          while :
          do
                 echo "$(date): sleeping 5 seconds"
                 sleep 5
          done

Ensure logs are captured even if the Pod is marked for deletion

---
apiVersion: v1
kind: Pod
metadata:
  name: mytarget2
  labels:
    app: test
spec:
  containers:
    - name: test
      image: ubuntu:latest
      command:
        - bash
        - -c
        - |
          #!/bin/bash
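          # Single-quote the handler so that $(date) is evaluated when the signal arrives, not when the trap is set.
          trap '{ echo "$(date): SIGTERM triggered"; sleep 3; echo "$(date): Bye bye"; sleep 5; echo "$(date): bye"; sleep 5; exit 1; }' SIGINT SIGTERM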
          echo "$(date): started the process"

          while :
          do
                 echo "$(date): sleeping 5 seconds"
                 sleep 5
          done

Related issues

Notes for the reviewer

Most of the functionality is ported from libbeat/autodiscover/providers/kubernetes and libbeat/common/kubernetes/metadata.

Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark ChrsMark self-assigned this Sep 1, 2021
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Sep 1, 2021
@ChrsMark ChrsMark added the Team:Integrations Label for the Integrations team label Sep 1, 2021
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Sep 1, 2021

elasticmachine commented Sep 1, 2021

💚 Build Succeeded

Build stats

  • Start Time: 2021-09-20T08:56:23.919+0000

  • Duration: 144 min 25 sec

  • Commit: e702b31

Test stats 🧪

Test Results
Failed 0
Passed 54015
Skipped 5327
Total 59342

💚 Flaky test report

Tests succeeded.


Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark ChrsMark added backport-v7.16.0 Automated backport with mergify v7.16.0 labels Sep 10, 2021
@ChrsMark
Member Author

/test

@ChrsMark ChrsMark marked this pull request as ready for review September 10, 2021 10:30
@elasticmachine
Collaborator

Pinging @elastic/integrations (Team:Integrations)

@ChrsMark
Member Author

I'm opening this for an early review. I still need to add and tune tests and do extensive manual testing, but an early review would be very welcome to spot any foundational issues early on.

Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark
Member Author

Heads-up on this:

Unit tests were tuned accordingly, and I'm running several manual testing scenarios to verify that nothing is broken. I plan to add e2e tests right after this one is merged (issue: elastic/e2e-testing#1090).

I have tried in particular to port the logic from the old autodiscovery feature to this one and to cover cases such as short-lived jobs.

Review parts:

  • @jsoriano it would be super helpful if you could review the discovery lifecycle (start/stop events, proper termination, etc.) since you worked on this recently :)
  • @blakerouse could you review this patch from the Agent's framework perspective?
  • @exekias one more review, since you initially shipped the k8s provider :) ?
  • @MichaelKatsoulis the provider will not add kubernetes.container.image, but you can verify the ECS parts, and feel free to review the patch in general.

Maybe in the same release we can tune the metadata part (in follow-ups) to align with the outcomes of #13911, and more specifically #16483 and #14875.

ExcludeLabels []string `config:"exclude_labels"`

LabelsDedot bool `config:"labels.dedot"`
AnnotationsDedot bool `config:"annotations.dedot"`
Contributor

Can this just be the default? Do we need this configurable now?

Member Author

I'm not sure about the history of these settings. Maybe some users had requests for these? @jsoriano @exekias do you think we could remove those settings and have dedoting as an always-on feature?
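
For context on what the `labels.dedot` and `annotations.dedot` settings control: dedotting replaces the dots in label and annotation keys so that a key such as `app.kubernetes.io/name` cannot be read as a nested object that clashes with a plain `app` key. A minimal sketch, assuming the replacement character is an underscore; `dedotKeys` is a hypothetical helper, not the libbeat function:

```go
package main

import (
	"fmt"
	"strings"
)

// dedotKeys returns a copy of m with every "." in a key replaced by "_".
func dedotKeys(m map[string]string) map[string]string {
	out := make(map[string]string, len(m))
	for k, v := range m {
		out[strings.ReplaceAll(k, ".", "_")] = v
	}
	return out
}

func main() {
	labels := map[string]string{
		"app":                    "redis",
		"app.kubernetes.io/name": "redis",
	}
	// Without dedotting, "app" would have to be both a string and an object.
	for k, v := range dedotKeys(labels) {
		fmt.Printf("%s=%s\n", k, v)
	}
}
```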

AnnotationsDedot bool `config:"annotations.dedot"`

// Undocumented settings, to be deprecated in favor of `drop_fields` processor:
IncludeCreatorMetadata bool `config:"include_creator_metadata"`
Contributor

If it is undocumented and deprecated, why add it? Since this is all new to Elastic Agent, this could be the time to remove it.

Member Author

You are right, this one is not exposed, so it should be removed.

Member Author

Actually, this touches code that is used by Beats too. I will remove it in a follow-up PR targeting 8.0.

Member Author

PR: #28006


AddResourceMetadata *metadata.AddResourceMetadataConfig `config:"add_resource_metadata"`
IncludeLabels []string `config:"include_labels"`
ExcludeLabels []string `config:"exclude_labels"`
Contributor

Where are IncludeLabels and ExcludeLabels used? I was not able to find them in the diff.

Member Author

These are passed to and used deeper in the meta generators, which are part of the existing codebase.
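
To make the intent of `include_labels` and `exclude_labels` concrete, here is a hedged sketch of one plausible filtering order: an empty include list keeps everything, a non-empty one keeps only the listed labels, and excludes are applied last. `filterLabels` is a hypothetical function, and the exact order used by the existing meta generators is an assumption here:

```go
package main

import "fmt"

// filterLabels applies an optional include list, then an exclude list.
func filterLabels(labels map[string]string, include, exclude []string) map[string]string {
	out := make(map[string]string)
	if len(include) == 0 {
		// No include list: start from all labels.
		for k, v := range labels {
			out[k] = v
		}
	} else {
		// Include list: keep only the listed labels that exist.
		for _, k := range include {
			if v, ok := labels[k]; ok {
				out[k] = v
			}
		}
	}
	// Excludes always win (assumption of this sketch).
	for _, k := range exclude {
		delete(out, k)
	}
	return out
}

func main() {
	labels := map[string]string{"app": "redis", "role": "main"}
	fmt.Println(len(filterLabels(labels, nil, nil)))              // both labels kept
	fmt.Println(filterLabels(labels, []string{"app"}, nil))       // only "app"
	fmt.Println(filterLabels(labels, nil, []string{"role"}))      // "role" dropped
}
```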

Member

@jsoriano jsoriano left a comment

@jsoriano it would be super helpful if you could review the discovery lifecycle start/stop events , proper termination etc since you had worked on this recently :)

This part looks good (waiting for e2e tests for confirmation 🙂), added only some thoughts, nothing really blocking.

@@ -9,6 +9,8 @@ import (

k8s "k8s.io/client-go/kubernetes"

"github.com/elastic/beats/v7/libbeat/common"

Member

Nit.

Suggested change

Comment on lines 10 to 11
"github.com/elastic/beats/v7/libbeat/common/kubernetes/metadata"
"github.com/elastic/beats/v7/libbeat/common/safemapstr"
Member

Nit. Move beats imports to the last group of imports.

for _, c := range pod.Spec.EphemeralContainers {
c := kubernetes.Container(c.EphemeralContainerCommon)
containers = append(containers, &containerInPod{spec: c})
}
Member

There are now two places where we collect all the containers in the pod and their statuses (the other here); this can be error-prone in future refactors. I wonder if this could be unified.

Member Author

Fair point. I will tune it! :)
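
One way the unification suggested here could look is a single helper that walks init, regular, and ephemeral containers and pairs each spec with its status. This is only a sketch: the types below are simplified stand-ins for the Kubernetes API types, and `allContainers` and the `kind` field are hypothetical, not what the PR ended up with:

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes API types (hypothetical).
type Container struct{ Name string }

type ContainerStatus struct {
	Name        string
	ContainerID string
}

type Pod struct {
	InitContainers, Containers, EphemeralContainers []Container
	InitStatuses, Statuses, EphemeralStatuses       []ContainerStatus
}

// containerInPod pairs a container spec with what is known from its status.
type containerInPod struct {
	spec Container
	id   string // empty until the runtime has created the container
	kind string // "init", "regular", or "ephemeral"
}

// allContainers is the single place that enumerates every container in a pod.
func allContainers(pod Pod) []containerInPod {
	groups := []struct {
		kind     string
		specs    []Container
		statuses []ContainerStatus
	}{
		{"init", pod.InitContainers, pod.InitStatuses},
		{"regular", pod.Containers, pod.Statuses},
		{"ephemeral", pod.EphemeralContainers, pod.EphemeralStatuses},
	}
	var out []containerInPod
	for _, g := range groups {
		// Index statuses by container name for this group.
		ids := make(map[string]string, len(g.statuses))
		for _, s := range g.statuses {
			ids[s.Name] = s.ContainerID
		}
		for _, c := range g.specs {
			out = append(out, containerInPod{spec: c, id: ids[c.Name], kind: g.kind})
		}
	}
	return out
}

func main() {
	pod := Pod{
		InitContainers: []Container{{Name: "test-init"}},
		Containers:     []Container{{Name: "test"}},
		Statuses:       []ContainerStatus{{Name: "test", ContainerID: "abc123"}},
	}
	for _, c := range allContainers(pod) {
		fmt.Printf("%s kind=%s id=%q\n", c.spec.Name, c.kind, c.id)
	}
}
```

Both call sites would then range over the same slice, so a new container type would only need to be added in one place.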

Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark ChrsMark added the test-plan Add this PR to be manual test plan label Sep 15, 2021
@ChrsMark
Member Author

@jsoriano @blakerouse I tested this extensively (manually) and I think it is good to go. Would you mind giving it another review?

Next steps will be to add e2e tests (in this iteration/release) and verify that we are aligned with #13911 (cc @MichaelKatsoulis).

@ChrsMark
Member Author

/test

Signed-off-by: chrismark <chrismarkou92@gmail.com>
Contributor

@blakerouse blakerouse left a comment

Overall this looks good to me.

I would prefer to see the dedot feature just set as the default and the options to turn it off removed. But I will leave that up to you to determine.

Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ChrsMark ChrsMark merged commit 46d17b4 into elastic:master Sep 20, 2021
mergify bot pushed a commit that referenced this pull request Sep 20, 2021
v1v added a commit to v1v/beats that referenced this pull request Sep 20, 2021
* upstream/master: (658 commits)
  Add complete k8s metadata through composable provider (elastic#27691)
  Revert "Fix issue where --insecure didn't propogate to Fleet Server ES connection (elastic#27969)" (elastic#27997)
  Remove deprecated kafka fields (elastic#27938)
  [Filebeat] Add Base64 encoded HMAC & UUID template functions to httpjson input (elastic#27873)
  Improve httpjson template function join (elastic#27996)
  Remove kubernetes.container.image alias (elastic#27898)
  [Elastic Agent] Golden files for program tests (elastic#27862)
  [Elastic Agent] Disable modules.d in metricbeat (elastic#27860)
  libbeat/common/seccomp: provide default policy for linux arm64 (elastic#27955)
  Fix logger statement in aws-s3 input (elastic#27982)
  Fix wrong merge (elastic#27976)
  Fix issue where --insecure didn't propogate to Fleet Server ES connection (elastic#27969)
  Forward-port 7.14.2 changelog to master (elastic#27975)
  [Filebeat] Removing duplicate modules (aliases) Observability (elastic#27919)
  Fix path in vagrant windows script (elastic#27966)
  [Filebeat] Removing duplicate modules (aliases) and Cyberark (elastic#27915)
  No changelog for 8.0.0-alpha2 (elastic#27961)
  Add write access to 'url.value' from 'request.transforms'. (elastic#27937)
  Docker: remove deprecated fields (elastic#27933)
  Filebeat: Make all filesets disabled in default configuration (elastic#27762)
  ...
ChrsMark added a commit that referenced this pull request Sep 21, 2021
(cherry picked from commit 46d17b4)

Co-authored-by: Chris Mark <chrismarkou92@gmail.com>
Icedroid pushed a commit to Icedroid/beats that referenced this pull request Nov 1, 2021
Labels
backport-v7.16.0 Automated backport with mergify Team:Integrations Label for the Integrations team test-plan Add this PR to be manual test plan v7.16.0
4 participants