[Elastic Agent] [Meta] Gathering logs and metrics from Kubernetes without using fleet generated agent config and without central agent management via fleet #23613

Closed
18 of 19 tasks
blakerouse opened this issue Jan 21, 2021 · 15 comments
Labels: meta, Team:Elastic-Agent, Team:Integrations

Comments

@blakerouse
Contributor

blakerouse commented Jan 21, 2021

Requirements

We expect Elastic Agent inside of Kubernetes in standalone mode (no connection to fleet-server, not using fleet-server to update agent config) to collect:

  • metrics and logs for Kubernetes components: the master node and the processes it runs, and the worker node and the processes it runs
  • system metrics and logs of master and worker nodes
  • pod metrics and logs, sent to ES via auto-discovery through dynamic inputs and hints-based discovery, e.g. nginx running in K8s pods can have its metrics and logs sent to ES automatically as long as the agent policy contains the appropriate conditional logic

In addition, we will provide:

  • A K8s integration that the operator needs to install to observe the K8s master and worker nodes and the infrastructure running the K8s cluster; when installed, it will display the K8s dashboards we provide today with Metricbeat
  • A documentation guide that walks through the steps required to enable K8s and pod-level observability with Elastic Agent in standalone mode

Note:
This issue explicitly ignores Fleet Mode. Once standalone mode is working and documented, Fleet Mode will be the next target.

Core Issues

Logs & Metrics

Need to be investigated

@blakerouse added the Team:Elastic-Agent label Jan 21, 2021
@elasticmachine
Collaborator

Pinging @elastic/agent (Team:Agent)

@elasticmachine
Collaborator

Pinging @elastic/integrations (Team:Integrations)

@mukeshelastic

@blakerouse thanks so much for writing this issue. I think this covers the K8s control plane observability for worker and master nodes very well. Does it also cover etcd observability? As for pod-level logs and metrics collection, what use cases does that cover? For example, if a pod runs an nginx server, are we able to collect nginx metrics and logs for it? What does the user need to do to enable that? Also, if it's easier, I am happy to divide the issue into two separate ones: one for control plane observability and the other for pod-level observability.

@mukeshelastic changed the title from "[Elastic Agent] [Meta] Kubernetes Standalone Mode" to "[Elastic Agent] [Meta] Gathering logs and metrics from Kubernetes without requiring connections to fleet-server" Jan 25, 2021
@ChrsMark
Member

Hi @mukeshelastic! If I'm not mistaken, pod-level observability will be covered by Dynamic Inputs, and more specifically the kubernetes provider (#21480); this is the former Autodiscover feature (hints not included yet). I tried to leverage this functionality in #23618 and #23679 to dynamically collect metrics from k8s core components that are deployed as pods on master nodes. If we manage to make this use case work, then we verify that we can support collecting metrics/logs from every pod dynamically, as long as it is covered by conditions in the Agent's config, since we don't support hints yet. I think we can add this as an extra/separate point in this issue's description.
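
To make the idea of condition-based dynamic inputs concrete, here is a minimal Python sketch. It is not the Agent's implementation or its config format: the pod records, the `nginx_template` dictionary, and the `render_inputs` helper are all hypothetical names chosen for illustration. It only shows the general pattern of expanding an input template once for every discovered pod whose metadata satisfies a condition.

```python
from string import Template

# Hypothetical pod metadata, standing in for what a Kubernetes
# provider might report after discovering pods on a node.
pods = [
    {"name": "web-1", "ip": "10.0.0.5", "labels": {"app": "nginx"}},
    {"name": "db-1", "ip": "10.0.0.9", "labels": {"app": "postgres"}},
]

# Hypothetical input template: a condition on pod labels plus a host
# pattern that is filled in from the matching pod's metadata.
nginx_template = {
    "type": "nginx/metrics",
    "condition": lambda pod: pod["labels"].get("app") == "nginx",
    "host": Template("http://$ip:80"),
}


def render_inputs(templates, pods):
    """Return one concrete input per (template, pod) pair whose condition holds."""
    rendered = []
    for template in templates:
        for pod in pods:
            if template["condition"](pod):
                rendered.append({
                    "type": template["type"],
                    "pod": pod["name"],
                    "host": template["host"].substitute(ip=pod["ip"]),
                })
    return rendered


inputs = render_inputs([nginx_template], pods)
print(inputs)
```

In this sketch only the `web-1` pod yields an input, because it is the only one whose labels satisfy the condition. Hints-based discovery would differ in that the pod itself, rather than the agent config, would declare how it should be monitored.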

In this regard, we might need to cover this in our docs soon (#21848) cc: @ph

@ph
Contributor

ph commented Feb 2, 2021

@ChrsMark @blakerouse if you find any issues related to the implementation, please make sure to add them to this meta issue so we can prioritize them.

@ChrsMark
Member

ChrsMark commented Feb 18, 2021

@blakerouse, about point no. 1: are we sure that we want it? We don't have something like this enabled by default right now in Filebeat's manifests.

cc: @masci @ruflin

@blakerouse
Contributor Author

@ChrsMark Are you talking about System-level log collection? I think it would be great to have. Log collection of the underlying host is just as important as collecting logs about the cluster and the workloads.

@ChrsMark
Member

> @ChrsMark Are you talking about System-level log collection? I think it would be great to have. Log collection of the underlying host is just as important as collecting logs about the cluster and the workloads.

Yep, OK, then I agree it would be a good insight. Let's add it then.

@mukeshelastic

There are a few different layers for K8s here: the system host that runs an OS, the K8s worker node, pods, and containers. Getting metrics and logs for each of them is valuable. Assuming your question was about the "system host", yes, it will be valuable to get those logs for troubleshooting system issues that may cause problems for the pods running on it. I am assuming these logs are the same as what we'd get by running the system integration.

@mukeshelastic changed the title from "[Elastic Agent] [Meta] Gathering logs and metrics from Kubernetes without requiring connections to fleet-server" to "[Elastic Agent] [Meta] Gathering logs and metrics from Kubernetes without using fleet generated agent config or without central agent management via fleet" Feb 19, 2021
@mukeshelastic changed the title from "[Elastic Agent] [Meta] Gathering logs and metrics from Kubernetes without using fleet generated agent config or without central agent management via fleet" to "[Elastic Agent] [Meta] Gathering logs and metrics from Kubernetes without using fleet generated agent config and without central agent management via fleet" Feb 19, 2021
@ChrsMark
Member

ChrsMark commented Feb 25, 2021

@ruflin @ph @masci Related to ph's comment: since we now have most of the list completed, what do you think about backporting the manifest to 7.13?

Sample PR: #24231

@ruflin
Member

ruflin commented Feb 25, 2021

++

@ph
Contributor

ph commented Feb 25, 2021

@ChrsMark +++ Please let's do it, this is super amazing. Thanks for the push on this.

@ChrsMark
Member

👍🏼 Back-ported to 7.x with #24231

@blakerouse
Contributor Author

There is only one issue left that is under "Needs Investigation":

I don't know if hints support is required for this to be considered complete, but we have made great progress on this issue.

@blakerouse
Contributor Author

Since everything on this issue is fixed except hints support, which has other requirements, I am going to close this issue.


7 participants