Add Agent standalone k8s manifest #23679
Conversation
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Pinging @elastic/integrations (Team:Integrations)
💚 Build Succeeded
@ChrsMark Let's file an issue for the dynamic inputs piece. You are using it correctly here so it should work; we need to track down and fix why it is not working on the Agent side. There is no way to disable dynamic inputs in Agent either.
Signed-off-by: chrismark <chrismarkou92@gmail.com>
@ruflin @blakerouse Heads-up on this: after pulling the latest changes from #23886 (thanks @blakerouse!) it finally works and collects metrics from all the k8s data streams. This one also proves that … @blakerouse can you share your thoughts here please? After this one is in we can add parts for system metrics and container logs (better to split them into different PRs).
Signed-off-by: chrismark <chrismarkou92@gmail.com>
Looks great! Glad to see this is working.
@ChrsMark I think with the new hostfs work you did on the inputs, gathering system metrics from the nodes should be possible.
Yeap, this will be the next one coming.
Merging this one and let's iterate on it with follow-up PRs to add more functionality.
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: elastic-agent
          image: docker.elastic.co/beats/elastic-agent:7.12.0-SNAPSHOT
Split this manifest into different files and use the `%VERSION%` placeholder as done in other Beats.
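For illustration only, a minimal sketch of how such a placeholder could be rendered at deploy time; the sed-based substitution, version, and file name here are assumptions, not the repo's actual tooling:

```sh
# Render the manifest by substituting the %VERSION% placeholder before applying it.
# The concrete version and manifest file name are placeholders for this sketch.
sed "s/%VERSION%/7.12.0/g" elastic-agent-standalone-kubernetes.yml | kubectl apply -f -
```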
            runAsUser: 0
          resources:
            limits:
              memory: 200Mi
If this pod is going to run metricbeat and filebeat we may need to increase this limit. This is the limit used by default for a single beat.
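For reference, a hedged sketch of what a bumped limit could look like if the Pod ends up running both metrics and log collection; the numbers are illustrative only and not values agreed in this PR:

```yaml
resources:
  limits:
    memory: 500Mi   # illustrative; would need to be validated for combined metrics + logs
  requests:
    cpu: 100m
    memory: 300Mi
```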
            - >-
              ${ES_HOST}
          username: ${ES_USERNAME}
          password: ${ES_PASSWORD}
Add settings also for cloud id and auth?
Actually, this is how the standalone config produced by Cloud looks. But sure, we can update it (not sure if cloud settings are available on Agent though).
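If Agent were to follow the Beats convention here, the settings would hypothetically look like the sketch below (cloud id/auth fed from env vars); as noted in this thread, it is not confirmed that standalone Agent honours these settings:

```yaml
# Hypothetical, following the Beats k8s manifests convention; not confirmed for standalone Agent.
cloud.id: ${ELASTIC_CLOUD_ID}
cloud.auth: ${ELASTIC_CLOUD_AUTH}
```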
    k8s-app: elastic-agent
data:
  agent.yml: |-
    id: ef9cc740-5bf0-11eb-8b51-39775155c3f5
What is this id? Is it OK to have the same one for all agents running in the cluster?
@blake do you think this would be a problem?
comment from working PR to track: #23938 (comment)
        data_stream:
          dataset: kubernetes.pod
          type: metrics
        metricsets:
          - pod
This is interesting: there are several metricsets with the same configuration, but they are defined as different streams. Is this because they need to have different `data_stream.dataset`? Is this translated to one module configuration per metricset in the Metricbeat config?
This can be relevant for possible uses of logic at the module level, e.g. the `cloudfoundry` module keeps a single connection at the module level for many metricsets, and we could do a similar thing with the `state_*` metricsets of kubernetes to avoid making the same big request per metricset.
Yeap, that's how the standalone config looks after being exported from the Fleet UI. Regarding the fetch optimisation, this is a known enhancement filed at elastic/integrations#601 (point no. 2).
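For context, a trimmed, illustrative sketch of how such an exported config groups per-metricset streams under one input, each stream carrying its own `data_stream.dataset`; the hosts and period values are assumptions for this sketch, not the exact values in the manifest:

```yaml
inputs:
  - type: kubernetes/metrics
    data_stream:
      namespace: default
    streams:
      - data_stream:
          dataset: kubernetes.node
          type: metrics
        metricsets:
          - node
        hosts:
          - "https://${NODE_NAME}:10250"
        period: 10s
      - data_stream:
          dataset: kubernetes.pod
          type: metrics
        metricsets:
          - pod
        hosts:
          - "https://${NODE_NAME}:10250"
        period: 10s
```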
What does this PR do?
This PR adds a k8s manifest for running Elastic Agent in standalone mode with the k8s integration enabled by default. It deploys Agent as DaemonSet Pods on all k8s nodes and as a Deployment Pod on the cluster. The DaemonSet Pods are responsible for collecting metrics from the node's `kubelet` API and `kubeproxy` metrics, and they try to autodiscover the k8s Scheduler Pod and the k8s Controller Manager Pod (which are deployed on the master node(s)) and start collecting from them dynamically using the respective metricsets. The Deployment Pod is responsible for collecting cluster-wide metrics from the `kube_state_metrics` service running on the cluster.

@blakerouse @masci @ph @ruflin I would love your feedback here.
Disclaimer: the manifest works if we disable the dynamic inputs part. Find full information about the issues at the bottom of this description: #23685
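As a rough orientation only (not the exact manifest added by this PR; the names, namespace, and labels below are assumptions), the layout described above boils down to two workloads sharing the same Agent image:

```yaml
apiVersion: apps/v1
kind: DaemonSet        # per-node collection: kubelet, kubeproxy, scheduler/controller-manager autodiscovery
metadata:
  name: elastic-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: elastic-agent
  template:
    metadata:
      labels:
        app: elastic-agent
    spec:
      containers:
        - name: elastic-agent
          image: docker.elastic.co/beats/elastic-agent:7.12.0-SNAPSHOT
---
apiVersion: apps/v1
kind: Deployment       # single cluster-wide collector: kube_state_metrics
metadata:
  name: elastic-agent-cluster
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: elastic-agent-cluster
  template:
    metadata:
      labels:
        app: elastic-agent-cluster
    spec:
      containers:
        - name: elastic-agent
          image: docker.elastic.co/beats/elastic-agent:7.12.0-SNAPSHOT
```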
How to test this PR locally
1. `kind create cluster --config kind-mutly.yaml`
2. Uncomment the `scheduler` and `controllermanager` config sections and deploy Agent: `kubectl apply -f elastic-agent-standalone-kubernetes.yml`
3. Verify that all data streams ship data:
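One way to do that spot-check, assuming the Pods run in `kube-system` with an `app: elastic-agent` label and the default `metrics-kubernetes.*` data stream naming (all of these are assumptions, adjust to your setup):

```sh
# Check that the DaemonSet and Deployment Pods are up.
kubectl get pods -n kube-system -l app=elastic-agent

# Check that the kubernetes data streams exist and are being written to,
# using the same Elasticsearch host and credentials the Agent ships to.
curl -s -u "${ES_USERNAME}:${ES_PASSWORD}" "${ES_HOST}/_data_stream/metrics-kubernetes.*"
```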
Related issues
Open Issues
- Dynamic inputs setup to automatically discover the `scheduler` and `controllermanager` Pods does not completely work right now and fails with an error. Converting the `${NODE_NAME}` placeholders to `${env.NODE_NAME}` does not fix the problem, and even if we remove all the other data stream configs and leave only the dynamic one it still gives the error. In addition, if we remove the dynamic inputs part and keep `${env.NODE_NAME}` we still get the same error (see the sketch after this list for how NODE_NAME is typically injected). Given this, there might be a bug in Agent which does not allow us to combine these 2 configuration approaches.
- After deploying the manifests the package is not automatically installed and requires the user to manually install it from the Fleet UI. This is already known but I'm putting it here for reference.
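For context on the `${env.NODE_NAME}` variant mentioned above: `env.*` references resolve from the container environment, so they only work if the Pod spec injects the node name, typically via the downward API as in the sketch below (assumed wiring, not verified against this manifest):

```yaml
# Assumed Pod spec wiring: expose the node name as an env var so that
# ${env.NODE_NAME} can resolve inside the Agent container.
env:
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
```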