Only watch metadata for ReplicaSets in K8s provider #5699
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
    SyncTimeout: cfg.SyncPeriod,
    Namespace: cfg.Namespace,
    HonorReSyncs: true,
}, nil, metadata.RemoveUnnecessaryReplicaSetData)
So the whole idea here is based on this function https://github.com/elastic/elastic-agent-autodiscover/pull/111/files#diff-745348e532593174e8280a273af14d0a76f379bbeb48e782d66c653e4e36d994R103 that computes only the needed metadata.
Quick question: why did we need to create a specific watcher, NewNamedMetadataWatcher? Could we not retrieve the same info with the old client?
The idea is not only the transform func. Another essential bit is the use of PartialObjectMetadata, which a metadata-based client requests from the API server for this type. A relevant comment can be found here 🙂
The old client operates on whole resources. It always fetches the entire ReplicaSet resource, and the informer gets an update whenever anything in that resource changes, for example when it scales up or down. For each such update, we need to deserialize the whole resource into memory, and then we only make use of the name and owner references. In a busy cluster, this adds up to a lot of memory churn that is completely unnecessary.
The new watcher only subscribes to changes to metadata, so it sidesteps the problem. However, to achieve this, we need a special K8s client which only fetches metadata, and a special informer that operates on PartialObjectMetadata structs. This is what the new watcher is for.
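To make the transform step concrete, here is a minimal sketch of what trimming a metadata-only object could look like. It uses simplified stand-in types rather than the real `metav1.PartialObjectMetadata`, and `stripReplicaSetMetadata` is a hypothetical name, not the actual `RemoveUnnecessaryReplicaSetData` implementation. Which fields to keep (name, namespace, owner references) is an assumption based on the discussion above.

```go
package main

import "fmt"

// OwnerReference and PartialObjectMetadata are simplified stand-ins for the
// Kubernetes metadata types; they are NOT the real apimachinery definitions.
type OwnerReference struct {
	Kind       string
	Name       string
	Controller bool
}

type PartialObjectMetadata struct {
	Name            string
	Namespace       string
	Labels          map[string]string
	Annotations     map[string]string
	OwnerReferences []OwnerReference
}

// stripReplicaSetMetadata is a hypothetical transform function. An informer
// would call something like it on each object before caching it, so only the
// trimmed copy stays in memory.
func stripReplicaSetMetadata(obj any) (any, error) {
	meta, ok := obj.(*PartialObjectMetadata)
	if !ok {
		// Unexpected types are passed through unchanged.
		return obj, nil
	}
	// Keep only what is needed to identify the object and find its owner.
	return &PartialObjectMetadata{
		Name:            meta.Name,
		Namespace:       meta.Namespace,
		OwnerReferences: meta.OwnerReferences,
	}, nil
}

func main() {
	rs := &PartialObjectMetadata{
		Name:        "web-5d4f8c7b9",
		Namespace:   "default",
		Labels:      map[string]string{"app": "web", "pod-template-hash": "5d4f8c7b9"},
		Annotations: map[string]string{"deployment.kubernetes.io/revision": "3"},
		OwnerReferences: []OwnerReference{
			{Kind: "Deployment", Name: "web", Controller: true},
		},
	}
	out, err := stripReplicaSetMetadata(rs)
	if err != nil {
		panic(err)
	}
	trimmed := out.(*PartialObjectMetadata)
	fmt.Printf("%s/%s owners=%d labels=%d annotations=%d\n",
		trimmed.Namespace, trimmed.Name,
		len(trimmed.OwnerReferences), len(trimmed.Labels), len(trimmed.Annotations))
}
```

The metadata-only client already avoids fetching spec and status. A transform like this additionally drops labels and annotations, which can be sizeable across thousands of ReplicaSets.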
LGTM
/test
This pull request is now in conflicts. Could you fix it? 🙏
Use a metadata informer for ReplicaSets. Use a transform function to drop all data except for the owner references, which we need to find the Deployment name.
Quality Gate passed
Use a metadata informer for ReplicaSets. Use a transform function to drop all data except for the owner references, which we need to find the Deployment name. (cherry picked from commit 05f9e81)
Use a metadata informer for ReplicaSets. Use a transform function to drop all data except for the owner references, which we need to find the Deployment name. (cherry picked from commit 05f9e81) Co-authored-by: Mikołaj Świątek <mail@mikolajswiatek.com>
What does this PR do?
Use a metadata watcher for ReplicaSets in the K8s provider. The only data we need from ReplicaSets are their name and OwnerReferences, which are used to connect Pods to Deployments and DaemonSets.
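As an illustration of why the name and owner references are enough, the sketch below walks from a Pod to its Deployment through the owning ReplicaSet. The types, the `deploymentForPod` helper, and the map keyed by `namespace/name` are all hypothetical stand-ins for the provider's real metadata types and informer store.

```go
package main

import "fmt"

// ownerRef and objectMeta are simplified, hypothetical stand-ins for the
// Kubernetes owner reference and object metadata types.
type ownerRef struct {
	Kind string
	Name string
}

type objectMeta struct {
	Namespace string
	Name      string
	Owners    []ownerRef
}

// deploymentForPod resolves Pod -> ReplicaSet -> Deployment using nothing but
// names and owner references. replicaSets stands in for an informer store and
// is keyed by "namespace/name".
func deploymentForPod(pod objectMeta, replicaSets map[string]objectMeta) (string, bool) {
	for _, podOwner := range pod.Owners {
		if podOwner.Kind != "ReplicaSet" {
			continue
		}
		rs, ok := replicaSets[pod.Namespace+"/"+podOwner.Name]
		if !ok {
			continue
		}
		for _, rsOwner := range rs.Owners {
			if rsOwner.Kind == "Deployment" {
				return rsOwner.Name, true
			}
		}
	}
	return "", false
}

func main() {
	replicaSets := map[string]objectMeta{
		"default/web-5d4f8c7b9": {
			Namespace: "default",
			Name:      "web-5d4f8c7b9",
			Owners:    []ownerRef{{Kind: "Deployment", Name: "web"}},
		},
	}
	pod := objectMeta{
		Namespace: "default",
		Name:      "web-5d4f8c7b9-abcde",
		Owners:    []ownerRef{{Kind: "ReplicaSet", Name: "web-5d4f8c7b9"}},
	}
	name, found := deploymentForPod(pod, replicaSets)
	fmt.Println(name, found)
}
```

Nothing in this lookup touches the ReplicaSet spec or status, which is why a metadata-only watcher can serve it.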
Why is it important?
This significantly reduces both the memory used to store ReplicaSet data and the temporary allocations needed to process update events from the API Server. See the linked issue for more detailed information.
Checklist
Changelog entry in ./changelog/fragments, added using the changelog tool
How to test this PR locally
Showing that the change doesn't cause a regression requires starting agent in K8s and enabling deployment metadata in the K8s provider, and then using it in a templated input definition - this is simplest with filebeat.
Demonstrating the memory consumption improvement is more involved: you need to create a large number of ReplicaSets in the cluster. I saw a noticeable difference at ~7500; see #5623 for more details.
Related issues