Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only watch metadata for ReplicaSets in K8s provider #5699

Merged
merged 1 commit into from
Oct 8, 2024

Conversation

swiatekm
Copy link
Contributor

@swiatekm swiatekm commented Oct 4, 2024

What does this PR do?

Use a metadata watcher for ReplicaSets in the K8s provider. The only data we need from ReplicaSets are their name and OwnerReferences, which are used to connect Pods to Deployments and DaemonSets.

Why is it important?

This significantly reduces both memory used to store ReplicaSet data, and temporary allocations to process update events from the API Server. See the linked issue for more detailed information.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have added an entry in ./changelog/fragments using the changelog tool

How to test this PR locally

Showing that the change doesn't cause a regression requires starting agent in K8s and enabling deployment metadata in the K8s provider, and then using it in a templated input definition - this is simplest with filebeat.

Demonstrating the memory consumption improvement is more involved - you need to create a significant amount of ReplicaSets in the cluster. I saw a noticable difference at ~7500, see #5623 for more details.

Related issues

@swiatekm swiatekm added enhancement New feature or request backport-8.15 Automated backport to the 8.15 branch with mergify backport-8.x Automated backport to the 8.x branch with mergify labels Oct 4, 2024
@swiatekm swiatekm requested review from a team as code owners October 4, 2024 10:58
@swiatekm swiatekm force-pushed the k8sprovider-replicaset-onlymeta branch 2 times, most recently from c4d31d9 to f02dc75 Compare October 4, 2024 11:11
@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Oct 4, 2024
@elasticmachine
Copy link
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@swiatekm swiatekm force-pushed the k8sprovider-replicaset-onlymeta branch from f02dc75 to 3979fa7 Compare October 4, 2024 11:31
SyncTimeout: cfg.SyncPeriod,
Namespace: cfg.Namespace,
HonorReSyncs: true,
}, nil, metadata.RemoveUnnecessaryReplicaSetData)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the whole idea here is based on this fucntion https://github.com/elastic/elastic-agent-autodiscover/pull/111/files#diff-745348e532593174e8280a273af14d0a76f379bbeb48e782d66c653e4e36d994R103 that computes only the needed metadata.

Quick question: Why we needed to create a specific watcher NewNamedMetadataWatcher and we could not retrieve the same info with the old client?

Copy link
Contributor

@pkoutsovasilis pkoutsovasilis Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idea is not only the transform func, another essential bit is the use of PartialObjectMetadata, which a metadata-based client is requesting for this type from the API server. A relevant comment can be found here 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old client operates on whole resources. So it always fetches the entire ReplicaSet resource, and the informer gets an update whenever anything in that resource changes - for example when it scales up or down. For each such update, we need to deserialize the whole resource into memory, and then we only make use of the name and owner references. In a busy cluster, this adds up to a lot of memory churn that is completely unnecessary.

The new watcher only subscribes to changes to metadata, so it sidesteps the problem. However, to achieve this, we need a special K8s client which only fetches metadata, and a special informer that operates on PartialObjectMetadata structs. This is what the new watcher is for.

Copy link
Contributor

@pkoutsovasilis pkoutsovasilis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@swiatekm
Copy link
Contributor Author

swiatekm commented Oct 4, 2024

/test

@swiatekm swiatekm force-pushed the k8sprovider-replicaset-onlymeta branch from 3979fa7 to 6fcd64f Compare October 4, 2024 16:51
Copy link
Contributor

mergify bot commented Oct 7, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b k8sprovider-replicaset-onlymeta upstream/k8sprovider-replicaset-onlymeta
git merge upstream/main
git push upstream k8sprovider-replicaset-onlymeta

Use a metadata informer for ReplicaSets.
Use a transform function to drop all data except for the owner
references, which we need to find the Deployment name.
@swiatekm swiatekm force-pushed the k8sprovider-replicaset-onlymeta branch from c794854 to 4c14378 Compare October 7, 2024 18:22
@swiatekm swiatekm merged commit 05f9e81 into main Oct 8, 2024
14 checks passed
@swiatekm swiatekm deleted the k8sprovider-replicaset-onlymeta branch October 8, 2024 08:03
mergify bot pushed a commit that referenced this pull request Oct 8, 2024
Use a metadata informer for ReplicaSets.
Use a transform function to drop all data except for the owner
references, which we need to find the Deployment name.

(cherry picked from commit 05f9e81)
mergify bot pushed a commit that referenced this pull request Oct 8, 2024
Use a metadata informer for ReplicaSets.
Use a transform function to drop all data except for the owner
references, which we need to find the Deployment name.

(cherry picked from commit 05f9e81)
swiatekm added a commit that referenced this pull request Oct 8, 2024
Use a metadata informer for ReplicaSets.
Use a transform function to drop all data except for the owner
references, which we need to find the Deployment name.

(cherry picked from commit 05f9e81)
swiatekm added a commit that referenced this pull request Oct 8, 2024
Use a metadata informer for ReplicaSets.
Use a transform function to drop all data except for the owner
references, which we need to find the Deployment name.

(cherry picked from commit 05f9e81)

Co-authored-by: Mikołaj Świątek <mail@mikolajswiatek.com>
swiatekm added a commit that referenced this pull request Oct 9, 2024
Use a metadata informer for ReplicaSets.
Use a transform function to drop all data except for the owner
references, which we need to find the Deployment name.

(cherry picked from commit 05f9e81)
swiatekm added a commit that referenced this pull request Oct 9, 2024
Use a metadata informer for ReplicaSets.
Use a transform function to drop all data except for the owner
references, which we need to find the Deployment name.

(cherry picked from commit 05f9e81)

Co-authored-by: Mikołaj Świątek <mail@mikolajswiatek.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-8.x Automated backport to the 8.x branch with mergify backport-8.15 Automated backport to the 8.15 branch with mergify enhancement New feature or request Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants