Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: fix manager memory usage #965

Merged
merged 53 commits into from
Feb 1, 2024

Conversation

ashnamehrotra
Copy link
Contributor

What this PR does / why we need it:
Utilizes cache.BuilderWithOptions in the manager to free up space from watching unnecessary resources and changes ImageJob controller to use nodeNames to retrieve each node rather than storing all nodes in a list.

Which issue(s) this PR fixes (optional, using fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when the PR gets merged):
Fixes #798

Special notes for your reviewer:
Eraser load test results for cluster with 500 vuln images, 50 nodes, cpu request unset, and high usage daemonset deployed

Before changes:
MAX CPU Manager: 973.889567 m
MAX MEM Manager: 351.07 Mi

After changes:
MAX CPU Manager: 441.082336 m
MAX MEM Manager: 32.94 Mi

@sozercan
Copy link
Member

@ashnamehrotra this is awesome! looks like e2e has a conflict due to artifact names and please sign the DCO

ashnamehrotra and others added 27 commits January 25, 2024 09:43
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ev#899)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Fabian Gonzalez <fabiangonz98@gmail.com>
Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ocs (eraser-dev#923)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Co-authored-by: ashnamehrotra <ashnamehrotra@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ser-dev#945)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com>
Co-authored-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…-dev#919)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xander Grzywinski <xandergr@microsoft.com>
Co-authored-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ser-dev#921)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xander Grzywinski <xandergr@microsoft.com>
Co-authored-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
dependabot bot and others added 12 commits January 25, 2024 09:50
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
SelectorsByObject: cache.SelectorsByObject{
// to watch eraser pods
&corev1.Pod{}: {
Field: fields.OneTermEqualSelector("metadata.namespace", utils.GetNamespace()),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is still making a cluster scoped watch and having API server filter them down. You could confirm that one way or the other by reducing the RBAC of the eraser components to use role bindings to properly scope the access.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RBAC is already scoped to the namespace for the manager role, but removing the field to filter by namespace increases the memory again. I think that could be because for the manager we have ClusterRole instead of Role, but we can't change that since we need to watch ImageJobs/ImageLists in default and eraser pods in eraser-system.

It was the same problem testing with MultiNamespacedCacheBuilder and keeping only the eraser-system and default namespaces, it didn't have an affect on memory consumption.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should create separate cluster roles/roles/cluster roles bindings/role bindings to express the exact permissions of each component. There is no requirement that everything be done via a single cluster role/cluster role binding.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://youtu.be/Nw1ymxcLIDI is a good resource to understand RBAC fundamentals.

Copy link
Contributor Author

@ashnamehrotra ashnamehrotra Jan 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you this was really helpful! I changed the role to create a separate Role and ClusterRole

@sozercan sozercan requested a review from enj January 26, 2024 18:26
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Copy link

@enj enj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, RBAC LGTM.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cluster role in manifest_staging/charts/eraser/templates/eraser-imagejob-pods-cluster-role-clusterrole.yaml is empty, should and the associated binding it be deleted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup we should only need the service account

- list
- patch
- update
- watch
- apiGroups:
- eraser.sh
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the eraser resources, does your controller need access to create and similar permissions on the resource itself (i.e. not /status)?

Copy link
Contributor Author

@ashnamehrotra ashnamehrotra Jan 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does for ImageJob, but not for ImageList. Removed create/delete/update/patch permissions for ImageList. Removed update/patch permissions for ImageJob.

metadata:
name: manager-rolebinding
namespace: system
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be eraser-system?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by default the roles are in "system" and during deployment it changes to "eraser-system" since the namespace is configurable

Copy link
Member

@sozercan sozercan Jan 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kubebuilder/kustomize automatically prefixes these

namePrefix: eraser-

Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
@@ -189,9 +189,9 @@ func checkNodeFitness(pod *corev1.Pod, node *corev1.Node) bool {
}

//+kubebuilder:rbac:groups=eraser.sh,resources=imagejobs,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups="",resources=podtemplates,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups="",namespace="system",resources=podtemplates,verbs=get;list;watch;create;update;patch;delete
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make sure system doesn't break since kubebuilder adds that prefix in generation but not sure if it applies here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this only generates the files under rbac/ which all use system. Then in manifest staging/ it is replaced to eraser-system through the templates.

Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Copy link
Member

@sozercan sozercan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! LGTM

@ashnamehrotra ashnamehrotra merged commit 1e4401b into eraser-dev:main Feb 1, 2024
178 checks passed
ashnamehrotra added a commit to ashnamehrotra/eraser that referenced this pull request Feb 1, 2024
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Fabian Gonzalez <fabiangonz98@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com>
Co-authored-by: Fabian Gonzalez <fabiangonz98@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Xander Grzywinski <xandergr@microsoft.com>
ashnamehrotra added a commit that referenced this pull request Feb 1, 2024
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Fabian Gonzalez <fabiangonz98@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com>
Co-authored-by: Fabian Gonzalez <fabiangonz98@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Xander Grzywinski <xandergr@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fix manager memory usage
4 participants