-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: fix manager memory usage #965
feat: fix manager memory usage #965
Conversation
@ashnamehrotra this is awesome! looks like e2e has a conflict due to artifact names and please sign the DCO |
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ev#899) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Fabian Gonzalez <fabiangonz98@gmail.com> Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ocs (eraser-dev#923) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Co-authored-by: ashnamehrotra <ashnamehrotra@users.noreply.github.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ser-dev#945) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com> Co-authored-by: ashnamehrotra <ashnamehrotra@gmail.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…-dev#919) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Xander Grzywinski <xandergr@microsoft.com> Co-authored-by: ashnamehrotra <ashnamehrotra@gmail.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
…ser-dev#921) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Xander Grzywinski <xandergr@microsoft.com> Co-authored-by: ashnamehrotra <ashnamehrotra@gmail.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
8d8b525
to
fbc4c27
Compare
SelectorsByObject: cache.SelectorsByObject{ | ||
// to watch eraser pods | ||
&corev1.Pod{}: { | ||
Field: fields.OneTermEqualSelector("metadata.namespace", utils.GetNamespace()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is still making a cluster scoped watch and having API server filter them down. You could confirm that one way or the other by reducing the RBAC of the eraser components to use role bindings to properly scope the access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RBAC is already scoped to the namespace for the manager role, but removing the field to filter by namespace increases the memory again. I think that could be because for the manager we have ClusterRole instead of Role, but we can't change that since we need to watch ImageJobs/ImageLists in default and eraser pods in eraser-system.
It was the same problem testing with MultiNamespacedCacheBuilder and keeping only the eraser-system and default namespaces, it didn't have an affect on memory consumption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should create separate cluster roles/roles/cluster roles bindings/role bindings to express the exact permissions of each component. There is no requirement that everything be done via a single cluster role/cluster role binding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://youtu.be/Nw1ymxcLIDI is a good resource to understand RBAC fundamentals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you this was really helpful! I changed the role to create a separate Role and ClusterRole
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments, RBAC LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cluster role in manifest_staging/charts/eraser/templates/eraser-imagejob-pods-cluster-role-clusterrole.yaml
is empty, should and the associated binding it be deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup we should only need the service account
- list | ||
- patch | ||
- update | ||
- watch | ||
- apiGroups: | ||
- eraser.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the eraser resources, does your controller need access to create
and similar permissions on the resource itself (i.e. not /status
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does for ImageJob, but not for ImageList. Removed create/delete/update/patch permissions for ImageList. Removed update/patch permissions for ImageJob.
metadata: | ||
name: manager-rolebinding | ||
namespace: system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be eraser-system
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by default the roles are in "system" and during deployment it changes to "eraser-system" since the namespace is configurable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubebuilder/kustomize automatically prefixes these
namePrefix: eraser- |
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
@@ -189,9 +189,9 @@ func checkNodeFitness(pod *corev1.Pod, node *corev1.Node) bool { | |||
} | |||
|
|||
//+kubebuilder:rbac:groups=eraser.sh,resources=imagejobs,verbs=get;list;watch;create;update;patch;delete | |||
//+kubebuilder:rbac:groups="",resources=podtemplates,verbs=get;list;watch;create;update;patch;delete | |||
//+kubebuilder:rbac:groups="",namespace="system",resources=podtemplates,verbs=get;list;watch;create;update;patch;delete |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make sure system doesn't break since kubebuilder adds that prefix in generation but not sure if it applies here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this only generates the files under rbac/ which all use system. Then in manifest staging/ it is replaced to eraser-system through the templates.
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
5ef1e9d
to
d841589
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job! LGTM
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: Fabian Gonzalez <fabiangonz98@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com> Co-authored-by: Fabian Gonzalez <fabiangonz98@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Xander Grzywinski <xandergr@microsoft.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: Fabian Gonzalez <fabiangonz98@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com> Co-authored-by: Fabian Gonzalez <fabiangonz98@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Xander Grzywinski <xandergr@microsoft.com>
What this PR does / why we need it:
Utilizes
cache.BuilderWithOptions
in the manager to free up space from watching unnecessary resources and changes ImageJob controller to usenodeNames
to retrieve each node rather than storing all nodes in a list.Which issue(s) this PR fixes (optional, using
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when the PR gets merged):Fixes #798
Special notes for your reviewer:
Eraser load test results for cluster with 500 vuln images, 50 nodes, cpu request unset, and high usage daemonset deployed
Before changes:
MAX CPU Manager: 973.889567 m
MAX MEM Manager: 351.07 Mi
After changes:
MAX CPU Manager: 441.082336 m
MAX MEM Manager: 32.94 Mi