Resource limits not affecting scan-vulnerabilityreport pods #572
Comments
That is correct, @jeremeb. In v0.10.2 resource requests and limits for scan jobs are hardcoded.
The Trivy plugin sets resource requests and resource limits from the hardcoded struct literal (defaultResourceRequirements). However, it's desirable to configure these values, e.g. increase memory limits, without recompiling Starboard.
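For context, a minimal sketch of what such a hardcoded struct literal looks like in Go. The values mirror the 100m/100M request and 500m/500M limit mentioned later in this thread; the actual literal in the Trivy plugin may differ.

```go
package trivy

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// Illustrative only: a hardcoded default of the kind discussed in this issue.
// The real defaultResourceRequirements in the plugin may use other values.
var defaultResourceRequirements = corev1.ResourceRequirements{
	Requests: corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("100m"),
		corev1.ResourceMemory: resource.MustParse("100M"),
	},
	Limits: corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("500m"),
		corev1.ResourceMemory: resource.MustParse("500M"),
	},
}
```

Because the literal is compiled into the binary, any change to these values currently requires rebuilding Starboard, which is the limitation this issue is about.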
Was there any decision about this, @danielpacak? I think there is a third option. The Spark k8s operator accepts pod templates: a custom YAML that is partially adjusted by the operator itself. It has the advantage of being fully customizable: not only the CPU and memory resource limits, but any part of the pod can be adjusted (I used it to mount a temporary volume).

Adding just memory and CPU limits might be enough for now, but more pod-specific customization may require more and more configuration parameters. Using a podTemplate would require only one more parameter (e.g. keyTrivyPodTemplate), and if it's set it can be used to initially populate the pod spec instead of creating a new one.

On the other hand, it's slightly harder to use this approach from the Helm chart: you would need to add one more ConfigMap (for the pod template) in case any specific configuration is added, which is slightly more complex than adding only one or two environment variables / keys to the existing ConfigMaps... What do you think?
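A rough sketch of how the podTemplate idea could look on the operator side, assuming the scan pod spec is stored as YAML in the plugin ConfigMap under a hypothetical keyTrivyPodTemplate key. All names here are illustrative, not an existing Starboard API.

```go
package trivy

import (
	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/yaml"
)

// keyTrivyPodTemplate is a hypothetical config key for a user-supplied pod template.
const keyTrivyPodTemplate = "trivy.podTemplate"

// basePodSpec returns the user-supplied pod spec if the key is set in the
// plugin ConfigMap data; otherwise it returns an empty spec so the caller
// keeps building the scan pod from scratch, as it does today.
func basePodSpec(configData map[string]string) (corev1.PodSpec, error) {
	raw, ok := configData[keyTrivyPodTemplate]
	if !ok || raw == "" {
		return corev1.PodSpec{}, nil
	}
	var spec corev1.PodSpec
	if err := yaml.Unmarshal([]byte(raw), &spec); err != nil {
		return corev1.PodSpec{}, err
	}
	return spec, nil
}
```

This keeps the Helm-side cost at one extra ConfigMap entry, which is the trade-off described above.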
We haven't decided yet. I think that configuring the whole pod template is an interesting idea, especially if one day we'd like to merge ... Anyway, in the scope of this issue I'd just focus on a new config key for resource limits without changing the whole configuration model. We can always do that as a follow-up. However, the original question still holds: does it make sense to configure resource limits for all scanners (Trivy, Polaris, KubeBench, etc.), or should each scanner have its own resource requirements and limits? In other words, do we make resource requirements part of the generic Starboard config or a plugin config? WDYT?
Having the configuration at the generic Starboard level is easier to understand, but IMHO it's more realistic to have limits on a per-service basis. How many limits/requests should be configured really depends on the exact scanners, and I am not sure it's realistic to use one generic limit for all of them. But it would be great to have more input from @jeremeb, the original reporter: @jeremeb, can you please share the motivation for changing the resource limits/requests for the Trivy pods? If the default settings (request: 100m/100M, limit: 500m/500M) are not enough for huge containers, we could consider slightly increasing the defaults instead of making them customizable. I think the defaults can be good enough for most use cases.

(BTW, I tried out an approach to add it to the Trivy config in plain key-value format: elek@53bd652. I didn't find integration tests for helm/charts and as far as I can see the ...)
I agree that configuring resources per service is probably more reasonable and flexible. For example, Trivy will have different resource requirements than Polaris or KubeHunter, and it would be nice to configure them separately with sensible defaults. (The defaults for Trivy depend on container image size and the Trivy release. Recent Trivy releases allow scanning Java and Go apps, which requires a bit more memory than previous versions.) That said, I like the approach that you took in 53bd652. I'd maybe use more explicit / self-documenting keys:
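For illustration, explicit per-resource keys along these lines would fit that description; the exact names are an assumption, not necessarily the keys that were finally adopted. As ConfigMap data they might look like:

```
trivy.resources.requests.cpu: 100m
trivy.resources.requests.memory: 100M
trivy.resources.limits.cpu: 500m
trivy.resources.limits.memory: 500M
```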
Regarding the implementation, I'm wondering if we can get rid of the defaults hardcoded in Go and read them from config. Also, maybe we should have a way to skip setting any resource requests & limits if someone did not configure them.
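A minimal sketch of how that could look, assuming per-resource keys like the ones above and treating absent keys as "leave the requirement unset". The function name and key names are illustrative, not Starboard's actual API.

```go
package trivy

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// resourcesFromConfig builds ResourceRequirements from the plugin ConfigMap
// data. Keys that are missing or empty are skipped, so a completely
// unconfigured plugin produces empty requirements and the scan pod runs
// without any requests or limits.
func resourcesFromConfig(data map[string]string) (corev1.ResourceRequirements, error) {
	var req corev1.ResourceRequirements
	set := func(list *corev1.ResourceList, name corev1.ResourceName, key string) error {
		value, ok := data[key]
		if !ok || value == "" {
			return nil // skip unset keys instead of applying a hardcoded default
		}
		quantity, err := resource.ParseQuantity(value)
		if err != nil {
			return err
		}
		if *list == nil {
			*list = corev1.ResourceList{}
		}
		(*list)[name] = quantity
		return nil
	}
	for _, entry := range []struct {
		list *corev1.ResourceList
		name corev1.ResourceName
		key  string
	}{
		{&req.Requests, corev1.ResourceCPU, "trivy.resources.requests.cpu"},
		{&req.Requests, corev1.ResourceMemory, "trivy.resources.requests.memory"},
		{&req.Limits, corev1.ResourceCPU, "trivy.resources.limits.cpu"},
		{&req.Limits, corev1.ResourceMemory, "trivy.resources.limits.memory"},
	} {
		if err := set(entry.list, entry.name, entry.key); err != nil {
			return req, err
		}
	}
	return req, nil
}
```

With this shape, the defaults can live in the Helm chart / static YAML that populates the ConfigMap rather than in Go code.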
Thanks for the suggestion. I updated the branch to use the suggested keys.
This approach makes the code simpler, but I am not sure what the current compatibility guarantees are. With moving the defaults to helm/k8s resources: ...
But I moved the defaults to the Helm chart / k8s resource files and opened the PR. We can continue the discussion about the code there: #639
What steps did you take and what happened:
After deploying the starboard-operator-0.5.1 chart, I found that the scan-vulnerabilityreport pods do not have the CPU/memory resource limits set from the values file. The operator, however, did have the correct resources set.
What did you expect to happen:
I expected the Trivy scan-vulnerabilityreport pods run by the starboard-operator to use the CPU/memory resource limits that were set in the values file.
Environment:
GKE 1.17
Starboard-Operator Helm Chart 0.5.1