Can we set a default CPU/Memory Limit/Request for queue_proxy container #5829

Closed
yuzliu opened this issue Oct 18, 2019 · 20 comments

Labels
kind/question Further information is requested

Comments

yuzliu commented Oct 18, 2019

In what area(s)?

/area API

Ask your question here:

Hi, is there a way for Knative Serving to set a default value for QueueSideCarResourcePercentageAnnotation?

QueueSideCarResourcePercentageAnnotation = "queue.sidecar." + GroupName + "/resourcePercentage"

We cannot apply a ResourceQuota to our k8s cluster because the queue_proxy container does not have CPU/memory limits configured. We have multiple components that depend on Knative Serving, and I think it makes sense for Knative Serving to set a default value that users can override if they want. Thoughts?
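
For context on why this bites: once a namespace has a ResourceQuota covering compute resources, Kubernetes rejects any pod whose containers omit the corresponding requests/limits, queue-proxy included. A minimal sketch of such a quota (the name, namespace, and values are illustrative, not from this issue):

# Hypothetical quota for illustration; with this in place, every container in
# the namespace (including queue-proxy) must declare CPU and memory
# requests/limits or the pod is rejected at admission time.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: my-knative-namespace
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi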

@yuzliu yuzliu added the kind/question Further information is requested label Oct 18, 2019
@SugandhaAgrawal

There exists #4151

@eallred-google eallred-google added this to the Needs Triage milestone Oct 23, 2019
dgerd commented Oct 23, 2019

You should just be able to apply the following annotation to the metadata of your Revision Template within your Knative Service.

For example, in your Knative Service spec:

spec:
  template:
    metadata:
      annotations:
        queue.sidecar.serving.knative.dev/resourcePercentage: "10"
    spec:
      containers:
      - image: gcr.io/google-samples/microservices-demo/shippingservice:v0.1.2

yuzliu commented Oct 24, 2019

Thanks for the reply! I know I can use this annotation; I was just wondering if Knative could set a reasonable default value for users.

@mattmoor
Member

What's unreasonable about the current value?

@TheEvilCoder42

@mattmoor With 50 idle Pods running concurrently, that's already 25m x 50 = 1250m of wasted CPU requests.

Sadly, annotations like queue.sidecar.serving.knative.dev/requestCpu: "10m" don't seem to work (and as far as I can tell the minimum boundary is 25m, but setting the annotation higher doesn't work either).

@richard2006

I agree with @TheEvilCoder42. Can we support setting the queue-proxy CPU request smaller, or even to zero?

vagababov commented Jan 19, 2020

@TheEvilCoder42 50 idle running pods defeats the purpose of Knative, i.e. scaling.
The annotation we have does not specify an absolute value but rather a percentage, so you need "10" rather than "10m" (and yes, the annotation key is resourcePercentage, not requestCpu).
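
To make the distinction concrete, here is a hedged sketch (the image and resource values are illustrative): with resourcePercentage set to "10" and a user container requesting 500m CPU and 512Mi memory, the queue-proxy requests are derived as roughly 10% of those values, subject to Knative's internal clamping (such as the 25m CPU floor mentioned above).

spec:
  template:
    metadata:
      annotations:
        # A percentage of the user container's resources, not an absolute quantity.
        queue.sidecar.serving.knative.dev/resourcePercentage: "10"
    spec:
      containers:
      - image: gcr.io/google-samples/microservices-demo/shippingservice:v0.1.2
        resources:
          requests:
            cpu: 500m     # queue-proxy CPU request ~ 50m (10%), never below the clamp floor
            memory: 512Mi # queue-proxy memory request ~ 51Mi (10%), also subject to clamping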

@knative-housekeeping-robot

Issues go stale after 90 days of inactivity.
Mark the issue as fresh by adding the comment /remove-lifecycle stale.
Stale issues rot after an additional 30 days of inactivity and eventually close.
If this issue is safe to close now please do so by adding the comment /close.

Send feedback to Knative Productivity Slack channel or file an issue in knative/test-infra.

/lifecycle stale

@knative-prow-robot knative-prow-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 19, 2020
julz commented May 14, 2020

Bumping this up: it seems we're relying on users to set QueueSideCarResourcePercentageAnnotation; if it's not set, we set a default CPU request but no memory request/limit and no CPU limit (https://github.com/knative/serving/blob/master/pkg/reconciler/revision/resources/queue.go#L83).

This means the scheduler won't account for all the queue proxies when scheduling pods, which - unless I'm missing something - could lead to OOM errors, and it will break on any namespace with a ResourceQuota set up unless the namespace sets a default limit (which would work, but might well be much higher than the queue-proxy needs) or every user sets this annotation (which means, for a start, none of the samples will work out of the box in such an environment).

Does anyone have an objection to setting a default mem/cpu request/limit in defaults.yaml for this? If not I'll PR one.

@markusthoemmes
Contributor

FWIW, we're hitting issues here too when using LimitRanges. If we're adding a setting here, let's please make sure one can completely unset the default too.

In fact: Should the default already be... nothing? Feels weird that we're setting a default request on the queue-proxy but not on the user container 🤔

julz commented May 14, 2020

I think the problem is that there's currently no default limit, but ResourceQuota/LimitRange require the queue-proxy to have requests/limits set. So there's no way for any of the samples to work out of the box in a namespace with a ResourceQuota/LimitRange unless the user adds this annotation.
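
For reference, the namespace-level workaround mentioned above is a LimitRange with container defaults: any container that omits requests/limits (queue-proxy included) inherits them at admission time. A minimal sketch with purely illustrative names and values:

# Hypothetical LimitRange; containers without explicit requests/limits
# (e.g. queue-proxy) pick up these defaults, which may be far more than
# the queue-proxy actually needs.
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
  namespace: my-knative-namespace
spec:
  limits:
  - type: Container
    default:           # limits applied when a container specifies none
      cpu: 500m
      memory: 512Mi
    defaultRequest:    # requests applied when a container specifies none
      cpu: 250m
      memory: 256Mi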

@markusthoemmes
Contributor

Should this maybe be surfaced to the API WG? Work out a small proposal for what better integration with LimitRange etc. would look like?

julz commented May 14, 2020

Yeah, that might make sense. I can experiment a bit with the current behaviour and then bring it up at the next API WG meeting.

@mattmoor
Member

/remove-lifecycle stale

Yeah, I don't like what we have now. I think the current value is another thing that dates back to the early VPA work; the annotation (and clamping) is newer, but this whole area could use some focused investigation and effort.

I think @vagababov also suspected this as a source of 5xx under higher load (e.g. ~60k QPS)

I'd love to see someone deeply investigate the resource usage of the queue proxy and come up with a proposal for how we manage its resources. Any takers?

cc @tcnghia too

@knative-prow-robot knative-prow-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 14, 2020
julz commented May 20, 2020

OK, to get us started on this I spun up a basic proposal for operator-set default requests/limits and added it to the API WG meeting agenda.

yuzisun commented May 25, 2020

FYI, we are hitting a critical tail-latency issue even for a very simple use case with low QPS: #8057

@github-actions

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 23, 2020
julz commented Aug 27, 2020

Just for anyone searching here, operator-settable Queue Proxy requests/limits landed in #8195.
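
For anyone landing here later, those operator-level knobs are exposed as ConfigMap settings. The sketch below is an assumption based on the key names documented for current Knative Serving releases (in the config-deployment ConfigMap); they may differ from what #8195 originally shipped, so verify against your installed version:

# Illustrative sketch; the queue-sidecar-* keys are taken from current
# Knative Serving documentation and are an assumption here -- check the
# config-deployment ConfigMap shipped with your release.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-deployment
  namespace: knative-serving
data:
  queue-sidecar-cpu-request: "25m"
  queue-sidecar-cpu-limit: "1000m"
  queue-sidecar-memory-request: "50Mi"
  queue-sidecar-memory-limit: "200Mi"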

@github-actions github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 23, 2021
@evankanderson
Member

It looks like this is actually done?

/close

@knative-prow-robot
Contributor

@evankanderson: Closing this issue.

In response to this:

It looks like this is actually done?

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
