
Allow setting Queue Proxy resources in ConfigMap #8195

Merged 1 commit on Jun 9, 2020

Conversation

@julz (Member) commented on Jun 4, 2020

Fixes #8012.

Proposed Changes

  • defaults request.cpu to 25m for backwards compatibility
  • all other requests and limits also maintain current defaults
    (i.e. nil/unspecified/inherit kubernetes)
  • annotation continues to work to override based on user container resources

Release Note

Operators can now set Queue Proxy resource requests/limits in the config-deployment.yaml config map
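
For illustration, a hypothetical sketch of what the new config-deployment entries might look like. Only queueSidecarCPULimit and the 25m/1000m CPU values appear in this PR's snippets; the remaining key names are assumed to mirror the Go config fields, and the memory values are made up:

apiVersion: v1
kind: ConfigMap
metadata:
  name: config-deployment
  namespace: knative-serving
data:
  # Backwards-compatible default: the queue proxy CPU request stays at 25m
  # unless the operator overrides it here.
  queueSidecarCPURequest: "25m"
  # The keys below are assumed names mirroring the Go fields
  # (QueueSidecarCPULimit, QueueSidecarMemoryRequest, ...); values are
  # illustrative only. Anything left out keeps today's behaviour of
  # inheriting the Kubernetes defaults (no request/limit).
  queueSidecarCPULimit: "1000m"
  queueSidecarMemoryRequest: "400Mi"
  queueSidecarMemoryLimit: "800Mi"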

@googlebot added the cla: yes label on Jun 4, 2020
@knative-prow-robot added the size/L and area/API labels on Jun 4, 2020
@knative-prow-robot (Contributor) left a comment:

@julz: 0 warnings.


pkg/deployment/config.go (review comment resolved)
rev: revision("bar", "foo",
    withContainers(containers),
    withContainerConcurrency(1)),
cc: deployment.Config{
A reviewer (Contributor) commented:
Should we rename cc to dc?

@julz (Member, Author) replied on Jun 4, 2020:
yep, good catch

Comment on lines +54 to +56
got.QueueSidecarCPULimit = nil
got.QueueSidecarMemoryRequest, got.QueueSidecarMemoryLimit = nil, nil
got.QueueSidecarEphemeralStorageRequest, got.QueueSidecarEphemeralStorageLimit = nil, nil
A reviewer (Contributor) commented:
Shouldn't we actually set those as defaults anyway?

@julz (Member, Author) replied on Jun 5, 2020:
Maybe eventually, but for this PR the feature proposal specifically says we won't change the current defaults (which are empty, i.e. inherit Kubernetes / no limit); we'll just let operators override them if they want.
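
In other words (an illustrative sketch, not taken from the PR), with none of the new config map keys set, the queue-proxy container keeps today's shape: only the backwards-compatible 25m CPU request is applied, and every other request and limit stays unset:

resources:
  requests:
    cpu: 25m
  # No memory/ephemeral-storage requests and no limits are set here;
  # Kubernetes (or any LimitRange defaults in the namespace) decides the rest.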

Comment on lines 54 to 56
# queueSidecarCPULimit is the limits.cpu to set for the queue proxy sidecar container.
# If omitted, no value is specified and the system default is used.
queueSidecarCPULimit: "25m"
A reviewer (Contributor) commented:
Should this be larger?

@julz (Member, Author) replied:
yeah, seems reasonable that it should be bigger than request, bumped it to 75m

@julz (Member, Author) added:
.. actually, bumped it some more :D. Might as well make it 1000m = 1 full cpu and have it match default container resources. It's only an example anyway but limit is a hard limit so better to have the example be on the high side imo.

)

func defaultConfig() *Config {
    return &Config{
        ProgressDeadline:               ProgressDeadlineDefault,
        RegistriesSkippingTagResolving: sets.NewString("ko.local", "dev.local"),
        QueueSidecarCPURequest:         &QueueSidecarCPURequestDefault,
A reviewer (Contributor) commented:
I think we need to set all of them, since that's what the CM above implies.

@julz (Member, Author) replied on Jun 5, 2020:
The CM example doc explicitly says that all fields except this one (to avoid changing current defaults) default to unset (i.e. no limit). This is the same behaviour (again, other than cpu, to avoid changing the current default) as defaults.yaml has for user container resource defaults: if you don't explicitly set them, we default to no request/limit, but we still have examples of what valid values look like.
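
For reference, a sketch of the distinction being made, assuming the usual Knative ConfigMap _example convention, where entries under _example are documentation only and have no effect until an operator copies them into data:

data:
  _example: |
    # queueSidecarCPULimit is the limits.cpu to set for the queue proxy sidecar container.
    # If omitted, no value is specified and the system default is used.
    queueSidecarCPULimit: "1000m"
  # Nothing outside _example is set, so the actual defaults stay as described
  # above: only the built-in 25m CPU request, everything else unset.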

The reviewer (Contributor) replied:
But I thought for QP we do set them, vs what we do for UserContainer.

@julz (Member, Author) replied:
yah I think eventually it'd make sense to, but tbh I don't know what the correct values to default to would be for all workloads yet so I limited the initial feature proposal to letting operators set, but maintaining existing behaviour (i.e unset/inherit/infinite) if they don't. Hopefully we can experiment a bit and determine good defaults once this lands.

The reviewer (Contributor) replied:
SG

@shashwathi (Contributor) left a comment:
/lgtm
question: is there need for validation of resource limit/requests set in cm?

@knative-prow-robot added the lgtm label on Jun 4, 2020
@julz (Member, Author) commented on Jun 5, 2020:

@savitaashture we don’t currently validate similar fields for setting default user container resources so I think it’s ok if we don’t validate these here.

@knative-prow-robot removed the lgtm label on Jun 5, 2020
@vagababov (Contributor) left a comment:
/lgtm
/assign @dprotaso @mattmoor

@knative-prow-robot added the lgtm and needs-rebase labels on Jun 5, 2020
- defaults request.cpu to 25m for backwards compatibility
- all other requests and limits maintain current default (i.e.
  nil/unspecified/inherit kubernetes default)
- annotation continues to work to override
@knative-prow-robot removed the lgtm and needs-rebase labels on Jun 8, 2020
@julz (Member, Author) commented on Jun 8, 2020:

rebased

@dprotaso (Member) commented on Jun 9, 2020:

/lgtm
/approve

@knative-prow-robot added the lgtm label on Jun 9, 2020
@knative-prow-robot (Contributor) commented:
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso, julz


@knative-prow-robot added the approved label on Jun 9, 2020
@knative-test-reporter-robot commented:

The following jobs failed:

Test name: pull-knative-serving-unit-tests (Retries: 0/3)

Failed non-flaky tests preventing automatic retry of pull-knative-serving-unit-tests:

pkg/reconciler/revision.TestNewRevisionCallsSyncHandler

@julz (Member, Author) commented on Jun 9, 2020:

looks like a flake: #7852

/retest

Labels
approved - Indicates a PR has been approved by an approver from all required OWNERS files.
area/API - API objects and controllers
cla: yes - Indicates the PR's author has signed the CLA.
lgtm - Indicates that a PR is ready to be merged.
size/L - Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Feature Proposal: Allow operators to set defaults for Queue Proxy resources
9 participants