Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for topology spread constraint #12715

Merged
merged 10 commits into from
Mar 18, 2022

Conversation

stevenchen-db
Copy link
Contributor

@stevenchen-db stevenchen-db commented Mar 9, 2022

Fixes #12639.
Knative serving currently does not allow specifying topologySpreadConstraints in the pod spec as noted by this issue

I tested this by locally building knative and applying it to a k8s cluster. Topology spread constraints were able to work after enabling them through the config-feature config map

Proposed Changes

Release Note


@google-cla
Copy link

google-cla bot commented Mar 9, 2022

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

For more information, open the CLA check for this pull request.

@knative-prow-robot knative-prow-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/API API objects and controllers area/networking labels Mar 9, 2022
@knative-prow-robot
Copy link
Contributor

Welcome @stevenchen-db! It looks like this is your first PR to knative/serving 🎉

@knative-prow-robot knative-prow-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 9, 2022
@knative-prow-robot
Copy link
Contributor

Hi @stevenchen-db. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dprotaso
Copy link
Member

/ok-to-test

@knative-prow-robot knative-prow-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 10, 2022
Copy link
Member

@dprotaso dprotaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll want to sign the CLA - see the link above.

Also can you fill out the release note block in your PR body to say we've added this feature behind a flag

TagHeaderBasedRouting: apiConfig.Disabled,
MultiContainer: apiConfig.Disabled,
PodSpecAffinity: apiConfig.Disabled,
PodSpecTopologySpreadConstraints: apiConfig.Disabled,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change necessary? Curious what failed when you didn't have this?

Copy link
Contributor Author

@stevenchen-db stevenchen-db Mar 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, I was mostly going off of this as an example PR 20be262. Is it safe to not include PodSpecTopologySpreadConstraints?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these changes are necessary then - can you drop them

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Also deleted the line you mentioned below

TagHeaderBasedRouting: cfgmap.Disabled,
MultiContainer: cfgmap.Disabled,
PodSpecAffinity: cfgmap.Disabled,
PodSpecTopologySpreadConstraints: cfgmap.Disabled,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same answer as above

"tag-header-based-routing": "Enabled",
MultiContainer: Enabled,
PodSpecAffinity: Enabled,
PodSpecTopologySpreadConstraints Enabled,
Copy link
Member

@dprotaso dprotaso Mar 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compile error - hence all the warnings/test failures

Suggested change
PodSpecTopologySpreadConstraints Enabled,
PodSpecTopologySpreadConstraints: Enabled,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

PodSpecTopologySpreadConstraints: Enabled,
}),
data: map[string]string{
"kubernetes.podspec-fieldtopologyspreadconstraintsref": "Enabled",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"kubernetes.podspec-fieldtopologyspreadconstraintsref": "Enabled",
"kubernetes.podspec-topologyspreadconstraints": "Enabled",

Copy link
Contributor

@psschwei psschwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this PR is changing the PodSpec, I believe you'll need to update the schemas as well

@stevenchen-db
Copy link
Contributor Author

Since this PR is changing the PodSpec, I believe you'll need to update the schemas as well

@psschwei I didn't see other things like node affinity or container runtime in hack/schemapatch-config.yaml. Is there any reason why topologySpreadConstraints need to be in this file?

@knative-prow-robot knative-prow-robot added the area/test-and-release It flags unit/e2e/conformance/perf test issues for product features label Mar 14, 2022
@psschwei
Copy link
Contributor

Is there any reason why topologySpreadConstraints need to be in this file?

Since you're behind a feature flag, it doesn't. And since preserveUnknownFields is already set to true, doesn't look like there's anything needed on that front for this one (sorry about that, it's been a while since I looked at the schema job...)

@codecov
Copy link

codecov bot commented Mar 15, 2022

Codecov Report

Merging #12715 (04cebd0) into main (0753bb1) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main   #12715   +/-   ##
=======================================
  Coverage   87.31%   87.31%           
=======================================
  Files         196      196           
  Lines        9750     9754    +4     
=======================================
+ Hits         8513     8517    +4     
+ Misses        952      951    -1     
- Partials      285      286    +1     
Impacted Files Coverage Δ
pkg/apis/config/features.go 96.00% <100.00%> (+0.16%) ⬆️
pkg/apis/serving/fieldmask.go 95.17% <100.00%> (+0.03%) ⬆️
pkg/reconciler/revision/background.go 90.00% <0.00%> (-1.82%) ⬇️
pkg/autoscaler/statserver/server.go 79.61% <0.00%> (+1.94%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0753bb1...04cebd0. Read the comment docs.

@rhuss
Copy link
Contributor

rhuss commented Mar 16, 2022

More a general question when changing the set of supported fields PodSpec that can be used in a RevisionTemplateSpec: What is the process to update the spec in https://github.com/knative/specs/blob/main/specs/serving/knative-api-specification-1.0.md#revisiontemplatespec ?

Is there a place where such changes could be collected for a future update of the spec (like with an OPTIONAL keyword), or can optional fields be ignored in the spec for now and it is a second step to include them in the spec? (but might be good to collect them in a separate list for reference and future updates of the spec)

I suggest updating https://github.com/knative/serving/blob/main/DEVELOPMENT.md?plain=1#L230-L236 in case we need to track those changed fields separately.

@stevenchen-db
Copy link
Contributor Author

@dprotaso I just tested this manually and topology spread constraints worked! Can someone trigger the e2e and formatting tests again?

@dprotaso
Copy link
Member

What is the process to update the spec in https://github.com/knative/specs/blob/main/specs/serving/knative-api-specification-1.0.md#revisiontemplatespec ?

This currently goes through the Knative Trademark Committee.

Is there a place where such changes could be collected for a future update of the spec (like with an OPTIONAL keyword), or can optional fields be ignored in the spec for now and it is a second step to include them in the spec? (but might be good to collect them in a separate list for reference and future updates of the spec)

Changes to the Knative spec apply to all distributions. Fields that are Kubernetes specific will probably never be accepted since K8s is really an implementation detail. And that's ok the OSS implementation can be a super set of the spec.

But since the OSS distribution still values a clear separation of developer/operator concerns some fields may always stay behind a feature flag (like this one).

I suggest updating https://github.com/knative/serving/blob/main/DEVELOPMENT.md?plain=1#L230-L236 in case we need to track those changed fields separately.

This field is covered by the preserveUnknownFields at the PodSpec level

@rhuss
Copy link
Contributor

rhuss commented Mar 17, 2022

And that's ok the OSS implementation can be a super set of the spec.

I then probably don't fully understand the motivation for a spec then. I thought that one important aspect is portability between Knative installations.

So, if someone uses a field that is only supported on the Kubernetes-based reference implementation, then it should be still possible to reuse the same Knative resource definition on a different implementation, which is supposed to ignore this field. So I would expect that all fields in the reference implementation are still mentioned in the spec, but with an OPTIONAL categorization and specifying what should happen when the implementation does not support this field (usually just ignoring it).

In the 1.0 spec this might be covered via:

Additional spec and status fields MAY be provided by particular implementations, however it is expected that most extension will be accomplished via the metadata.labels and metadata.annotations fields, as Knative implementations MAY validate supplied resources against these fields and refuse resources which specify unknown fields.

but a user should really know which fields can safely be used in order to keep its resources portable, so at least a list of all fields that are supported in the Kubernetes-based reference implementation but not in the spec would be very helpful for users being able to keep the Knative resources portable.

@dprotaso
Copy link
Member

dprotaso commented Mar 17, 2022

The feature flag in this PR is off by default so I don't have any concerns about users being confused

@dprotaso
Copy link
Member

/lgtm
/approve

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 18, 2022
@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso, stevenchen-db

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 18, 2022
@knative-prow-robot knative-prow-robot merged commit b8b7480 into knative:main Mar 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for Pod Topology Spread Constraints
5 participants