Add more details about host port ownership #618

Closed
danwinship wants to merge 1 commit

danwinship
Contributor

This clarifies our usage of host network ports on nodes.

As currently written, this imposes the restriction on OCP that after 4.8, we will not claim any new ports (on workers) outside the 9000-9999 and 29000-29999 ranges unless we also provide a configuration option to move the new service to a different port to avoid conflicts with customer pods. This seems to be the only plausible way to avoid port conflicts with customer pods on upgrade. (Well, the other possibility is that we say customers are restricted to a specific range rather than that we are restricted to a certain range. But given that using ports outside the reserved range is complicated for us anyway since it means opening up more firewall ports, it seems to make more sense to just say that we'll only use the reserved ranges.)

(If we (network team) agree on this then we will need to get buy-in from other teams too before moving forward.)

Assuming we agree on the "rules" set forth here, we should add e2e tests to enforce OCP compliance, and Prometheus alerts to enforce customer compliance.
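As a rough illustration of the rule described above, here is a minimal sketch of the kind of check an e2e test could run. The two ranges come from the PR description; the function names, the `(port, configurable)` tuple shape, and the service names are hypothetical and are not part of any existing OCP test suite.

```python
from typing import Dict, List, Tuple

# Reserved host-port ranges quoted in the PR description.
RESERVED_RANGES = [(9000, 9999), (29000, 29999)]


def in_reserved_range(port: int) -> bool:
    """Return True if the port falls inside one of the reserved ranges."""
    return any(low <= port <= high for low, high in RESERVED_RANGES)


def check_new_ports(new_ports: Dict[str, Tuple[int, bool]]) -> List[str]:
    """Return the names of services that would break the proposed rule.

    new_ports maps a (hypothetical) service name to (port, configurable),
    where configurable means an option exists to move the service to a
    different port. A new port is acceptable if it is inside a reserved
    range, or if it is outside but configurable.
    """
    return sorted(
        name
        for name, (port, configurable) in new_ports.items()
        if not in_reserved_range(port) and not configurable
    )


# Hypothetical services introduced in a new release.
violations = check_new_ports({
    "svc-a": (9537, False),   # inside 9000-9999: fine
    "svc-b": (10300, True),   # outside, but movable: fine
    "svc-c": (10301, False),  # outside and fixed: violates the rule
})
print(violations)
```

A real test would have to discover the listening ports from the cluster rather than from a hand-written table; this only shows the decision logic.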

@dcbw @russellb @knobunc @trozet @abhat

@dhellmann
Contributor

@stbenjam @hardys @zaneb @sadasu FYI. IIRC, Ironic is running outside of these ranges today.

@squeed
Contributor

squeed commented Feb 1, 2021

Can we really claim that masters are entirely off-limits? It's plausible that a customer would like to deploy a cluster-wide auditing service, say.

@stbenjam
Member

stbenjam commented Feb 9, 2021

> @stbenjam @hardys @zaneb @sadasu FYI. IIRC, Ironic is running outside of these ranges today.

The PR description says "after 4.8, we will not claim any new ports (on workers) outside the 9000-9999 and 29000-29999." We've already claimed the ports in the port registry. We also only run on the control plane.

I wouldn't be opposed to moving things, though that is probably out of scope for this enhancement. It might simplify the firewall rules, since the installer does need to talk to Ironic on the bootstrap, for example.

on a node where there would be a port conflict.
Other than the reserved ranges and the other ports listed below, host
ports on worker nodes are available for use by customers. (But all
host ports on masters are reserved for OCP use.)

Do we need to address what happens on single-node or compact clusters where control plane and workers are the same host?

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please ask for approval from danwinship after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@danwinship
Contributor Author

> Can we really claim that masters are entirely off-limits? It's plausible that a customer would like to deploy a cluster-wide auditing service, say.

But would they necessarily need to claim a host port to do that?

> Do we need to address what happens on single-node or compact clusters where control plane and workers are the same host?

ok, so I clarified that:

> Additionally, customers may not claim any host ports on the OCP
> masters. (Doing so may result in a cluster that cannot be upgraded,
> due to port conflicts.)

So in other words, we will not actually go out of our way to prevent people from doing this; it's just that if they do, and they pick a port that we also want to use in the next release, then trying to upgrade will fail.

We could make masters work the same way as workers, but that puts more limits on us in the future. (Of the current reserved host ports, more than half are masters-only, and presumably this will continue in the future.)

Also note that saying "customers cannot safely use host ports on masters" is an improvement from the current situation, which is that customers cannot safely use host ports on any node.
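To make the upgrade failure mode concrete, here is a small sketch that intersects the host ports customer pods already hold with the ports a future release would want, per node role. The function name, the role keys, and all port numbers are hypothetical examples; nothing here reflects an actual OCP port assignment or an existing tool.

```python
from typing import Dict, Set


def upgrade_conflicts(
    customer_ports: Dict[str, Set[int]],
    next_release_ports: Dict[str, Set[int]],
) -> Dict[str, Set[int]]:
    """Map each node role to the host ports claimed by both sides."""
    conflicts: Dict[str, Set[int]] = {}
    for role, wanted in next_release_ports.items():
        clash = customer_ports.get(role, set()) & wanted
        if clash:
            conflicts[role] = clash
    return conflicts


# Hypothetical data: a customer pod holds port 12000 on masters, and the
# next release wants that same port for a new masters-only service.
customer = {"master": {8443, 12000}, "worker": {8080}}
next_release = {"master": {12000}, "worker": {9100}}

result = upgrade_conflicts(customer, next_release)
print(result)
```

An empty result means the upgrade would not collide with any customer host port under these assumptions; a non-empty one names the roles and ports that would block it.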

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-ci bot added the lifecycle/stale label Jul 9, 2021
@openshift-ci
Contributor

openshift-ci bot commented Jul 28, 2021

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign juliakreger after the PR has been reviewed.
You can assign the PR to them by writing /assign @juliakreger in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dhellmann
Contributor

@danwinship this looks like something we still want. Who should be included in the review?

/remove-lifecycle stale

openshift-ci bot removed the lifecycle/stale label Aug 18, 2021
@aravindhp
Contributor

aravindhp commented Aug 23, 2021

@danwinship @dcbw for port 4789, I see a note mentioning OpenShift SDN only. Doesn't it also affect OVN, given that we had to come up with a custom hybrid VXLAN port option to work around the issue "Pod-to-pod connectivity between hosts is broken on my Kubernetes cluster running on vSphere"?

@danwinship
Contributor Author

Ah, yes, we do sometimes use VXLAN with OVN-Kubernetes, but it's not vSphere-related, it's Windows-containers-related.

@openshift-bot

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 30d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

openshift-ci bot added the lifecycle/stale and needs-rebase labels Sep 22, 2021
openshift-ci bot removed the needs-rebase label Sep 22, 2021
@openshift-bot

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-ci bot removed the lifecycle/stale label Sep 29, 2021
openshift-ci bot added the lifecycle/rotten label Sep 29, 2021
@danwinship
Contributor Author

/remove-lifecycle rotten

openshift-ci bot removed the lifecycle/rotten label Oct 4, 2021
openshift-ci bot added the lifecycle/stale label Nov 1, 2021
openshift-ci bot added the lifecycle/rotten label and removed the lifecycle/stale label Nov 8, 2021
@openshift-bot

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci bot closed this Nov 15, 2021

Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

7 participants