Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[occm] Don't allow internal Services to share an LB #2190

Merged
merged 1 commit into from
Apr 12, 2023

Conversation

dulek
Copy link
Contributor

@dulek dulek commented Apr 7, 2023

What this PR does / why we need it:
Feature of sharing LBs seems to have some logic flaws. One of them is the problem of mixing internal and external Services on a single LB. As a FIP is tied to the LB and not individual listener, it means that if one Service attached the FIP to the Service, all the other Services will be available on that FIP. This may lead to accidental exposure of an internal Service which can potentially be pretty bad.

This commits attempts to limit the number of cases when the user can shoot themselves in the foot and makes it impossible for the internal Services to be secondary on a share load balancer. Moreover a condition is added that prevents secondary Services to create FIPs at all, so that accidental exposure of primary internal service by a secondary external one is solved.

There is no other way to reliably do that prevention as we only save truncated name into the tags of the LB, so there's no way to get list of all Services in a load balancer without listing all of them (in every namespace) and looking through their LB ID annotation. Even with that it's still really complicated to make decisions.

Cases (P - primary, S - secondary, E - external, I - internal):

P S
E E - all good
E I - prevented, all good
I I - prevented, not great, because it's technically a working
      combination, but it's a cost
I E - external one won't create FIP, good

Updates (after #2168 merges):
E E -> E I - we'll get error on the second Service, good
E E -> I E - FIP gets detached, E won't be able to readd it, good.

Which issue this PR fixes(if applicable):
fixes #2174

Special notes for reviewers:
Cases are listed

Release note:

[openstack-cloud-controller-manager] OCCM will no longer allow internal Services to share a load balancer with another Service.

@k8s-ci-robot k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Apr 7, 2023
@k8s-ci-robot k8s-ci-robot requested a review from anguslees April 7, 2023 17:51
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 7, 2023
@k8s-ci-robot k8s-ci-robot requested a review from FengyunPan2 April 7, 2023 17:51
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Apr 7, 2023
Feature of sharing LBs seems to have some logic flaws. One of them is
the problem of mixing internal and external Services on a single LB. As
a FIP is tied to the LB and not individual listener, it means that if
one Service attached the FIP to the Service, all the other Services will
be available on that FIP. This may lead to accidental exposure of an
internal Service which can potentially be pretty bad.

This commits attempts to limit the number of cases when the user can
shoot themselves in the foot and makes it impossible for the internal
Services to be secondary on a share load balancer. Moreover a condition
is added that prevents secondary Services to create FIPs at all, so that
accidental exposure of primary internal service by a secondary external
one is solved.

There is no other way to reliably do that prevention as we only save
truncated name into the tags of the LB, so there's no way to get list of
all Services in a load balancer without listing all of them (in every
namespace) and looking through their LB ID annotation. Even with that
it's still really complicated to make decisions.

Cases (P - primary, S - secondary, E - external, I - internal):

P S
E E - all good
E I - prevented, all good
I I - prevented, not great, because it's technically a working
      combination, but it's a cost
I E - external one won't create FIP, good

Updates (after kubernetes#2168 merges):
E E -> E I - we'll get error on the second Service, good
E E -> I E - FIP gets detached, E won't be able to readd it, good.
@dulek dulek force-pushed the no-mixed-shared branch from e69bb9a to 5a75e04 Compare April 7, 2023 18:19
@@ -928,6 +929,13 @@ func (lbaas *LbaasV2) getServiceAddress(clusterName string, service *corev1.Serv
}
klog.V(4).Infof("Found floating ip %v by loadbalancer port id %q", floatIP, portID)

// we cannot add a FIP to a shared LB when we're a secondary Service or we risk adding it to an internal
// Service and exposing it to the world unintentionally.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding it to an internal ===> curious line 917 we return directly if it's internal.. so will it possible that described scenario can happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the case when we create an internal Service A as an LB owner and then a secondary external Service B on the same LB. B is external, so normally it would attempt to create the FIP for the LB, exposing A externally in the process. The idea is to prevent secondary Services from creating FIPs.

It has to be done this way, because when we're a secondary Service, there is no reliable way to get the primary Service besides listing all the Services and looking for LB ID in the annotations. The LB name and tags are not reliable because they're truncated to 255 characters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, thanks for the detailed info

@jichenjc
Copy link
Contributor

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jichenjc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 11, 2023
@jichenjc
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 12, 2023
@k8s-ci-robot k8s-ci-robot merged commit 788ea08 into kubernetes:master Apr 12, 2023
dulek added a commit to dulek/cloud-provider-openstack that referenced this pull request Apr 13, 2023
I'm clarifying that 2.5 is and Octavia *API* version, adding a note
about FIP being shared too and explaining changes from kubernetes#2190.
k8s-ci-robot pushed a commit that referenced this pull request Apr 17, 2023
I'm clarifying that 2.5 is and Octavia *API* version, adding a note
about FIP being shared too and explaining changes from #2190.
dulek added a commit to dulek/cloud-provider-openstack that referenced this pull request Sep 7, 2023
PR kubernetes#2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.
dulek added a commit to dulek/cloud-provider-openstack that referenced this pull request Sep 8, 2023
PR kubernetes#2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.
dulek added a commit to dulek/cloud-provider-openstack that referenced this pull request Sep 8, 2023
PR kubernetes#2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.
dulek added a commit to dulek/cloud-provider-openstack that referenced this pull request Sep 11, 2023
PR kubernetes#2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.
k8s-ci-robot pushed a commit that referenced this pull request Sep 15, 2023
…ling (#2360)

* Compare proper LB name for shared LBs

With shared LBs we distinguish the elements by tagging them with the
proper name of the LB that would be created for a Service if it wasn't
created as shared. This commit fixes that comparison for listener
deletion as code was always comparing the name of the primary LB.

* Fix shared LBs tests

PR #2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.

* Make sure we don't mask LB tests failures

In `test-lb-service.sh` we do `trap "delete_resources" EXIT` to make
sure we cleanup resources on a test failure. In there, we only fetched
the `$?` after making a check for `${AUTO_CLEAN_UP}`, which itself
alters the code to 0, so function always returns success. This means
tests can never really fail.

This commit fixes it by making sure `$ERROR_CODE` is fetched at the very
beginning of the cleanup function.
dulek added a commit to dulek/cloud-provider-openstack that referenced this pull request Sep 19, 2023
…ling (kubernetes#2360)

* Fix shared LBs tests

PR kubernetes#2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.

* Make sure we don't mask LB tests failures

In `test-lb-service.sh` we do `trap "delete_resources" EXIT` to make
sure we cleanup resources on a test failure. In there, we only fetched
the `$?` after making a check for `${AUTO_CLEAN_UP}`, which itself
alters the code to 0, so function always returns success. This means
tests can never really fail.

This commit fixes it by making sure `$ERROR_CODE` is fetched at the very
beginning of the cleanup function.
dulek added a commit to dulek/cloud-provider-openstack that referenced this pull request Sep 26, 2023
…ling (kubernetes#2360)

* Fix shared LBs tests

PR kubernetes#2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.

* Make sure we don't mask LB tests failures

In `test-lb-service.sh` we do `trap "delete_resources" EXIT` to make
sure we cleanup resources on a test failure. In there, we only fetched
the `$?` after making a check for `${AUTO_CLEAN_UP}`, which itself
alters the code to 0, so function always returns success. This means
tests can never really fail.

This commit fixes it by making sure `$ERROR_CODE` is fetched at the very
beginning of the cleanup function.
k8s-ci-robot pushed a commit that referenced this pull request Oct 13, 2023
…ling (#2360) (#2367)

* Fix shared LBs tests

PR #2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.

* Make sure we don't mask LB tests failures

In `test-lb-service.sh` we do `trap "delete_resources" EXIT` to make
sure we cleanup resources on a test failure. In there, we only fetched
the `$?` after making a check for `${AUTO_CLEAN_UP}`, which itself
alters the code to 0, so function always returns success. This means
tests can never really fail.

This commit fixes it by making sure `$ERROR_CODE` is fetched at the very
beginning of the cleanup function.
k8s-infra-cherrypick-robot pushed a commit to k8s-infra-cherrypick-robot/cloud-provider-openstack that referenced this pull request Oct 13, 2023
…ubernetes#2360)

* Fix shared LBs tests

PR kubernetes#2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.

* Make sure we don't mask LB tests failures

In `test-lb-service.sh` we do `trap "delete_resources" EXIT` to make
sure we cleanup resources on a test failure. In there, we only fetched
the `$?` after making a check for `${AUTO_CLEAN_UP}`, which itself
alters the code to 0, so function always returns success. This means
tests can never really fail.

This commit fixes it by making sure `$ERROR_CODE` is fetched at the very
beginning of the cleanup function.
k8s-ci-robot pushed a commit that referenced this pull request Oct 16, 2023
…2360) (#2430)

* Fix shared LBs tests

PR #2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.

* Make sure we don't mask LB tests failures

In `test-lb-service.sh` we do `trap "delete_resources" EXIT` to make
sure we cleanup resources on a test failure. In there, we only fetched
the `$?` after making a check for `${AUTO_CLEAN_UP}`, which itself
alters the code to 0, so function always returns success. This means
tests can never really fail.

This commit fixes it by making sure `$ERROR_CODE` is fetched at the very
beginning of the cleanup function.

Co-authored-by: Michal Dulko <mdulko@redhat.com>
mandre pushed a commit to shiftstack/cloud-provider-openstack that referenced this pull request Feb 6, 2024
…ling (kubernetes#2360)

* Compare proper LB name for shared LBs

With shared LBs we distinguish the elements by tagging them with the
proper name of the LB that would be created for a Service if it wasn't
created as shared. This commit fixes that comparison for listener
deletion as code was always comparing the name of the primary LB.

* Fix shared LBs tests

PR kubernetes#2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.

* Make sure we don't mask LB tests failures

In `test-lb-service.sh` we do `trap "delete_resources" EXIT` to make
sure we cleanup resources on a test failure. In there, we only fetched
the `$?` after making a check for `${AUTO_CLEAN_UP}`, which itself
alters the code to 0, so function always returns success. This means
tests can never really fail.

This commit fixes it by making sure `$ERROR_CODE` is fetched at the very
beginning of the cleanup function.
dulek added a commit to shiftstack/cloud-provider-openstack that referenced this pull request Feb 7, 2024
…ling (kubernetes#2360)

* Compare proper LB name for shared LBs

With shared LBs we distinguish the elements by tagging them with the
proper name of the LB that would be created for a Service if it wasn't
created as shared. This commit fixes that comparison for listener
deletion as code was always comparing the name of the primary LB.

* Fix shared LBs tests

PR kubernetes#2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.

* Make sure we don't mask LB tests failures

In `test-lb-service.sh` we do `trap "delete_resources" EXIT` to make
sure we cleanup resources on a test failure. In there, we only fetched
the `$?` after making a check for `${AUTO_CLEAN_UP}`, which itself
alters the code to 0, so function always returns success. This means
tests can never really fail.

This commit fixes it by making sure `$ERROR_CODE` is fetched at the very
beginning of the cleanup function.
mandre pushed a commit to shiftstack/cloud-provider-openstack that referenced this pull request Feb 7, 2024
…ling (kubernetes#2360)

* Compare proper LB name for shared LBs

With shared LBs we distinguish the elements by tagging them with the
proper name of the LB that would be created for a Service if it wasn't
created as shared. This commit fixes that comparison for listener
deletion as code was always comparing the name of the primary LB.

* Fix shared LBs tests

PR kubernetes#2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.

* Make sure we don't mask LB tests failures

In `test-lb-service.sh` we do `trap "delete_resources" EXIT` to make
sure we cleanup resources on a test failure. In there, we only fetched
the `$?` after making a check for `${AUTO_CLEAN_UP}`, which itself
alters the code to 0, so function always returns success. This means
tests can never really fail.

This commit fixes it by making sure `$ERROR_CODE` is fetched at the very
beginning of the cleanup function.
dulek added a commit to shiftstack/cloud-provider-openstack that referenced this pull request Feb 7, 2024
…ling (kubernetes#2360)

* Compare proper LB name for shared LBs

With shared LBs we distinguish the elements by tagging them with the
proper name of the LB that would be created for a Service if it wasn't
created as shared. This commit fixes that comparison for listener
deletion as code was always comparing the name of the primary LB.

* Fix shared LBs tests

PR kubernetes#2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.

* Make sure we don't mask LB tests failures

In `test-lb-service.sh` we do `trap "delete_resources" EXIT` to make
sure we cleanup resources on a test failure. In there, we only fetched
the `$?` after making a check for `${AUTO_CLEAN_UP}`, which itself
alters the code to 0, so function always returns success. This means
tests can never really fail.

This commit fixes it by making sure `$ERROR_CODE` is fetched at the very
beginning of the cleanup function.
k8s-ci-robot pushed a commit that referenced this pull request Feb 8, 2024
…ling (#2360) (#2537)

* Compare proper LB name for shared LBs

With shared LBs we distinguish the elements by tagging them with the
proper name of the LB that would be created for a Service if it wasn't
created as shared. This commit fixes that comparison for listener
deletion as code was always comparing the name of the primary LB.

* Fix shared LBs tests

PR #2190 prohibited sharing an LB that is internal for security reasons.
This commit fixes the shared LBs tests to not create internal LBs.

* Make sure we don't mask LB tests failures

In `test-lb-service.sh` we do `trap "delete_resources" EXIT` to make
sure we cleanup resources on a test failure. In there, we only fetched
the `$?` after making a check for `${AUTO_CLEAN_UP}`, which itself
alters the code to 0, so function always returns success. This means
tests can never really fail.

This commit fixes it by making sure `$ERROR_CODE` is fetched at the very
beginning of the cleanup function.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[occm] Internal Service might be exposed when LB is shared
3 participants