Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

peering: expose servers over K8s service #1371

Closed
wants to merge 24 commits into from

Conversation

ndhanushkodi
Copy link
Contributor

@ndhanushkodi ndhanushkodi commented Jul 26, 2022

Changes proposed in this PR:

  • Add new expose-servers service to Helm chart
  • Support scraping addresses from expose-servers service when type LB or Nodeport and using those to generate the token
  • Bump consul/api module to use generate token endpoint changes from Update generate token endpoint to take external addresses consul#13844
  • Update acceptance tests to deploy 3 servers, and on clouds it'll use a load balancer service, on kind itll use a nodeport service.

How I've tested this PR:

  • ran acceptance tests on Kind, GKE, EKS
  • also tested Nodeport manually on GKE with 3 nodes because we can only run with 1 server on Kind.

Note that the AKS tests are currently failing, but they have never been enabled for peering before so I do need to keep looking into them but this PR can be reviewed as is. The AKS failures were to do with port-forwarding to a local helm cluster, reaching the kube api, and leftover resources, nothing to indicate true peering failures, and both Nodeport and Loadbalancer scenarios have been tested on GKE and EKS.

https://app.circleci.com/pipelines/github/hashicorp/consul-k8s/6782/workflows/4460cc43-36f0-49b6-8ee3-966f9e3a2804 passed GKE, EKS, but somehow 2 commits later they are all are failing the same way now...I'm pretty sure its something minor that I need to catch.

How I expect reviewers to test this PR:
You could set up two clusters, deploy with peering enabled, and servers enabled which will cause the expose service to get deployed. Then you could apply a dialer and acceptor, and inspect the token, and see that it has the LB or nodeport addresses. You can also test a service dialing another service.

Checklist:

  • Tests added
  • CHANGELOG entry added

    HashiCorp engineers only, community PRs should not add a changelog entry.
    Entries should use present tense (e.g. Add support for...)

@ndhanushkodi ndhanushkodi marked this pull request as ready for review July 27, 2022 04:46
spec:
type: "{{ .Values.server.exposeService.type }}"
ports:
{{- if (or (not .Values.global.tls.enabled) (not .Values.global.tls.httpsOnly)) }}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to include port 8500? On EKS, load balancers will health check to the first port in the service. So previously, when only 8501 was here, and TLS was disabled, the load balancer would kick off all the servers as unhealthy endpoints. So unless we are always going to require TLS, we need to keep port 8500.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to add TLS helm tests based on a decision on this comment.

@@ -96,7 +96,7 @@ func CheckStaticServerConnectionMultipleFailureMessages(t *testing.T, options *k
expectedOutput = expectedSuccessOutput
}

retrier := &retry.Timer{Timeout: 80 * time.Second, Wait: 2 * time.Second}
retrier := &retry.Timer{Timeout: 160 * time.Second, Wait: 2 * time.Second}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to try this connection for a bit longer, because the peering connection needs to get set up and the exported service needs to make it over to the importing side, and the dialer is likely retrying a few times to reach the leader, so it makes sense that this takes a bit longer than it usually does.

@ndhanushkodi ndhanushkodi requested a review from ishustava July 27, 2022 05:43
…kind, and run all acceptance tests (just for peering)
@ndhanushkodi ndhanushkodi force-pushed the peering-external-servers branch from 5fe7a6e to c345dac Compare July 27, 2022 16:14
Copy link
Contributor

@thisisnotashwin thisisnotashwin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some thoughts!! overall this looks excellent!!

}

releaseName := helpers.RandomName()

helpers.MergeMaps(staticServerPeerHelmValues, commonHelmValues)
helpers.MergeMaps(commonHelmValues, staticServerPeerHelmValues)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we swap the order here so that we deploy static server cluster and the static client cluster with staticServerPeerValues and staticClientPeerValues? This makes it confusing because both use common values but we should not update the common values between the 2 deploys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the replicas 3 for servers, and made it a non-kind specific config instead!

@@ -1831,6 +1831,57 @@ EOF
[[ "$output" =~ "setting global.peering.enabled to true requires connectInject.enabled to be true" ]]
}

@test "connectInject/Deployment: -poll-server-expose-service=true is set when global.peering.enabled is true" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are missing a test case or two here for the value of serverAddress.source being ""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup i added this now! thanks for catching!

charts/consul/values.yaml Show resolved Hide resolved
@ndhanushkodi
Copy link
Contributor Author

Resolved these comments in the new PR: #1378

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants