peering: expose servers over K8s service #1371
spec:
  type: "{{ .Values.server.exposeService.type }}"
  ports:
    {{- if (or (not .Values.global.tls.enabled) (not .Values.global.tls.httpsOnly)) }}
Do we want to include port 8500? On EKS, load balancers health-check the first port in the service. Previously, when only 8501 was listed and TLS was disabled, the load balancer marked all the servers as unhealthy endpoints and removed them. So unless we are always going to require TLS, we need to keep port 8500.
I still need to add TLS helm tests based on a decision on this comment.
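To make the port question above concrete, here is a minimal Go sketch of the selection logic the template condition implies. The `servicePort` type, the `exposeServicePorts` function, and the rule that 8501 is listed whenever TLS is enabled are assumptions for illustration; the real chart may list additional ports or gate 8501 differently.

```go
package main

import "fmt"

// servicePort is a hypothetical, simplified stand-in for a Kubernetes ServicePort.
type servicePort struct {
	Name string
	Port int
}

// exposeServicePorts mirrors the condition under review: the HTTP port (8500)
// is listed first unless TLS is enabled AND httpsOnly is set.
// Listing 8501 whenever TLS is enabled is an assumption, not taken from the chart.
func exposeServicePorts(tlsEnabled, httpsOnly bool) []servicePort {
	var ports []servicePort
	if !tlsEnabled || !httpsOnly {
		ports = append(ports, servicePort{Name: "http", Port: 8500})
	}
	if tlsEnabled {
		ports = append(ports, servicePort{Name: "https", Port: 8501})
	}
	return ports
}

func main() {
	// Print the resulting port list for each TLS combination.
	for _, tls := range []bool{false, true} {
		for _, httpsOnly := range []bool{false, true} {
			fmt.Printf("tls=%v httpsOnly=%v ports=%v\n", tls, httpsOnly, exposeServicePorts(tls, httpsOnly))
		}
	}
}
```

With TLS disabled the first (and only) port is 8500, which is the port an EKS load balancer would health-check in the scenario described above.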
@@ -96,7 +96,7 @@ func CheckStaticServerConnectionMultipleFailureMessages(t *testing.T, options *k
 		expectedOutput = expectedSuccessOutput
 	}

-	retrier := &retry.Timer{Timeout: 80 * time.Second, Wait: 2 * time.Second}
+	retrier := &retry.Timer{Timeout: 160 * time.Second, Wait: 2 * time.Second}
We need to retry this connection for longer because the peering connection has to be established and the exported service has to propagate to the importing side. The dialer is also likely retrying a few times to reach the leader, so it makes sense that this takes longer than usual.
… to try internal ip?
…kind, and run all acceptance tests (just for peering)
5fe7a6e to c345dac
left some thoughts!! overall this looks excellent!!
 	}

 	releaseName := helpers.RandomName()

-	helpers.MergeMaps(staticServerPeerHelmValues, commonHelmValues)
+	helpers.MergeMaps(commonHelmValues, staticServerPeerHelmValues)
Can we swap the order here so that we deploy the static server cluster with staticServerPeerValues and the static client cluster with staticClientPeerValues? As written it is confusing: both deploys use the common values, but we should not update the common values between the two deploys.
I removed the replicas 3 for servers, and made it a non-kind specific config instead!
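The argument-order concern in this thread can be sketched as follows. `mergeInto` is a hypothetical helper that copies the second map into the first (the first map is mutated); it is assumed here to resemble `helpers.MergeMaps`, and the Helm keys are made up for illustration. Merging the common values into each per-cluster map leaves the common map untouched between the two deploys.

```go
package main

import "fmt"

// mergeInto copies every key of src into dst. On conflicts src wins.
// Only dst is mutated. Hypothetical helper, not the project's own function.
func mergeInto(dst, src map[string]string) {
	for k, v := range src {
		dst[k] = v
	}
}

func main() {
	// Illustrative values only; the real test uses different keys.
	common := map[string]string{"global.peering.enabled": "true"}
	serverPeer := map[string]string{"global.datacenter": "server-dc"}
	clientPeer := map[string]string{"global.datacenter": "client-dc"}

	// Per-cluster maps are the destination, so common stays unchanged.
	mergeInto(serverPeer, common)
	mergeInto(clientPeer, common)

	fmt.Println(len(common), serverPeer["global.datacenter"], clientPeer["global.datacenter"])
}
```

With the arguments reversed, the first call would write the server-specific keys into the shared map, and the second deploy would then inherit them unless they were overwritten.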
@@ -1831,6 +1831,57 @@ EOF
   [[ "$output" =~ "setting global.peering.enabled to true requires connectInject.enabled to be true" ]]
 }

+@test "connectInject/Deployment: -poll-server-expose-service=true is set when global.peering.enabled is true" {
I think we are missing a test case or two here for the value of serverAddress.source being ""
yup i added this now! thanks for catching!
Resolved these comments in the new PR: #1378
Changes proposed in this PR:
How I've tested this PR:
Note that the AKS tests are currently failing. They have never been enabled for peering before, so I do need to keep looking into them, but this PR can be reviewed as is. The AKS failures involved port-forwarding to a local helm cluster, reaching the kube API, and leftover resources; nothing indicates true peering failures. Both the NodePort and LoadBalancer scenarios have been tested on GKE and EKS.
https://app.circleci.com/pipelines/github/hashicorp/consul-k8s/6782/workflows/4460cc43-36f0-49b6-8ee3-966f9e3a2804 passed GKE and EKS, but somehow two commits later they are all failing the same way now... I'm pretty sure it's something minor that I need to catch.
How I expect reviewers to test this PR:
You could set up two clusters and deploy with peering enabled and servers enabled, which will cause the expose service to be deployed. Then you could apply a dialer and an acceptor, inspect the token, and check that it contains the LB or NodePort addresses. You can also test a service dialing another service.
Checklist: