Multiple disparate consul clusters somehow discovering each other #1833
Comments
Hi @Amit-PivotalLabs it sounds like they've been WAN joined at some point - https://www.consul.io/docs/guides/datacenters.html. Does …
Hi @slackpad Thanks for the fast response. They may have been WAN joined at some point; I can double-check that. Do you think this behaviour should only be seen if they have been WAN joined at some point, and would otherwise be unexpected? Yes:

DC1: Consul Server 1, Consul Server 2, Consul Server 3
DC2: Consul Server 1, Consul Server 2, Consul Server 3
DC3: Consul Server 1, Consul Server 2, Consul Server 3
All the VMs are within the same VPC and part of the same security group, which allows all TCP and UDP traffic within the group. It looks like they also had some Network ACLs set up, possibly preventing traffic between clusters. I've deleted the ACLs and restarted the consul processes on all the consul servers. I'm now also trying the following …

This is just a standard AWS VPC. What would need to be done to make this less flaky?
Hi @Amit-PivotalLabs correct - they have to be WAN joined, otherwise the different datacenters wouldn't know about each other. If you'd like this to not be flaky, the best thing would be to go to one of your Consul servers and …
Thanks @slackpad Atlas won't work for us, as this needs to work in airtight on-prem networks. Could you clarify your advice about reducing flakiness? Does it suffice to have one server in one DC … If some consul servers are configured to …
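For context on the kind of configuration being discussed, here is a minimal sketch of a Consul server config that retries a WAN join on startup (so the cross-datacenter link survives restarts without Atlas). The addresses and datacenter name are placeholders, not taken from this issue:

```json
{
  "datacenter": "dc1",
  "server": true,
  "retry_join_wan": ["10.0.2.10", "10.0.3.10"]
}
```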
Closing this out as I don't think there's forward work left here. Consul 0.8.0 added WAN "join flooding": …
I have 3 deployments within a single VPC, but the three deployments shouldn't know about one another. Each deployment has 5 VMs: 3 consul servers + 2 foobar servers with colocated consul agents. The first deployment uses "dc1" as its datacenter, so from nodes in the first deployment I can nslookup the following and get the expected results:

foobar-1.foobar.service.cf.internal (resolves to the IP of one of the foobar servers in dc1)
foobar-1.foobar.service.dc1.cf.internal (resolves to the IP of the same foobar server in dc1)
foobar-1.node.cf.internal (resolves to the IP of the same foobar server in dc1)
foobar-1.node.dc1.cf.internal (resolves to the IP of the same foobar server in dc1)
foobar-2.foobar.service.cf.internal (resolves to the IP of the other foobar server in dc1)
foobar-2.foobar.service.dc1.cf.internal (resolves to the IP of the other foobar server in dc1)
foobar-2.node.cf.internal (resolves to the IP of the other foobar server in dc1)
foobar-2.node.dc1.cf.internal (resolves to the IP of the other foobar server in dc1)
foobar.service.cf.internal (resolves to the two IPs of the foobar servers in dc1)
foobar.service.dc1.cf.internal (resolves to the two IPs of the foobar servers in dc1)

Similarly, from nodes in dc3, I can make the same queries with "dc3" replacing "dc1" in all the statements above.
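The query names above all follow Consul's DNS naming pattern, with "cf.internal" as this deployment's custom domain. A small helper sketches the pattern — purely illustrative; the function name is hypothetical:

```python
def consul_dns_name(name, kind="service", datacenter=None, domain="consul"):
    """Build a Consul-style DNS query name: <name>.<kind>[.<dc>].<domain>.

    When datacenter is omitted, the agent answers for its own local DC.
    """
    parts = [name, kind]
    if datacenter is not None:
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts)

print(consul_dns_name("foobar", "service", "dc1", "cf.internal"))
# foobar.service.dc1.cf.internal
print(consul_dns_name("foobar-1", "node", domain="cf.internal"))
# foobar-1.node.cf.internal
```

The surprise in this issue is that the dc-qualified form (the dc1/dc3 variants) is answered across datacenters that were never meant to be linked.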
None of the nodes in dc1 should know about the nodes in the other dc's; however, if from a node in dc1 I make the above queries with "dc3" replacing "dc1", I get successful results as though I were on a node in dc3. What's weirder is that this is not symmetric: from dc3, I can't query dc1, which is actually the behaviour I would expect.
It's also not entirely consistent; at one point I was able to query dc1 from dc3. I can include any start commands, info logs, nslookup queries, curl queries, etc. that would help explain what's going on.
Thanks,
Amit