Agent will not start on machines without a private ip since it won't bind to any ip available. #725
Comments
After investigating some more, I realize this is not about bind addresses but about advertise addresses, which almost makes it a bit silly. Consul runs and listens on any IP address by default, but it refuses to announce any address that isn't RFC1918 by default? :) I believe a proper solution is to add an "allow announce networks" option. By default it can keep the current behavior, where it only advertises RFC1918 addresses. If this option is set, it should override the default privateBlocks variable in util.go with the configured range, so the default RFC1918 addresses are not used when this option is configured. It would also be very nice if the -bind and -client options could be specified multiple times; currently Consul only seems to bind to the last one specified on the command line. Some thoughts I have about service design: For example: We may have RFC1918 addresses configured on some interfaces in our servers, but in our case those are untrusted networks since they connect to something outside of our datacenter. |
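To make the proposed "allow announce networks" option concrete, here is a minimal Go sketch. It is not Consul's code: the function name `mayAdvertise` and the variable `defaultAdvertiseBlocks` are hypothetical, and the sketch only assumes that the default list holds the three RFC1918 ranges and that a configured list replaces it entirely rather than extending it.

```go
package main

import (
	"fmt"
	"net"
)

// defaultAdvertiseBlocks is a hypothetical stand-in for a default
// private-range list: the three RFC1918 networks.
var defaultAdvertiseBlocks = []string{"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"}

// mayAdvertise reports whether addr falls inside one of the allowed CIDR
// blocks. An empty allowed list means "use the RFC1918 defaults"; a
// non-empty list replaces the defaults instead of extending them.
func mayAdvertise(addr string, allowed []string) (bool, error) {
	ip := net.ParseIP(addr)
	if ip == nil {
		return false, fmt.Errorf("invalid IP %q", addr)
	}
	if len(allowed) == 0 {
		allowed = defaultAdvertiseBlocks
	}
	for _, cidr := range allowed {
		_, block, err := net.ParseCIDR(cidr)
		if err != nil {
			return false, fmt.Errorf("invalid CIDR %q: %w", cidr, err)
		}
		if block.Contains(ip) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Public address against the defaults: rejected.
	fmt.Println(mayAdvertise("203.0.113.7", nil)) // false <nil>
	// Same address once the operator configures its network: accepted.
	fmt.Println(mayAdvertise("203.0.113.7", []string{"203.0.113.0/24"})) // true <nil>
	// An RFC1918 address is rejected when the override does not include it.
	fmt.Println(mayAdvertise("10.4.5.6", []string{"203.0.113.0/24"})) // false <nil>
}
```

The last call shows the override semantics asked for above: once a range is configured, an RFC1918 address on an untrusted interface is no longer eligible.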
@cetex Consul does not bind or advertise to a public IP by default for security reasons. You can always provide the As to multiple -client and -bind addresses, this is complicated significantly by the design of the gossip system, as it does not easily allow us to broadcast our availability on multiple addresses. |
I have to correct myself from my last message: Consul won't advertise a public IP by default, but if there is an IP to advertise it will listen on *. The problem I have with the current policy is that if I specify -bind=:: (which should be the issue here if it's about security), Consul won't take any available IP and advertise it (as it does so nicely with RFC1918 IPs); it still refuses to start even though it's allowed to listen on all IPs. And frankly, I don't see what the problem with non-RFC1918 addresses is. What actual problem is this filter solving in datacenters besides being annoying?

The current rules also have no regard at all for IPv6 environments, since they assume that things are running IPv4 only (no IPv4 available? I'll quit), which should be seen as a relatively serious issue. The result is that static configuration is the only option for IPv6 environments. It would be a lot simpler if Consul could grab any assigned IP and use that by default.

Some tests on a server with IPv4 (public IP + 127.0.0.1) and IPv6 (link-local + routable IPv6 + ::1), with a few interesting things:

Starting without -bind and -announce. As mentioned, I believe this should work out of the box for simplicity.

Starting without -bind, with -advertise set to 127.0.0.1. This is a bit unexpected with the RFC1918 rules currently in place. I expect Consul to bind only to localhost and RFC1918 IPs under the current policy if -bind isn't specified.

The last test got me curious, so let's add an RFC1918 IP to loopback and see what happens. It seems Consul is listening on all available interfaces anyway. If I didn't have iptables in place, this Consul server would be reachable over the Internet. Quite unexpected with the current RFC1918 policy.

Let's try to advertise a hostname instead. This would allow my configuration to be almost as simple as not specifying -advertise at all while still being "protective" by default.

Starting with -bind=::

Starting with -bind=xx.xx.45.83 works. This requires a bit of scripting to implement, since all slaves (a few hundred) are PXE-booted and use the same boot image. This doesn't work at all if I want Consul to listen on more than one IP (for example for anycasting, which we do), since I can't state multiple -client or -bind options. I would have to run netcat on the other IPs or something equally dirty (iptables DNAT) to make it work, which is quite unacceptable in a production environment.

Starting with -bind=:: -advertise=2001:db8::1 works, but with the same drawbacks as for the IPv4 advertising. Consul should instead grab the primary address of any non-loopback interface and announce it. |
@cetex Thank you for testing all of the combinations, there are definitely some bugs which you've helped to find with selecting the bind address in the various sub-systems. In terms of defaults, Consul will only bind to a private IPv4 address by default. The reason for this is that almost no deployments of Consul should be on the public internet, and many environments still aren't great for IPv6. For users that want to bind to a public address they must do so explicitly so that it is clear to them that the node will be on the public internet and that extra security measures should be taken. IPv6 is also not done automatically due to the number of environments where it does not work properly. I think this is a sane tradeoff. In the majority of cases, we can automatically bind to the private IPv4 address. In any other case, an explicit bind value can be provided to override the behavior. Eventually, it would be nice to support interface binding as well, but it is not a high priority ticket. |
@armon I just want to mention a use case that may not be entirely clear here: We at ungleich are running an infrastructure for customers that consists solely of machines with public ip addresses and safe firewall settings - they are all managed by configuration management (cdist in our case). At the moment we have to include the individual public ip address in the configuration, which kind of reverses the purpose of consul (which is able to discover things, but before discovering, we need to manually discover the node's ip address). In short: we do have a use case in which we want consul to bind and announce on any address. I would expect consul to do so by issuing -bind=0.0.0.0, however that results in the infamous error message also seen in #789. If your point of view is that this is insecure by default, then I suggest to add a flag to consul like -allow-insecure-bind or -allow-non-private-bind, but still allow to bind without specifying the public ip address. |
@telmich So in this case, is it not possible to just query the IP address of the device via ipconfig and provide that to Consul via This is why the I hope that helps! |
@armon I understand the problem in regard to consul now - I wasn't aware that the IP is not taken from the IP packet itself, but is contained as data. I will try to find a way around it by writing cdist types that query for the address of ethX. I wonder, would it be a smaller change to consul to support something like -bind-interface to select the ip to be used? Almost all of our machines are running on opennebula and thus the interface is eth0 in most cases, so being able to use -bind-interface eth0 would at least make the situation easier for the moment. I would however appreciate support for "no configuration required" in case of having only official ip addresses mid term - either way (multiple ip support, using the source ip in the ip packet, ...) |
The problem is with "--announce": Consul refuses to announce anything that's not an RFC1918 address. I believe this gives a false sense of security (it's actually pretty close to "security by obscurity"). If you're aware that Consul won't do anything with public IPs, you expect that you could run it on a host with both public and private interfaces and that Consul will only listen on the private interfaces, but this is not true. This "security feature" in Consul is only about announcing, so it won't pick a public IP to announce to its peers, but it will still happily bind to any IP. So if I install Consul on a host that has a public and a private IP, Consul will start and announce the private IP to its peers, but it will still bind to both the private and the public IP, and will therefore be accessible on the public internet unless I have some other protection. I see a few options as solutions: |
@cetex I agree that using RFC1918 IP addresses for security is flawed by design. I would personally favor fix no. 1, as it feels very strange in 2015 to rely on RFC1918 IP addresses (it may have "felt" more sensible in ~1996). I would also suggest another solution that could make life very easy for developers of Consul as well as users:
That would fix probably 99% of the cases and if you are having multiple IP addresses on a host, you might be doing something "special" and require adjusting. |
@telmich I agree. That would be a very sane design, since it removes any possibility that Consul may select the wrong address to announce "automagically". I'm also a bit torn between keeping the IP filter or not. Removing it completely would be the most sane thing to do, but keeping it around (changing the default filter to "::" so it's not filtering by default) and then making it configurable with an option like "-limit-announce-ip", which would override the default of "::", would simplify deployment in our case. For example: -bind=:: -limit-announce-ip=a.a.a.a/b -limit-announce-ip=c.c.c.c/d would make this work for us. But I guess I'd prefer software that doesn't care about which IPs are used at all, so my "vote" is for complete removal of the feature. |
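As an illustration of how a repeatable option such as the proposed "-limit-announce-ip" could be parsed, here is a small Go sketch built on the standard `flag` package. The flag does not exist in Consul; the type `cidrList` and the function `parseLimits` are made up for this example. The point is only that a custom `flag.Value` can accumulate every occurrence instead of keeping the last one.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"net"
	"strings"
)

// cidrList implements flag.Value and appends each occurrence of the flag,
// rather than overwriting the previous value.
type cidrList []*net.IPNet

// String renders the collected networks as a comma-separated list.
func (c *cidrList) String() string {
	if c == nil {
		return ""
	}
	parts := make([]string, len(*c))
	for i, n := range *c {
		parts[i] = n.String()
	}
	return strings.Join(parts, ",")
}

// Set is called once per occurrence of the flag on the command line.
func (c *cidrList) Set(v string) error {
	_, n, err := net.ParseCIDR(v)
	if err != nil {
		return err
	}
	*c = append(*c, n)
	return nil
}

// parseLimits parses the hypothetical -limit-announce-ip flag from args.
func parseLimits(args []string) (cidrList, error) {
	var limits cidrList
	fs := flag.NewFlagSet("agent", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output
	fs.Var(&limits, "limit-announce-ip", "CIDR allowed for advertising (repeatable)")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return limits, nil
}

func main() {
	limits, err := parseLimits([]string{
		"-limit-announce-ip=10.0.0.0/8",
		"-limit-announce-ip=192.0.2.0/24",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(limits.String()) // 10.0.0.0/8,192.0.2.0/24
}
```

The same pattern would apply to accepting several -bind or -client values, although, as noted earlier in the thread, the gossip layer is the harder part of supporting multiple addresses.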
Why isn't this working?
|
What is the output of ipconfig? It just looks like that machine has no private ip. |
@johnjelinek that seems odd, specifying an explicit bind address should bypass the code that performs the private IP check. Is there anything special about the environment? What release of Consul are you using? |
Using 0.5.1 on Windows Server 2012 R2.
|
Right, |
But I should be able to bypass it with
|
You can't bind to an IP that isn't present in ipconfig. There needs to be a network adapter receiving packets at that IP for Consul to work. I believe Windows has the ability for you to create a virtual network adapter sharing the same physical adapter, but keep in mind that all machines in your cluster will need to be on the same private network with non-overlapping IPs so that they can route packets to each other. |
My docker cluster is exposing on the public IPs without any issue. These are ephemeral servers that don't have a long life, so they should be coming in and out of the cluster pretty regularly. I'll try to add a virtual interface for windows to see if that helps. Didn't you say
|
I'm curious why your ephemeral instances need public IPs. Feels like that's a case where private IPs load-balanced and exposed as a single public IP makes more sense. |
That's all my VPS provider provides. 1 public IP and no private interface. I got a little further by creating a virtual loopback nic in windows and setting a static IP on the interface to 10.0.0.1. Now consul keeps moving. It seems like there's no config setting to just bypass the private IP requirement.
|
I also learned that the order of the flags matters: -bind public_ip -join cluster_IP. For future reference, create a virtual loopback interface in Windows and set it to a private address to bypass Consul's check. Then
|
Just to summarize the lengthy discussion: If you have a node without any private ip addresses you are - at the moment - not able to use consul on this node. There is no way around the requirement to advertise a private IPv4. This needs to be fixed. The private ip space restriction does not hold within the IPv6 world, so why is it present within the IPv4 space? |
@duritong Not quite, you are able to use Consul but you just have to specify the bind address, Consul will not automatically infer it. So you must provide a single configuration option, certainly not an undue burden. |
And you also need to create a loopback interface. The burden exists, but it's not significant.
|
The only reason you need the loopback is to bind the client ports, so technically you could use |
Right, I'm sorry. I was confused by the advertise discussion, and a config-path error on my side led me to think that you also need to advertise 127.0.0.1, which is kind of counterintuitive based on what the documentation says about that option. Defining the Sorry for the noise. |
"By default, we will scan for a private IP and bind to that if available. Otherwise, we return an error." So why not instead bind to whatever IP is found, rather than returning an error? Combine that with the suggestions made by @telmich and perhaps both "sides" can have their cake and eat it too. After all, if a host only has a single IP, it is pretty obvious the operator wants Consul to bind to it, and pretty obvious it is expected to. Saying "we won't bind to a non-RFC1918 address even if no RFC1918 address is found" is telling the user you're smarter than they are. Making them jump through hoops to do what daemons are expected to do should have a very high bar, and I don't see that bar as being met here. |
To make matters more frustrating is that it binds to the first available rfc1918 address, even though it was told to join to another subnet... Grumble grumble... |
@telmich - +1 for selecting default interface. The current setup makes it difficult to run Consul automatically on virtual hosts: each one will have a different IP, so one has to write a shell script, guess the IP, append it to the configuration... If we could select an interface, it all gets a whole lot easier. I'd tell Consul "advertise on interface eth0" and that's it. |
+1 for @TiS suggestion. You can't spawn cloud machines automatically with a Docker installation without this feature. One has to write a shell script to work around it. |
I'll add my two cents on the "bind to interface" option. I'd support it and from a brief glance through the net package the needed bits to do it (net.Interfaces(), net.InterfaceByName(), interface.Addrs(), etc.) are there. In the case of an interface having multiple addresses, I'd propose the first one to be considered the default. |
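A rough Go sketch of the "bind to interface" idea using the standard `net` calls mentioned above (`net.InterfaceByName` and `(*net.Interface).Addrs`). The function names `firstIP`, `bindIPForInterface` and `addrs` are invented for this example, and "eth0" is only a sample interface name that may not exist on a given machine. It implements the "first address is the default" proposal and nothing more.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// firstIP returns the first IP found in addrs, following the
// "first address on the interface is the default" proposal.
func firstIP(addrs []net.Addr) (net.IP, error) {
	for _, a := range addrs {
		switch v := a.(type) {
		case *net.IPNet:
			return v.IP, nil
		case *net.IPAddr:
			return v.IP, nil
		}
	}
	return nil, errors.New("interface has no IP addresses")
}

// bindIPForInterface looks up a named interface and picks its first address.
func bindIPForInterface(name string) (net.IP, error) {
	iface, err := net.InterfaceByName(name)
	if err != nil {
		return nil, fmt.Errorf("lookup %q: %w", name, err)
	}
	ifaceAddrs, err := iface.Addrs()
	if err != nil {
		return nil, fmt.Errorf("addresses of %q: %w", name, err)
	}
	return firstIP(ifaceAddrs)
}

// addrs is a demo helper that builds net.Addr values from CIDR strings,
// keeping the host part of each address.
func addrs(cidrs ...string) []net.Addr {
	out := make([]net.Addr, 0, len(cidrs))
	for _, c := range cidrs {
		ip, n, err := net.ParseCIDR(c)
		if err != nil {
			panic(err)
		}
		out = append(out, &net.IPNet{IP: ip, Mask: n.Mask})
	}
	return out
}

func main() {
	// Real lookup; the result depends on the machine this runs on.
	if ip, err := bindIPForInterface("eth0"); err != nil {
		fmt.Println("eth0:", err)
	} else {
		fmt.Println("eth0:", ip)
	}
	// Deterministic demo with two addresses on one imaginary interface.
	ip, _ := firstIP(addrs("10.1.2.3/24", "192.0.2.9/24"))
	fmt.Println(ip) // 10.1.2.3
}
```

Picking the first address is the simplest rule, but the order returned by the operating system is not something the caller controls, so an explicit address remains the safer choice on hosts with several addresses per interface.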
@therealbill The downside with choosing the "first address" is that in the case of a docker network overlay, the first ip will likely be the wrong one. It would be better if it would bind to all ip's on a given interface, if you don't want it to bind to all of them then the ip should be specified. Perhaps something like:
|
@withinboredom I'd be ok with that. |
I've submitted a pull request which binds to a cidr regardless of interface #1570. This could be expanded to support interface. In the PR to ease parsing |
@beornf, individual addresses can end with |
get available ip list fail get github first setup ips :none |
While I think it's a bit odd to consider RFC1918 addresses as special, it seems to me that the problem is even worse than that: we use internally routable networks in 10.0.0.0/8, and I get the same error on a machine configured with just localhost and an IP in 10.x.y.z/24, which of course is a valid RFC1918 address. This is with Consul v0.6.4 on SmartOS. |
Why WebRTC does not work without internet connectivity? |
Is it plugged in and powered on? |
Not plugin. Just os windows with google chrome browser on board. No webrtc sample does not work without internet. It's a pitty. |
Hi, I am getting following error message: [root@VORA1 bin]# ./consul agent -server -data-dir=/var/local/vora-discovery Result of ifconfig: [root@VORA1 bin]# ifconfig lo Link encap:Local Loopback -bind also doesn't seem to resolve the issue: [root@VORA1 bin]# ./consul agent -server -data-dir=/var/local/vora-discovery --bind 10.78.1.240 [root@VORA1 bin]# ss -antp | grep -i consul Any pointers please... Thanks |
@jyoti-264 , your port is already in use. Try to provide another port. |
+1 for the ability to bind to an interface. I think the RFC1918 restriction is a bit unexpected. I'd auto-bind to the non-localhost (eth0) interface IP address if only one is available and force the user to set an address in cases where more than one is available. We use an overlay network for our containers and the subnet for this overlay network is 250.0.0.0/8 (marked for "future use" by IANA.) Probably not best practice, but the overlay requires a /8 and we're already using 172/8 for our VPCs and 10/8 for EC2 classic. The addresses in the overlay network are assigned from DHCP at the time containers are launched so scripting up a consul container to do a bind is...fun. Update: nevermind. I found that the Consul container entrypoint has a variable (CONSUL_BIND_INTERFACE) that's used for exactly this purpose and all is good now. :) |
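The heuristic suggested in this comment and earlier in the thread (prefer a private address, otherwise use the only other address if there is exactly one, otherwise make the operator choose) can be sketched in Go as follows. This is not what Consul does; `chooseBindAddr` and `parseIPs` are hypothetical names, and `net.IP.IsPrivate` requires Go 1.17 or later and also treats IPv6 unique local addresses (fc00::/7) as private.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

var (
	errNoAddress = errors.New("no usable address found")
	errAmbiguous = errors.New("no private address and several other candidates; set the bind address explicitly")
)

// chooseBindAddr prefers the first private address. If there is none, it
// falls back to the single remaining non-loopback, non-link-local address,
// and refuses to guess when more than one such address exists.
func chooseBindAddr(candidates []net.IP) (net.IP, error) {
	var others []net.IP
	for _, ip := range candidates {
		if ip == nil || ip.IsLoopback() || ip.IsLinkLocalUnicast() {
			continue
		}
		if ip.IsPrivate() {
			return ip, nil
		}
		others = append(others, ip)
	}
	switch len(others) {
	case 0:
		return nil, errNoAddress
	case 1:
		return others[0], nil
	default:
		return nil, errAmbiguous
	}
}

// parseIPs is a demo helper that converts strings to net.IP values.
func parseIPs(addrs ...string) []net.IP {
	out := make([]net.IP, 0, len(addrs))
	for _, a := range addrs {
		out = append(out, net.ParseIP(a))
	}
	return out
}

func main() {
	// Host with only loopback and one public address: use the public one.
	fmt.Println(chooseBindAddr(parseIPs("127.0.0.1", "203.0.113.10")))
	// Host with a public and a private address: the private one wins.
	fmt.Println(chooseBindAddr(parseIPs("203.0.113.10", "10.0.0.5")))
	// Two public addresses: the operator has to decide.
	fmt.Println(chooseBindAddr(parseIPs("203.0.113.10", "198.51.100.4")))
}
```

The overlay network case described above (250.0.0.0/8) would fall into the "single other address" branch only if the container has no RFC1918 address at all, which is why an explicit interface or address setting is still needed in mixed environments.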
All, please give the latest code in
Very few people should need to do anything as obscene as shown in the last example, but the functionality is there should you need it. With the
There is now a configurable template language behind this that you can use to create a customizable heuristic, which should allow you to get whatever it is that you need from your environment when using an immutable image (see hashicorp/go-sockaddr/template and cmd/sockaddr for examples and docs). Feedback welcome (preferably as a new issue, however). |
@sean- that looks awesome - will give it a try next week! |
How to support servers behind DHCP with changing IPs? E.G bind literally to 0.0.0.0? |
I just started with default parameters and all started fine. I needed to remove the |
The port-forward sometimes fails randomly
Consul agent won't start on our machines (which by default only have a public IP assigned; they are firewalled) since it won't bind to non-private IPs by default.
A command-line option to override the "only bind to private IPs by default" behaviour would help a lot.
This option should change the current filters in Consul for everything IP-related to allow any assigned IP to be used automatically.