Don't try to bind on address from inactive interface #948

Merged (3 commits) on Jan 15, 2016

tiwilliam
Contributor

Build our own network address list; we now make sure the interface is up and supports both broadcast and multicast. Resolves #934.

@tiwilliam
Contributor Author

Will probably resolve #928 as well.

@tiwilliam
Contributor Author

We seem to rely on resolving loopback (or an interface without multicast or broadcast) on Travis; maybe we want to bind to loopback if nothing else can be found.

return nil, fmt.Errorf("Failed to get interfaces: %v", err)
}

reqFlags := net.FlagUp | net.FlagBroadcast | net.FlagMulticast
Member

We don't actually require Broadcast or Multicast support. Any reason to check for it? Also how does this method affect Windows?

Contributor Author

The reasoning for filtering on broadcast and multicast was to exclude loopback and other dummy interfaces; I could change that to an explicit filter on FlagLoopback instead.

Windows supports all of these flags; see https://golang.org/src/net/interface_windows.go.

@tiwilliam
Contributor Author

Alright, we will now only require interfaces to be up and use loopback if nothing else can be found.

@armon
Member

armon commented May 19, 2015

@tiwilliam My only concern now is Windows. I have a suspicion it will not play nice...

@tiwilliam
Contributor Author

Alright, let me find some testing units and play around.

@tiwilliam
Contributor Author

Someone with easy access to Windows, feel free to test this out. I haven't mustered the energy to install Windows in my VM.

@highlyunavailable
Contributor

Works fine for me on Windows:

==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> WARNING: Windows is not recommended as a Consul server. Do not use in production.
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
         Node name: 'MY-PC'
        Datacenter: 'dc1'
            Server: true (bootstrap: true)
       Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
      Cluster Addr: 10.0.240.1 (LAN: 8301, WAN: 8302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
             Atlas: <disabled>

==> Log data will now stream in as it occurs:

    2015/06/29 14:09:00 [INFO] serf: EventMemberJoin: MY-PC 10.0.240.1

This was with 2 VirtualBox adapters enabled and my LAN adapter disabled.

@highlyunavailable
Contributor

I get this error on Windows if I try to bind with all 3 adapters disabled:

==> Starting Consul agent...
==> Error starting agent: Failed to get advertise address: Failed to get interface addresses: Failed to get interfaces: GetAdaptersInfo: The pipe is being closed.

That makes sense, though: there are NO interfaces at that point, and Windows has no dedicated "loopback interface".

@tiwilliam
Contributor Author

@highlyunavailable Thank you! I wonder if it works with the Windows Loopback Adapter installed.

@highlyunavailable
Contributor

No, it fails with ==> Error starting agent: Failed to get advertise address: No private IP address found.

However, this is because privateBlocks does not contain 169.254.0.0/16, and that's what the loopback adapter binds to since it can't find another IP. Interestingly, that file also does not contain 127.0.0.0/8, so you might run into a problem there.

@tiwilliam
Contributor Author

@highlyunavailable Good catch, I've added 127.0.0.0/8 and we now treat it as a private block, which I think in this case is perfectly fine.

I guess you can change the address of your Windows Loopback Adapter manually to 127.0.0.1 if you really want that?

@highlyunavailable
Contributor

You can't, because 127.0.0.1 would get added as part of the adapter. I strongly suggest adding 169.254.0.0/16 anyway, since it is defined as a private range in RFC 3927.

@tiwilliam
Contributor Author

Alright, makes sense. My only concern would be that we now might end up using a non-working DHCP interface if that's first in the interface list.

@highlyunavailable
Contributor

True. I'd say if you're going to support 169.254 then it should be the absolute 100% last thing to bind to - all other non-loopback adapters should be tried first.

@tiwilliam
Contributor Author

I'm happy with the current state of this now that it's been tested on Windows. A final sign-off before merge would be appreciated.

@tiwilliam force-pushed the iface-down-fix branch 2 times, most recently from 10f5f53 to adf8d07, on September 2, 2015 09:43
@highlyunavailable
Contributor

It doesn't look like the 169.254 case is a problem since Consul will bail out if it is presented with multiple private IPs.

This could only be a problem if the machine has (a) one APIPA-addressed DHCP adapter and (b) public IP adapter(s). In that case Consul would try to auto-bind to the APIPA adapter, but that's already how it behaves, since Consul won't bind to a public IP. I'm fine with how it works on Windows right now as well.

If @armon approves, it would be cool to get this merged; it looks like this would fix multiple open bugs: #934, #928.

@slackpad
Contributor

Sorry for the delay on this one - looks good!

slackpad added a commit that referenced this pull request Jan 15, 2016
Don't try to bind on address from inactive interface
@slackpad merged commit 94d3f88 into master Jan 15, 2016
@slackpad deleted the iface-down-fix branch January 15, 2016 01:00
Successfully merging this pull request may close these issues.

Consul is trying to bind to an interface that isn't up
4 participants