net: revisit unconditional use of cgo lookups for darwin #16345
Comments
The Darwin exclusion predates https://golang.org/cl/8945 I thought. Can you do some digging and investigate its history? What do we gain by using Go's resolver by default on Mac? It seems like we'd need to do #12524 first as you mentioned otherwise we risk doing more harm than good. |
I did some digging! Before https://golang.org/cl/8945 the only way to use Go's resolver (for most platforms, not just darwin) was to build with netgo (and/or without cgo?). This caused the stubs to return false for completed and triggered use of the Go resolver. I discovered along the way that android is also currently summarily excluded from using Go's resolver due to #10714, though it's done per request instead of per-conf-setup like darwin. In the CL for that, @minux called for the same to be done for iOS and then subsequently found the exclusion for darwin. All unix platforms except for darwin and android now use Go's resolver unless weirdness is detected along here or here. I think we could gain more consistency across platforms if darwin were to join in with necessary safety checks. Re #12524, that could be something we detect and fall back to cgo for initially. Once there is support for it in the Go resolver the check could be removed and Go's resolver could be used in that case. At the very least if we decide not to pursue this it would be good to update the comment around the darwin exclusion to explain it's due to more than possible firewall warnings. |
Have you tested the pure Go resolver on older OS X systems?
|
I only have immediate access to 10.11 and 10.10 systems, neither of which pop up anything when trying the test program in the description (which I've updated to show exactly how I ran it). Trying to see if any community members have access to older systems for testing. Worth noting that 10.10 would have been current at the time of https://golang.org/cl/8945. I'm also unable to find anything when searching for this issue happening generally. @bradfitz, did you experience this firsthand? Is it possible it was this standard firewall warning that pops up when a process listens for outside connections, and maybe caused by something else? |
Any further thoughts on this? I can try and put together ways to detect when we should not use Go's resolver if we want to move forward with removing the blanket condition. |
@dpiddy, I can test on OS X 10.8, now that we have VM-based builders and ancient versions available. |
Great! Let me know what I can do to help. |
I tried on OS X 10.8 with the firewall on in its most restrictive mode (block all incoming connections), and I saw no pop-up dialog using Go's DNS resolver. I set |
Propose away. |
Hope to spend some good time on this soon. So far existence of What I'm more unsure of is this bit in the darwin resolver(5) man page:
I'm not familiar with what those other data sources might be or if we can detect their use. Issues like docker/for-mac#19 suggest There might also be special names that should prefer cgo. Any available insight on these things would be greatly appreciated! And if it would help get things started to open a CL removing the blanket darwin exclusion but falling back to cgo if |
It looks like Chromium uses the unexported symbol |
Also, another link. Here is the implementation of http://publicsource.apple.com/source/configd/configd-802.40.13/dnsinfo/dnsinfo_copy.c |
Thanks for the tips!
Any pointers on what that might look like? Couple other notes from digging:
|
OS X's version of nsswitch.conf is the data in the SystemConfiguration framework. Look at the Chromium source code; also look at our Keychain code in crypto/x509. I imagine we'd do something similar where we weakly link the symbols and call them from cgo. Though an open question is whether it's actually worth the complexity to use the native resolvers for some but not all queries, I think. |
The crypto/x509 tip helped me get an idea for what would be involved, thanks.
If the result of this is deciding it's too complex to use Go's resolver when the native resolver (via cgo) is available, that's fine with me. At least we know and have info to consider for the future. There might still be room for improving the experience when the native resolver is unavailable, such as with support for /etc/resolver. |
On my OS X 10.11.6 system, if I go to Security & Privacy and then Firewall and Turn Firewall On and then Firewall Options... and then check "Block all incoming connections", then It is true that without "Block all incoming connections", the cgo mode does not get popup dialogs like it used to long ago. But I think we probably still can't turn on Go resolution by default. For the record, the original CL adding cgo support for resolving host names, specifically for OS X, was golang.org/cl/4437053 aka c9164a5. I agree it would be nice if we could do what we do on Linux etc where we look at resolv.conf and nsswitch.conf and decide if it's OK to use the pure Go resolver. The reason we look at nsswitch.conf is to see if there are any non-DNS lookup methods configured, and if so we delegate to the C library. On the Mac there is no nsswitch.conf but as I understand it there's effectively always a non-DNS lookup method configured (Bonjour). So if there were an accurate nsswitch.conf we'd never use the Go resolver by default. To summarize:
For both these reasons, I think we should leave the default on OS X where it is, namely using the cgo resolver. Note that people who want to use the pure Go resolver need not recompile their programs, as in @danp's example (the CGO_ENABLED=0 is causing package net to be rebuilt entirely). It suffices to set GODEBUG=netdns=go, as mentioned in the net package doc. |
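For readers who want to compare the two resolvers from inside a single program rather than through the environment, here is a minimal sketch. It assumes a Go release that provides net.Resolver with the PreferGo field, which the net package documents as equivalent to GODEBUG=netdns=go but scoped to one resolver. The helper name newResolver and the choice of "localhost" as the lookup target are illustrative assumptions, not something taken from this thread.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newResolver is a hypothetical helper. When pureGo is true the returned
// resolver prefers Go's built-in DNS client; otherwise the platform default
// applies (on darwin that is the system resolver discussed in this issue).
func newResolver(pureGo bool) *net.Resolver {
	return &net.Resolver{PreferGo: pureGo}
}

func main() {
	// Bound the lookups so a misconfigured nameserver cannot hang the program.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	// Run the same lookup through both code paths and print what each returns.
	for _, pureGo := range []bool{false, true} {
		addrs, err := newResolver(pureGo).LookupHost(ctx, "localhost")
		fmt.Printf("PreferGo=%v addrs=%v err=%v\n", pureGo, addrs, err)
	}
}
```

Running the same binary with GODEBUG=netdns=go+1 or GODEBUG=netdns=cgo+1 should additionally make the net package print which resolver it chose, which can help when the two paths disagree.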
That's only for I'm going to kick this to Unplanned for now. If somebody wants to own this and do the pure Go thing when it's really safe on macOS and won't be annoying for the user, feel free to research and post your plan. It could go in Go 1.9 if there's a plan that addresses @rsc's concerns. |
Just tried the test on my 10.12.3 system, with "Block all incoming connections" enabled, and it worked both with |
@danp same result for me |
VPN is a common use case where using the Go native resolver with current behavior would break many things. For example, an IKEv2 VPN which does not do full tunneling will not install the DNS server provided in the IKE configuration as the system default, and thus will not end up in |
Agreed with @bitglue. Anything that relies on |
Any reference for this? |
The comment at the head of the file:
And resolver(5):
|
@bitglue Neither of those things substantiate the claim that "Anything that relies on /etc/resolv.conf is broken on MacOS". |
@peterbourgon You're right -- the MacOS documentation doesn't say anywhere that Go is broken. It does however say pretty clearly that the builtin resolver doesn't use resolv.conf at all, and that it gets its configuration from other sources, like /etc/resolver and the System Configuration Database. We can then reasonably hypothesize that because Go uses a file for resolver configuration that MacOS does not use (resolv.conf), and Go does not use the configuration sources the MacOS resolver does use (/etc/resolver, System Configuration Database, maybe more), that the two may not behave identically. In other words, "broken" behavior. If you want specific scenarios where that happens, you already have them in this thread and the many other issues that have mentioned it. I don't know how it could be made more clear. |
True!
False! 😉 I agree that the Go resolver should change to follow macOS standard behaviors and that the current implementation produces broken results for many users. But the root cause of that broken behavior is not the Go resolver, it's broken configuration in resolv.conf. The nameservers in that file should work, and if they don't it's a problem with whatever wrote them there. But this is just splitting hairs. It's unlikely that we're gonna convince Tim to change anything here. Go should change instead. |
One of the problems I've seen is when using a VPN client that doesn't update |
This seems like a problem with that client; mine does, as another datapoint. |
That's the "arguably" part 😁. While it's nice to have, it's not clear that failing to update /etc/resolv.conf is wrong on a system where that file is documented as not being used for DNS resolution by most applications. |
You are assuming all possible configurations could somehow be represented in resolv.conf. They can't. For example, the MacOS resolver can be configured to send queries for a particular domain to an alternative server. This situation is frequently encountered by VPN users. How would you represent that in /etc/resolv.conf? MacOS does in fact do a pretty good job of keeping /etc/resolv.conf in sync as much as possible. But since not all configuration options can be represented within the constraints of that file, there's only so much it can do. Mostly it's not a problem, because the only things that ship with Mac OS which use this file are programs like The problem with the pure go resolver is it assumes making DNS queries and resolving hostnames are the same thing. This isn't really true:
The pure Go resolver captures some but not all of this behavior. This results in the pure Go resolver being some amount of "broken" not just on MacOS, but on Linux as well. Go's own documentation admits the many limitations of the pure Go resolver:
The reason this brokenness tends to be noticed on MacOS more than on Linux is that most binary releases of popular Go programs (kubectl and terraform are two I've used personally) are cross-compiled, which in practice involves disabling cgo. So, the fallback to the cgo resolver can't happen. Native-compiled Go programs on MacOS resolve fine because they use the cgo resolver unconditionally. A Linux Go binary with cgo disabled is also broken, but disabling cgo on Linux isn't nearly as common. If this decision to unconditionally use the cgo resolver were changed, then Go would need to do a better job detecting situations where it would not behave as desired and falling back, as it does on Linux. As explained in the documentation quoted above, there are many things that might trigger these fallbacks, but many of these are lacking their MacOS counterparts. For example, MacOS doesn't have a /etc/nsswitch.conf. I believe the equivalent configuration is in the System Configuration Database, which isn't mentioned at all among the checks that might trigger a fallback to the cgo resolver. In other words, no longer unconditionally using the cgo resolver on Darwin would subject all Go users, not just those using cross-compiled binaries, to the brokenness already widely reported here and in the other issues mentioned. |
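To make the Linux-style fallback check concrete, here is a deliberately simplified sketch of the idea: read the hosts line of nsswitch.conf-style content and treat the pure Go resolver as safe only when every source is "files" or "dns". This is not the net package's actual logic, which handles more cases. The function names hostsSources and pureGoOK are hypothetical, and treating a missing hosts line as "not safe" is a conservative choice made for this sketch only.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// hostsSources returns the lookup sources named on the "hosts:" line of
// nsswitch.conf-style content, for example ["files", "dns"]. Bracketed
// action items such as [NOTFOUND=return] are skipped. Simplified parsing.
func hostsSources(conf string) []string {
	sc := bufio.NewScanner(strings.NewReader(conf))
	for sc.Scan() {
		line := sc.Text()
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i] // drop trailing comments
		}
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "hosts:") {
			continue
		}
		var sources []string
		for _, f := range strings.Fields(line[len("hosts:"):]) {
			if strings.HasPrefix(f, "[") {
				continue // action item, not a source
			}
			sources = append(sources, f)
		}
		return sources
	}
	return nil
}

// pureGoOK reports whether every configured source is one that a resolver
// limited to /etc/hosts and DNS could honor. An empty list yields false,
// which is this sketch's conservative assumption.
func pureGoOK(sources []string) bool {
	if len(sources) == 0 {
		return false
	}
	for _, s := range sources {
		if s != "files" && s != "dns" {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(pureGoOK(hostsSources("hosts: files dns\n")))
	fmt.Println(pureGoOK(hostsSources("hosts: files mdns4_minimal [NOTFOUND=return] dns\n")))
}
```

The point made above still applies: on MacOS there is no such file to parse, so an equivalent check would have to consult some other configuration source, and this sketch does not attempt that.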
The problem is in your expectation that /etc/resolv.conf is the canonical source of configuration. It says right in the file it's not. And for good reason: the MacOS resolver can do a lot of stuff which simply can't be configured in /etc/resolv.conf because the format of this file, which was set decades ago, simply doesn't support it. Things like only sending DNS queries for *.workstuff.example.com to the DNS server at 10.255.255.254 over the VPN tunnel that's a route only for 10.0.0.0/8. The built-in VPN client works this way. If you configure a VPN tunnel to route all traffic, then it does update /etc/resolv.conf (indirectly: what it does is also route all DNS queries to the VPN-provided DNS server through the System Configuration Database, which then updates /etc/resolv.conf in a best-effort attempt to maintain compatibility with a file syntax that predates the design of the MacOS resolver). If the VPN tunnel routes only some traffic then it can also provide a DNS server for some domains, but this doesn't end up in /etc/resolv.conf because there's no way to articulate the idea in that syntax. |
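The split-DNS case described above can be sketched as a small selection function: given a default nameserver and a set of per-domain overrides, pick the server for a query name. The type scopedResolver, the function pickNameserver, the longest-suffix matching rule, and the default address 192.0.2.53 (a documentation-range address) are all assumptions of this sketch. It only illustrates why a single global nameserver list cannot express this kind of configuration and is not a description of how the MacOS resolver is implemented.

```go
package main

import (
	"fmt"
	"strings"
)

// scopedResolver is a hypothetical record pairing a domain suffix with the
// nameserver that should receive queries for names under that suffix.
type scopedResolver struct {
	domain     string // e.g. "workstuff.example.com"
	nameserver string // e.g. "10.255.255.254"
}

// pickNameserver returns the nameserver of the longest matching domain
// suffix, or def when no scoped entry matches. Matching is on whole labels,
// so "notworkstuff.example.com" does not match "workstuff.example.com".
func pickNameserver(name, def string, scoped []scopedResolver) string {
	name = strings.ToLower(strings.TrimSuffix(name, "."))
	best, bestLen := def, -1
	for _, s := range scoped {
		d := strings.ToLower(strings.TrimSuffix(s.domain, "."))
		if d == "" {
			continue
		}
		if name == d || strings.HasSuffix(name, "."+d) {
			if len(d) > bestLen {
				best, bestLen = s.nameserver, len(d)
			}
		}
	}
	return best
}

func main() {
	scoped := []scopedResolver{
		{domain: "workstuff.example.com", nameserver: "10.255.255.254"},
	}
	const def = "192.0.2.53" // assumed default nameserver for illustration

	for _, q := range []string{"wiki.workstuff.example.com", "golang.org"} {
		fmt.Printf("%s -> %s\n", q, pickNameserver(q, def, scoped))
	}
}
```

A resolver that reads only /etc/resolv.conf has nothing like the scoped list to work with, so it would send every query to the default server.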
I do not expect that it is the canonical source of configuration. I expect that it is a working source of configuration, and I expect that anything which modifies my host's nameservers will also modify that file, as a secondary but still necessary action. But, again, this is moot, because we agree on what should happen: the pure Go DNS resolver on macOS should consult the actual source of truth instead of this (often broken) proxy. |
I think with #12524 fixed by having darwin always use cgo-less libc calls this can be closed. Thanks, all! |
https://golang.org/cl/8945 changed to using the native Go stub resolver for most systems. Darwin was excluded, here, due to firewall warnings.
Is this still an issue?
On my 10.11 system with the firewall enabled this test program doesn't produce any warnings, even with cgo disabled:
(nothing pops up when running)
There are probably other reasons to conditionally exclude Darwin for now, such as if /etc/resolver config is used (#12524), but perhaps removing this blanket condition could be a start.
cc @mdempsky and @bradfitz since you authored that change.