Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

We need CGO to not be disabled in order for netcgo to work properly. #13878

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ncabatoff
Copy link
Collaborator

Fixes #12012.

@ncabatoff ncabatoff added this to the 1.9.4 milestone Feb 2, 2022
@ncabatoff ncabatoff requested a review from modrake February 2, 2022 19:20
@@ -208,7 +208,6 @@ jobs:
GOOS: ${{ matrix.goos }}
GOARCH: ${{ matrix.goarch }}
GO_TAGS: "${{ env.GO_TAGS }} netcgo"
CGO_ENABLED: 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Injecting myself where I have no business, but if we need CGO for Darwin builds, should we not be explicit about it? Sorry if I completely missed the context...

CGO_ENABLED: 1

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that CGO_ENABLED is a tristate, and unset is different from 0 or 1. I believe unset CGO_ENABLED is the minimal change necessary to make this work. But it's entirely possible that I'm missing something here and that the situation is simpler (or more complex) than I believe.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When building with CGO_ENABLED missing from the environment, adding netcgo to GO_TAGS does not enable proper DNS resolution on macOS.

Test:

% GO_TAGS=netcgo make dev
% vault login -method=oidc
Error authenticating: Put "https://vault.example.internal/v1/auth/oidc/oidc/auth_url": dial tcp: lookup vault.example.internal on 172.16.108.111:53: no such host

example.internal is configured in macOS to be sent to a DNS server at 10.10.40.41 but is still using my non VPN dns server.

Next test:

% CGO_ENABLED=1 GO_TAGS=netcgo make dev
==> Checking that build is using go version >= 1.17.5...
==> Using go version 1.17.6...
==> Removing old directory...
==> Building...
Number of parallel builds: 11

-->    darwin/amd64: github.com/hashicorp/vault

==> Results:
total 372120
-rwxr-xr-x  1 c5309377  staff   182M Feb  2 17:04 vault
% vault login -method=oidc
Complete the login via your OIDC provider. Launching browser to:

    https://sso.example.internal/auth/realms/test/protocol/openid-connect/auth?client_id=vault&nonce=fdd8a2f553cd61bf485507d07767d5556dc53ac1&redirect_uri=http%3A%2F%2Flocalhost%3A8250%2Foidc%2Fcallback&response_type=code&scope=openid&state=d783d93ffd93dd3b385350f55610ad20a2e7b458


Waiting for OIDC authentication to complete...
Success! You are now authenticated. The token information displayed below
is already stored in the token helper. You do NOT need to run "vault login"
again. Future Vault requests will automatically use this token.

Key                  Value
---                  -----
token                s.POgDQgfP85Hld0GWC343Ws18
token_accessor       ACeIQhKo2WLiqESysljCeFdS
token_duration       24h
token_renewable      true
token_policies       ["default"]
identity_policies    []
policies             ["default"]
token_meta_role      default
% vault --version
Vault v1.10.0-dev ('2999fb8865772a40e389438a5fc21e0e98876dcd') (cgo)

@ncabatoff
Copy link
Collaborator Author

@archoversight I haven't had time to work on this lately, but I will come back to it soon. In the meantime, a colleague pointed out that your test with GO_TAGS=netcgo make dev was invalid because the dev make target uses BUILD_TAGS rather than GO_TAGS. So we haven't yet entirely settled the question as to whether CGO_ENABLED must be set to 1.

Really I need to make time to figure out how to test this locally. If you have any tips that don't require setting up openvpn I'm interested.

Copy link

@smacfarlane smacfarlane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ncabatoff My understanding of CGO_ENABLED matches yours and we want to leave it unset for MacOS builds.

In the context of this particular issue, what was observed(1) was that setting CGO_ENABLED=1 on MacOS forces go into an external linker mode (2), and will try to use ld64 from the Apple Developer Tools. This became an issue when targeting ARM. Since we don't have MacOS ARM build hosts, we still have to cross compile those binaries and getting go to use the correct version of the developer tools was not easy (I forget the details now 😞), but we don't actually need the system linker. The go linker is perfectly capable of linking darwin/arm from darwin/amd64. We only care that it is able to load system libraries so that it can use native dns resolution at runtime, which is the behavior with CGO_ENABLED=1 or unset.

  1. I was unable to find a definitive statement on this observed behavior
  2. I was unable to find the right flag/combination of flags to force to use an internal linker when CGO_ENABLED=1. Passing the linker -linkmode internal was ignored and it would still try to use the system linker.

As far as -tags=netcgo, that is the default behavior on MacOS when CGO_ENABLED=1 or is unset. We're choosing to make it explicit here.

@archoversight
Copy link

@ncabatoff Alright, so based upon your statement I tried compiling with no CGO_ENABLED=1 set in the environment, and setting BUILD_TAGS to netcgo:

% BUILD_TAGS=netcgo make dev
==> Checking that build is using go version >= 1.17.5...
==> Using go version 1.17.6...
go: downloading github.com/hashicorp/consul/api v1.12.0
go: downloading github.com/hashicorp/vault-plugin-auth-kubernetes v0.7.1-0.20220207145307-c9fa6acdfe0e
go: downloading github.com/hashicorp/vault-plugin-secrets-azure v0.11.3
go: downloading github.com/hashicorp/vault-plugin-secrets-gcp v0.11.2
go: downloading github.com/denisenkom/go-mssqldb v0.12.0
go: downloading github.com/hashicorp/serf v0.9.6
go: downloading github.com/golang-sql/sqlexp v0.0.0-20170517235910-f1bb20e5a188
go: downloading github.com/hashicorp/mdns v1.0.4
go: downloading github.com/miekg/dns v1.1.41
==> Removing old directory...
==> Building...
Number of parallel builds: 11

-->    darwin/amd64: github.com/hashicorp/vault

==> Results:
total 352784
-rwxr-xr-x  1 c5309377  staff   172M Feb 10 12:08 vault

Attempting to use it against an instance that is across split DNS:

% vault login -method=oidc
Error authenticating: Put "https://vault.example.internal/v1/auth/oidc/oidc/auth_url": dial tcp: lookup vault.example.internal on 172.16.108.111:53: no such host
% vault --version
Vault v1.10.0-dev ('dce88412d5815f235664e3cf2fa2c4a686023e14')

Running otool against it:

% otool -L ~/go/bin/vault
/Users/c5309377/go/bin/vault:
	/usr/lib/libSystem.B.dylib (compatibility version 0.0.0, current version 0.0.0)
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 0.0.0, current version 0.0.0)
	/System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 0.0.0, current version 0.0.0)

Recompiling with CGO_ENABLED=1:

% CGO_ENABLED=1 BUILD_TAGS=netcgo make dev
==> Checking that build is using go version >= 1.17.5...
==> Using go version 1.17.6...
==> Removing old directory...
==> Building...
Number of parallel builds: 11

-->    darwin/amd64: github.com/hashicorp/vault

==> Results:
total 372424
-rwxr-xr-x  1 c5309377  staff   182M Feb 10 12:14 vault
% vault login -method=oidc
Complete the login via your OIDC provider. Launching browser to:

[...]

success
% otool -L ~/go/bin/vault
/Users/c5309377/go/bin/vault:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1856.105.0)
	/System/Library/Frameworks/IOKit.framework/Versions/A/IOKit (compatibility version 1.0.0, current version 275.0.0)
	/System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 1.0.0, current version 60157.60.19)
% vault --version
Vault v1.10.0-dev ('dce88412d5815f235664e3cf2fa2c4a686023e14') (cgo)

No matter what flags I set, or how I set them, I have to enable CGO_ENABLED=1 to get Go to link against IOKit.framework and to use the macOS system DNS resolution rather than the built-in Go one which does not understand macOS's split DNS infrastructure.

That is whether I use make dev or just plain old make:

% make GO_TAGS=netcgo CGO_ENABLED=1
==> Checking that build is using go version >= 1.17.5...
==> Using go version 1.17.6...
==> Removing old directory...
==> Building...
Number of parallel builds: 11

-->    darwin/amd64: github.com/hashicorp/vault

==> Results:
total 372416
-rwxr-xr-x  1 c5309377  staff   182M Feb 10 12:20 vault
% otool -L ~/go/bin/vault
/Users/c5309377/go/bin/vault:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1856.105.0)
	/System/Library/Frameworks/IOKit.framework/Versions/A/IOKit (compatibility version 1.0.0, current version 275.0.0)
	/System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 1.0.0, current version 60157.60.19)
C02CG59EMD6M ~/Projects/3rdparty/vault % vault --version
Vault v1.10.0-dev ('dce88412d5815f235664e3cf2fa2c4a686023e14') (cgo)

Broken:

% make dev BUILD_TAGS=netcgo
==> Checking that build is using go version >= 1.17.5...
==> Using go version 1.17.6...
==> Removing old directory...
==> Building...
Number of parallel builds: 11

-->    darwin/amd64: github.com/hashicorp/vault

==> Results:
total 352784
-rwxr-xr-x  1 c5309377  staff   172M Feb 10 12:18 vault
% otool -L ~/go/bin/vault
/Users/c5309377/go/bin/vault:
	/usr/lib/libSystem.B.dylib (compatibility version 0.0.0, current version 0.0.0)
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 0.0.0, current version 0.0.0)
	/System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 0.0.0, current version 0.0.0)
% make GO_TAGS=netcgo
==> Checking that build is using go version >= 1.17.5...
==> Using go version 1.17.6...
==> Removing old directory...
==> Building...
Number of parallel builds: 11

-->    darwin/amd64: github.com/hashicorp/vault

==> Results:
total 352784
-rwxr-xr-x  1 c5309377  staff   172M Feb 10 12:17 vault
% otool -L ~/go/bin/vault
/Users/c5309377/go/bin/vault:
	/usr/lib/libSystem.B.dylib (compatibility version 0.0.0, current version 0.0.0)
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 0.0.0, current version 0.0.0)
	/System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 0.0.0, current version 0.0.0)
% vault --version
Vault v1.10.0-dev ('dce88412d5815f235664e3cf2fa2c4a686023e14')

@archoversight
Copy link

In the context of this particular issue, what was observed(1) was that setting CGO_ENABLED=1 on MacOS forces go into an external linker mode (2), and will try to use ld64 from the Apple Developer Tools. This became an issue when targeting ARM. Since we don't have MacOS ARM build hosts, we still have to cross compile those binaries and getting go to use the correct version of the developer tools was not easy (I forget the details now 😞), but we don't actually need the system linker. The go linker is perfectly capable of linking darwin/arm from darwin/amd64. We only care that it is able to load system libraries so that it can use native dns resolution at runtime, which is the behavior with CGO_ENABLED=1 or unset.

Do note that terraform fixed this by building on macOS, and since the toolchain contains the headers/information for both x86_64 and arm64 they were able to build both:

hashicorp/terraform#30300

Could the same technique not be used here for vault?

@ncabatoff ncabatoff modified the milestones: 1.10, 1.10.1 Mar 16, 2022
@ncabatoff ncabatoff modified the milestones: 1.10.1, 1.10.2 Apr 14, 2022
@mladlow mladlow modified the milestones: 1.10.2, 1.10.3 Apr 29, 2022
@ldilalla-HC ldilalla-HC modified the milestones: 1.10.3, 1.10.4 May 12, 2022
@ncabatoff ncabatoff modified the milestones: 1.10.4, 1.10.5 Jun 7, 2022
@ncabatoff ncabatoff removed this from the 1.10.5 milestone Jul 18, 2022
@archoversight
Copy link

Is there anyway to get this work prioritized/completed so that binaries on macOS are able to resolve using the system resolver and thus can use split DNS?

@maxb
Copy link
Contributor

maxb commented Feb 11, 2023

I wonder if, as part of finalizing this journey, it might make sense to revisit whether to continue setting the netcgo build tag, as well?

According to the big comment in https://go.googlesource.com/go/+/go1.20/src/net/net.go, Go has good reasons to prefer its native DNS implementation when it can, and makes efforts to automatically transition to the cgo resolver if needed... and in particular, it already automatically defaults to cgo DNS resolution when running on darwin anyway!

    // Darwin pops up annoying dialog boxes if programs try to do
    // their own DNS requests. So always use cgo instead, which
    // avoids that.
    if runtime.GOOS == "darwin" || runtime.GOOS == "ios" {
        confVal.forceCgoLookupHost = true
        return
    }

(from Go src/net/conf.go)

@VioletHynes VioletHynes added the hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed label Jul 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DNS resolution issues when connected to OpenVPN
8 participants