-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[agent] Serve API over Unix domain sockets #16884
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't align directly with Consul or Vault, but I think that's ok:
- Consul makes you choose between TCP or Unix, and I don't think we want to do that. They also have a distinct
unix_socket{}
block anyway, so we're not far off. - We match Vault in all the ways that matter. Vault just has that neat
listener{}
block to reuse which we lack.api_listener{}
blocks probably would have been a better choice for Nomad thanaddresses{}
, but that ship sailed ages ago.
} | ||
|
||
const ( | ||
MaxSocketPathLength = 104 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This varies depending on OS. So I think we should just try to create the socket and return the awful OS error if we fail.
If Go can find the sun_path
length for the OS its on, that would be amazing. I think even then we need to be careful to subtract 1 to leave space for a null terminator: otherwise the man page implies no one may be able to connect to it?! What a mess. https://man7.org/linux/man-pages/man7/unix.7.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like syscall exposes it: https://pkg.go.dev/syscall#RawSockaddrUnix
Still might need to subtract 1 to leave room for a null terminator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh actually maybe you already peeked because it looks like 104 is the floor!
Peeking at what Go sets across platforms:
go/src/syscall$ rg -A3 RawSockaddrUnix | rg Path.*int | awk '{print $3}' | sort -u
[1023]uint8
[104]int8
[108]int8
[UNIX_PATH_MAX]int8
Somehow AIX manages to be the nice one here with 1023. What a world. UNIX_PATH_MAX
is Windows only and 108.
Although I'm not sure if that takes the null terminator weirdness into account? So maybe we should set this to 103?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd peeked, but I think you're right about the null terminator. I think we can go for something more OS-specific by changing that line to:
MaxSocketPathLength = len(syscall.RawSockaddrUnix{}.Path) - 1 // remove 1 byte for null terminator
@@ -193,6 +194,58 @@ func NewHTTPServers(agent *Agent, config *Config) ([]*HTTPServer, error) { | |||
srvs = append(srvs, srv) | |||
} | |||
|
|||
// Initialize the API socket if configured | |||
startAPISocket := func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer a top level func that takes a config and returns an error just because this func is already so huge and only readable because it's the same code over and over with slight variations. nbd though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll drop a TODO here so that I can come back to is but it doesn't stall the world.
@@ -92,6 +92,10 @@ type Config struct { | |||
// Use normalizedAddrs if you need the host+port to bind to. | |||
Addresses *Addresses `hcl:"addresses"` | |||
|
|||
// APISocket is used to configure a unix domain socket listener for the | |||
// HTTP API | |||
APISocket *SocketConfig `hcl:"api_socket"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pedant in me wants to name this api_unix_socket
because it's the "unix" bit that makes this interesting. All addresses{}
are all sockets. Using unix
in the name would match Consul and Vault better as well.
That being said if you don't think anyone wants to type that many characters I will shutup. It is nbd. 😅
flags.StringVar(&cmdConfig.APISocket.Path, "api-socket-path", "", "") | ||
flags.StringVar(&cmdConfig.APISocket.User, "api-socket-user", "", "") | ||
flags.StringVar(&cmdConfig.APISocket.Group, "api-socket-group", "", "") | ||
flags.StringVar(&cmdConfig.APISocket.Mode, "api-socket-mode", "", "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think api-unix-*
makes more sense here (see below for discussion).
User string `hcl:"user"` | ||
Mode string `hcl:"mode"` | ||
Group string `hcl:"group"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Letting the administrator set user/group/mode is really interesting, in that I'd have expected us to use Nomad's own user/group and a restrictive mode. But this lets the administrator run a client as root and use a non-root user to talk to it (so long as you've got ACLs in place, I suppose).
This should definitely come with some notes about recommended usage in the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some prior art:
- Vault has the bare minimum docs here.
- Consul is probably a good one to copy (except for the lack of Windows support... I wonder if it just-started-working for them when Go+Windows added support).
- Boundary is clearly an iteration on Vault's approach, and I don't think has anything relevant for us. It's interesting they appear to enable tls by default while Vault totally lacks tls for their unix socket. I think we can follow Vault here until somebody gives us a reason to do otherwise.
fmt.Errorf("%s path must be %v characters or less", msgPrefix, MaxSocketPathLength)) | ||
} | ||
|
||
sl, err := listenerutil.UnixSocketListener(config.APISocket.Path, config.APISocket.UnixSocketsConfig()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to use this library we should make sure we've fully documented the expected behavior for users. You can set user
by name or ID, but group
only by ID (well, that's how it's documented but not how it behaves if you look at the source). If you set path
without setting user
, group
, and mode
, the socket is owned by the Nomad agent's user and the mode will be determined by the umask (which is usually different for login shells vs services, etc.)
Maybe we should have an explicit default configuration here rather than relying on the default from the library? i feel like that's likely to get changed out from under us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on observing the differences in implementation and documentation on that function. What are your thoughts on a reasonable default configuration, or are you saying that we should just codify the current defaults in Nomad so that if they do change in the library that we won't be surprised/broken by it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current defaults from the library seem reasonable, so let's codify those defaults in Nomad so that we control them going forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can ensure the behavior around user and group by making the defaults concrete, but the behavior around mode
is a moving target since it's solely determined by how the library function handles that argument. I did add a test to validate whether or not the behavior of the library call has changed from we're expecting. Seem okay?
if b.Path != "" { | ||
result.Path = b.Path | ||
} | ||
if b.User != "" { | ||
result.User = b.User | ||
} | ||
if b.Group != "" { | ||
result.Group = b.Group | ||
} | ||
if b.Mode != "" { | ||
result.Mode = b.Mode | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field merges can be 1-liners with https://github.com/hashicorp/nomad/blob/main/helper/funcs.go#L492-L500
if s != nil { | ||
result = *s | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We just declared Copy()
above, let's use that instead of implementing it twice
This PR pairs with #16872 to provide the server side configuration and implementation for serving the Nomad API over unix domain sockets.
For local testing, you can use
socat
to provide a suitable proxynomad agent -dev -api-socket-path /tmp/run/nomad.sock & socat -d -d TCP4-LISTEN:14646,reuseaddr,fork UNIX:/tmp/run/nomad.sock
In another terminal
export NOMAD_ADDR=http://127.0.0.1:14646 nomad namespace list
Contents:
Fixes: #1639