core: merge reserved_ports into host_networks #13651
Force-pushed from 7933add to d1b652c.
```go
// Node object handling by servers. Users should not be able to cause SetNode
// to error. Data that cause SetNode to error should be caught upstream such as
// a client agent refusing to start with an invalid configuration.
func (idx *NetworkIndex) SetNode(node *Node) error {
```
It's easy to miss in the diff, so I want to call this refactoring out: `SetNode` used to return `(collide bool, reason string)`, just like `AddAllocs`. Functionally, returning an error isn't really any different, but I wanted to signal something different to developers: `SetNode` should never have a "collision!"

`AddAllocs` can have collisions. That's a normal part of trying to make placements and preemptions and failing; there's nothing "erroneous" about it.

`SetNode`, on the other hand, should never return an error at runtime. A better way to put it: anything that `SetNode` considers an error could have been caught upstream (like on the client agent) through validation. I thought the old call signature sent the wrong message that `SetNode` collisions could Just Happen as a normal part of an optimistically concurrent scheduler.
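To make the contrast concrete, here is a rough Go sketch of how a caller might treat the two results differently. It is illustrative only: `checkNodeNetworks` is a hypothetical helper, and the `SetNode`/`AddAllocs` signatures are taken from this discussion rather than from the repository.

```go
// Illustrative sketch only -- not the actual Nomad call sites.
package sketch

import (
	"fmt"

	"github.com/hashicorp/nomad/nomad/structs"
)

// checkNodeNetworks reports whether allocs fit on the node's networks.
// fit=false with a reason is a normal scheduling outcome; a non-nil error
// means the node data itself is invalid and should never happen at runtime.
func checkNodeNetworks(node *structs.Node, allocs []*structs.Allocation) (fit bool, reason string, err error) {
	netIdx := structs.NewNetworkIndex()

	// SetNode failing means the node gave us bad data -- something upstream
	// validation should have caught -- so it is an error, not a collision.
	if err := netIdx.SetNode(node); err != nil {
		return false, "", fmt.Errorf("invalid node %q: %w", node.ID, err)
	}

	// A collision from AddAllocs is not an error: the existing allocations
	// simply don't leave room, and the scheduler moves on to other nodes.
	if collide, why := netIdx.AddAllocs(allocs); collide {
		return false, why, nil
	}

	return true, "", nil
}
```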
This is great. I wonder now what kind of actions we can take when an error is returned here. Maybe force a leadership transition to see if a new leader is able to handle the node? Or have a way to flush node state and request new fingerprint data from the node?
The implication of "`SetNode`, on the other hand, should never return an error at runtime" is that the node itself is giving us bad data because of a programmer error, so I'd expect the node to give us the same bad fingerprint again. At that point our invariants are broken, so I'm not sure it's a good idea to try to recover rather than throw an error that fails scheduling -- we want this to be noisy.
Yeah, a very sad +1 to Tim. Although in cases we've encountered, such as #13505 and #11830, it is possible for an operator to change their configuration to fix or work around the "collision." How to get folks from this obscure error message to inspecting things like reserved ports seems extremely difficult, though. The silver lining is that folks are filing issues, and these errors are much easier to track down than collisions caused by allocations.
```diff
@@ -51,9 +56,9 @@ type NetworkIndex struct {
 // NewNetworkIndex is used to construct a new network index
 func NewNetworkIndex() *NetworkIndex {
 	return &NetworkIndex{
 		AvailAddresses: make(map[string][]NodeNetworkAddress),
 		AvailBandwidth: make(map[string]int),
 		HostNetworks:   make(map[string][]NodeNetworkAddress),
```
At first I was thinking it would be safer to create the `TaskNetworks` and `GroupNetworks` slices here too, but I'm realizing that by leaving them out we effectively enforce that `SetNode` has been called before use (on pain of panic). So 👍 here
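For anyone reading along, the "on pain of panic" guard works because Go's zero values bite differently depending on the type: a nil map panics on write, and a nil slice panics on indexed access (though `append` is fine). A self-contained illustration of the map case, using made-up names rather than the actual `NetworkIndex` fields:

```go
package main

import "fmt"

// index mimics a struct whose constructor deliberately leaves a field at its
// zero value so that later writes fail fast if initialization was skipped.
type index struct {
	byName map[string][]string // populated by setup()
}

func (ix *index) setup() {
	ix.byName = make(map[string][]string)
}

func main() {
	ix := &index{} // deliberately skipping ix.setup() to show the failure mode

	defer func() {
		if r := recover(); r != nil {
			fmt.Println("panic:", r) // panic: assignment to entry in nil map
		}
	}()

	// Reading a nil map is fine (it returns the zero value), but the
	// assignment back into it panics, which makes the missing setup() loud.
	ix.byName["eth0"] = append(ix.byName["eth0"], "10.0.0.1")
}
```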
```diff
 			NetIndex: netIdx.Copy(),
 			Node:     option.Node,
 		})
-		iter.ctx.Metrics().ExhaustedNode(option.Node, "network: port collision")
+		iter.ctx.Metrics().ExhaustedNode(option.Node, "network: invalid node")
```
This is great: the refactoring makes the reasoning more obvious to us as developers, but changing the message here will also make this case stand out loudly in plan metrics when we try to debug it.
Force-pushed from 6c266f9 to 30584d4.
I spent a long time playing with this today and wanted to share some findings: a key aspect of Nomad's networking is that ports are reserved by IPs, not "networks", whether defined with … So I realized what probably would be the most correct approach to …

This means port reservations may or may not overlap. For example: …

The end result would be: …

But this isn't how existing code works, and all of the comments and docs seem to treat … That being said, the output of …
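To restate the "ports are reserved by IPs, not networks" point in code: two named host networks that resolve to the same address necessarily share that address's reservations. This is a self-contained, hypothetical sketch of the idea, not Nomad's implementation:

```go
package main

import "fmt"

// reservations tracks used ports per IP address. Two host networks that
// resolve to the same address share one entry, so a port reserved "for"
// one network is unavoidably reserved for the other as well.
type reservations map[string]map[int]bool

func (r reservations) reserve(ip string, port int) {
	if r[ip] == nil {
		r[ip] = make(map[int]bool)
	}
	r[ip][port] = true
}

func main() {
	// Hypothetical example: "public" and "default" both map to 10.0.0.1.
	networks := map[string]string{"public": "10.0.0.1", "default": "10.0.0.1"}

	r := reservations{}
	r.reserve(networks["public"], 8080)

	// The reservation is visible through the other network name too,
	// because the underlying IP is what actually holds the reservation.
	fmt.Println(r[networks["default"]][8080]) // true
}
```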
Once the `must` in tests are fixed up, this LGTM.
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Nice changes! The added comments in the structs are very helpful.

I added the `release/1.1.x` and `release/1.2.x` labels for backport.
Fixes #13505

This fixes #13505 by treating reserved_ports like we treat a lot of jobspec settings: merging settings from more global stanzas (`client.reserved.reserved_ports`) "down" into more specific stanzas (`client.host_networks[].reserved_ports`).

As discussed in #13505 there are other options, and since it's totally broken right now we have some flexibility:

1. Treat overlapping `reserved_ports` on addresses as invalid and refuse to start agents. However, I'm not sure there's a cohesive model we want to publish right now since so much 0.9-0.12 compat code still exists! We would have to explain to folks that if their `-network-interface` and `host_network` addresses overlapped, they could only specify reserved_ports in one place or the other?! It gets ugly.
2. Use the global `client.reserved.reserved_ports` value as the default and treat `host_network[].reserved_ports` as overrides. This was my first suggestion in the issue, but @groggemans made me realize the addresses on the agent's interface (as configured by `-network-interface`) may overlap with `host_networks`, so you'd need to remove the global reserved_ports from addresses shared with a shared network?! This seemed really confusing and subtle for users to me.

So I think "merging down" creates the most expressive yet understandable approach. I've played around with it a bit, and it doesn't seem too surprising. The only frustrating part is how difficult it is to observe the available addresses and ports on a node! However, that's a job for another PR.
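A rough sketch of what "merging down" means mechanically: the global `client.reserved.reserved_ports` are combined with each `host_network`'s own `reserved_ports`, with duplicates dropped. This is illustrative Go only -- `mergeReservedPorts` and the `hostNetwork` struct are made-up names, and real reserved_ports are configured as port strings/ranges in HCL rather than plain ints:

```go
package main

import (
	"fmt"
	"sort"
)

// hostNetwork stands in for a client.host_networks[] entry; only the field
// relevant to this sketch is included.
type hostNetwork struct {
	Name          string
	ReservedPorts []int
}

// mergeReservedPorts applies the "merge down" rule described above: the
// global client.reserved.reserved_ports are combined with each host
// network's own reserved_ports, with duplicates removed.
func mergeReservedPorts(global []int, networks []hostNetwork) []hostNetwork {
	out := make([]hostNetwork, len(networks))
	for i, hn := range networks {
		seen := make(map[int]bool)
		var merged []int
		for _, p := range append(append([]int{}, global...), hn.ReservedPorts...) {
			if !seen[p] {
				seen[p] = true
				merged = append(merged, p)
			}
		}
		sort.Ints(merged)
		out[i] = hostNetwork{Name: hn.Name, ReservedPorts: merged}
	}
	return out
}

func main() {
	global := []int{22, 80}
	networks := []hostNetwork{
		{Name: "public", ReservedPorts: []int{80, 443}},
		{Name: "private", ReservedPorts: nil},
	}
	fmt.Println(mergeReservedPorts(global, networks))
	// Output: [{public [22 80 443]} {private [22 80]}]
}
```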
Sorry I snuck in a couple other refactorings. I really want to make this code more maintainable, so I tried to move it in that direction where I didn't think it would be a huge distraction (e.g. the `interface{} -> string` switch). I can back out any of that if you think it's best to keep this tight and focused.