Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allows disabling WAN federation by setting serf WAN port to -1 #3984

Merged
merged 4 commits into from
Mar 27, 2018

Conversation

preetapan
Copy link
Contributor

@preetapan preetapan commented Mar 26, 2018

This PR enables disabling WAN federation by setting the serf WAN port to -1 in the config. A number of endpoints/features are affected by this, listed below in no particular order.

Copy link
Member

@banks banks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to dig into all the references for things that might become nil now and every trail ended in something you checked here so I don't see any obvious bugs.

I left some questions inline about whether there is a good reason not to return errors when WAN is disabled and we try to change WAN related things in a few places, but you have more context on how they are used . In general where it doesn't have a direct negative effect on code or UX that would need lots of extra work to undo I prefer to explicitly fail rather than silently skip stuff in cases like this.

@@ -1021,6 +1047,10 @@ func (s *Server) GetLANCoordinate() (lib.CoordinateSet, error) {

// GetWANCoordinate returns the coordinate of the server in the WAN gossip pool.
func (s *Server) GetWANCoordinate() (*coordinate.Coordinate, error) {
if s.serfWAN == nil {
// Return zero values if WAN federation is disabled
return &coordinate.Coordinate{}, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the right thing to do vs ErrWANFederationDisabled may well be as I'm not familiar with the ways it is called but it's not obvious to me why we'd want to pretend this still works when WAN is disabled.

@@ -116,6 +116,9 @@ func (r *Router) Shutdown() {

// AddArea registers a new network area with the router.
func (r *Router) AddArea(areaID types.AreaID, cluster RouterSerfCluster, pinger Pinger, useTLS bool) error {
if cluster == nil {
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is returning an error appropriate here given that we guard the call in NewServerLogger when cluster is nil anyway? Maybe there are other callers that would fail if it were an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to be defensive here but its better to panic if there are new call sites added for this method in the future, will remove

@@ -37,7 +37,9 @@ func (s *Server) setupSerf(conf *serf.Config, ch chan serf.Event, path string, w
conf.NodeName = fmt.Sprintf("%s.%s", s.config.NodeName, s.config.Datacenter)
} else {
conf.NodeName = s.config.NodeName
conf.Tags["wan_join_port"] = fmt.Sprintf("%d", wanPort)
if wanPort > 0 {
conf.Tags["wan_join_port"] = fmt.Sprintf("%d", wanPort)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, I checked and it seems the only consumer of that tag in our code already handles the case where it's not present:

wanJoinPortStr, ok := m.Tags["wan_join_port"]
if ok {
wanJoinPort, err = strconv.Atoi(wanJoinPortStr)
if err != nil {
return false, nil
}
}

@@ -872,12 +889,19 @@ func (s *Server) KeyManagerLAN() *serf.KeyManager {

// KeyManagerWAN returns the WAN Serf keyring manager
func (s *Server) KeyManagerWAN() *serf.KeyManager {
if s.serfWAN == nil {
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record I verified all call paths to this to check none will panic on a nil. It's only called from an internal state machine routine when working with WAN keyrings which should be disabled by code in this PR.

// executeKeyringOp executes the keyring-related operation in the request
// on either the WAN or LAN pools.
func (m *Internal) executeKeyringOp(
args *structs.KeyringRequest,
reply *structs.KeyringResponses,
wan bool) {
if wan {
mgr := m.srv.KeyManagerWAN()

called from

// Only perform WAN keyring querying and RPC forwarding once
if !args.Forwarded {
args.Forwarded = true
m.executeKeyringOp(args, reply, true)
return m.srv.globalRPC("Internal.KeyringOperation", args, reply)
}

which is disabled above.

@banks
Copy link
Member

banks commented Mar 27, 2018

--- FAIL: TestExecCommand_CrossDC (1.38s)
	exec_test.go:101: bad: "0 / 0 node(s) completed / acknowledged\n"

In CI looks suspiciously like it might be legit given the change but I think you said you had them passing locally so I hit retry just to see if it fails consistently.

Copy link
Contributor

@kyhavlov kyhavlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a few comments 👍

agent/agent.go Outdated
@@ -1019,6 +1024,7 @@ func (a *Agent) setupNodeID(config *config.RuntimeConfig) error {
func (a *Agent) setupBaseKeyrings(config *consul.Config) error {
// If the keyring file is disabled then just poke the provided key
// into the in-memory keyring.
fedarationEnabled := config.SerfWANConfig != nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be "federationEnabled"

}

// Initialize the LAN segments before the default LAN Serf so we have
// updated port information to publish there.
// TODO preetha: why is this passing WAN port to create segments?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like just a remnant of calling the same setupSerf function for segments; the servers are already in the default serf LAN before any segments start, so I don't think the segment Serfs need the WAN port tag set.

// GetWANCoordinate returns the coordinate of the server in the WAN gossip pool.
func (s *Server) GetWANCoordinate() (*coordinate.Coordinate, error) {
return s.serfWAN.GetCoordinate()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this not used anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup

@preetapan preetapan merged commit 3c96d64 into master Mar 27, 2018
@preetapan preetapan deleted the f-allow-federation-disable branch March 27, 2018 21:06
pearkes added a commit that referenced this pull request Mar 29, 2018
See #3984 for more.

This also fixes a link to a heading that existed twice in the document.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants