-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add fingerprint manager to manage fingerprinting node #3807
Conversation
02df0f1
to
9cf572a
Compare
client/fingerprint_manager.go
Outdated
|
||
// setupDrivers is used to fingerprint the node to see if these attributes are | ||
// supported | ||
func (fm *FingerprintManager) SetupFingerprints(fingerprints []string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better name suggestion - SetupFingerPrinters ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, although I think it should technically be SetupFingerprinters. Good idea.
client/fingerprint_manager.go
Outdated
type FingerprintManager struct { | ||
getConfig func() *config.Config | ||
node *structs.Node | ||
client *Client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of giving this access to the whole client struct, consider using a func type to encapsulate what this does with the client. Something like
type FingerPrintResponseUpdater func(resp *structs.FingerPrintResponse)
// in client.go when you initialize the FingerPrintManager..
responseUpdaterFunc := func() {
c.updateNodeFromFingerprint(&response)
}
fingerprintManager := &FingerprintManager{
getConfig: func() *config.Config {
c.configLock.Lock()
defer c.configLock.Unlock()
return c.config
},
node: c.config.Node,
responseUpdater: responseUpdaterFunc,
shutDownCh:c.shutDownch,
logger: c.logger,
}
Doing it this way lets you tighten down exactly what FingerprintManager has access to - it shouldn't need to know about the client or need the whole client struct, instead, it calls a callback function that's passed into it when fingerprint responses are available.
Also see how AllocStateUpdater works for an example of this pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, this was something I was considering, but at the time it wasn't clear how much of the client the fingerprint manager needed access to. Adding the response updater and a shutdown channel seem still low overhead, so will change that. Thanks!
client/fingerprint_manager.go
Outdated
continue | ||
} | ||
|
||
case <-fm.client.shutdownCh: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ShutdownCh can be another member variable passed in the constructor.
client/fingerprint_manager.go
Outdated
} | ||
|
||
// fingerprint does an initial fingerprint of the client. If the fingerprinter | ||
// is meant to be run continuously, a process is launched ro perform this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
launched to
client/fingerprint_manager.go
Outdated
return response.Detected, nil | ||
} | ||
|
||
// setupDrivers is used to fingerprint the node to see if these attributes are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update this comment
|
||
d, err := driver.NewDriver(name, driverCtx) | ||
if err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question about error handling during loop iteration - if some drivers worked and one returned an error in line 49, this returns immediatly. It could have already started periodic fingerprinting some drivers before it failed and returned. Is that desirable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should exit fast if there is an error during setting up the client. Otherwise, it should gracefully handle errors during periodic fingerprinting.
client/fingerprint_manager.go
Outdated
} | ||
} | ||
|
||
fm.logger.Printf("[DEBUG] fingerprint_manager: applied fingerprints %v", appliedFingerprints) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn;t this be applied fingerprinters in the log message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type is called "fingerprints" - I'm ok with either in the log message, but I don't think we should do a full rename of the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the log message should be accurate - it was a source of confusion in a previous review too. Would be fine not renaming the type since that's not visible to users.
CHANGELOG.md
Outdated
@@ -11,6 +11,7 @@ IMPROVEMENTS: | |||
* cli: Use ISO_8601 time format for cli output | |||
[[GH-3814](https://github.com/hashicorp/nomad/pull/3814)] | |||
* client: Allow '.' in environment variable names [[GH-3760](https://github.com/hashicorp/nomad/issues/3760)] | |||
* client: Add a fingerprint manager to abstract node fingerprinting [[GH-3807](https://github.com/hashicorp/nomad/pull/3807)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually internal changes don't make it to the changelog
client/client.go
Outdated
@@ -230,13 +230,25 @@ func NewClient(cfg *config.Config, consulCatalog consul.CatalogAPI, consulServic | |||
return nil, fmt.Errorf("node setup failed: %v", err) | |||
} | |||
|
|||
fingerprintManager := &FingerprintManager{ | |||
getConfig: func() *config.Config { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getConfig: c.GetConfig()
and just add locking there?
client/fingerprint_manager.go
Outdated
|
||
// updateNode is a callback to the client to update the state of its | ||
// associated node | ||
updateNode func(*cstructs.FingerprintResponse) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this maybe return the updated node back? All the fingerprinters will be operating on stale nodes right now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I was under the impression that the fingerprint manager just holds a pointer to the original node, which is only updated (not replaced). If this isn't true (or not the pattern that we want) then it can re-assign the node based on what this function returns.
client/fingerprint_manager.go
Outdated
|
||
detected, err := fm.fingerprint(name, d) | ||
if err != nil { | ||
fm.logger.Printf("[DEBUG] fingerprint_manager: fingerprinting for %v failed: %+v", name, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
client.fingerprint_manager
client/client.go
Outdated
// Fingerprint the node | ||
if err := c.fingerprint(); err != nil { | ||
if err := c.fingerprint(fingerprintManager); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move both of these into the fingerprint manager. Create a Run() error
method that does the initial setup and then calls run
.
client/client.go
Outdated
@@ -230,13 +230,25 @@ func NewClient(cfg *config.Config, consulCatalog consul.CatalogAPI, consulServic | |||
return nil, fmt.Errorf("node setup failed: %v", err) | |||
} | |||
|
|||
fingerprintManager := &FingerprintManager{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a constructor
e0c5e0a
to
bc8150a
Compare
bc8150a
to
1a26950
Compare
client/client_test.go
Outdated
@@ -208,17 +209,19 @@ func TestClient_RPC_Passthrough(t *testing.T) { | |||
|
|||
func TestClient_Fingerprint(t *testing.T) { | |||
t.Parallel() | |||
require := require.New(t) | |||
if _, ok := driver.BuiltinDrivers["mock_driver"]; !ok { | |||
t.Skip(`test requires mock_driver; run with "-tags nomad_test"`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add this to a driver/testing.go
helper and replace existing uses of this with the shared helper.
client/fingerprint_manager.go
Outdated
} | ||
|
||
if node := fm.updateNode(&response); node != nil { | ||
fm.node = node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be locked. There are concurrent writers and accessors
client/fingerprint_manager.go
Outdated
return nil | ||
} | ||
|
||
func (fp *FingerprintManager) Run() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment on the method
lock concurrent accesses to node comment exported method
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
In the future, the fingerprint manager will later also be responsible for updating the node status of a healthy/unhealthy driver.
Considerations: