-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add health checker #119
Add health checker #119
Conversation
Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
machine.go
Outdated
@@ -707,6 +707,13 @@ func (m *Machine) UpdateGuestDrive(ctx context.Context, driveID, pathOnHost stri | |||
return nil | |||
} | |||
|
|||
// IsAlive makes sure that a Firecracker instance is running and responds via API by | |||
// calling `DescribeInstance` endpoint behind. | |||
func (m *Machine) IsAlive(ctx context.Context) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like one of the valid states that can be returned by DescribeInstance
is Halted
: https://github.com/firecracker-microvm/firecracker/blob/35b73437641122865b1603bd74c144035e1790c9/api_server/swagger/firecracker.yaml#L453
There's also Uninitialized
. That makes me feel like IsAlive
is not really what one would expect. It seems more like it's checking whether the VM manager is alive rather than what we'd consider the Machine
.
If we want to keep the behavior as is, I can't think of any great names, but maybe something like Exists
would be better than IsAlive
?
On the other hand, would it make sense to change the behavior and return an error here if the State is Halted
or Uninitialized
? What's the actual motivation behind this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I accidentally clicked the approve button, I think this should be addressed before merging)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(@sipsma - I've "dismissed" your review per your comment here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions for names: IsVMMAlive
, IsVMMResponsive
, Reachable
"I accidentally clicked the approve button, I think this should be addressed before merging" - @sipsma
Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
return err | ||
} | ||
|
||
state := StringValue(instance.Payload.State) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to verify the state if the purpose of this method is to see whether Firecracker responds to an API. I think getting any non-error response would meet the success criteria.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the motivation behind the change is to test the responsiveness of the API, then agreed and also think IsVMMResponsive
is a good name.
If the motivation is instead to test that the Machine is in the running state, then can we just add a State
function that returns the state value, whatever it may be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@haikuoliu which behavior would suite better here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now I've renamed it to IsRunning
which tells whether a VM is in Starting
or Running
state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit confusing to see that we have Running
as a state, and IsRunning
actually means IsRunningOrStarting
. Naming is hard...
I'm inclined to expose Firecracker's state as is instead of introducing our own notion of the state (running, responsive, alive, ...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The status type could provide rich methods later, like https://godoc.org/inet.af/http#Status.IsClientError (inet.af is bradfitz's net/http rework proposal, which is referenced from golang/go#29652)
Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
Closing until we figure out requirements. |
Signed-off-by: Maksym Pavlenko makpav@amazon.com
Issue #, if available:
#115
Description of changes:
Adds
IsAlive
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.