This repository has been archived by the owner on Jan 8, 2024. It is now read-only.
Waypoint server panics when runner is forgotten before stopped #3448
It would be useful to audit the runner code path for cases where a runner is forgotten/deleted from bolt. There are likely bugs here since the adoption flow is fairly new, so there are probably places where we need to fix bad behavior. Edit: it might also be worth seeing whether any runner logic can be extracted into something more generic outside of the boltdb server implementation.
demophoon added a commit that referenced this issue on Aug 29, 2022
Before this commit when Waypoint was determining whether or not to remove a Runner from boltdb it was possible for runner to be nil at the time we attempted to determine what type of Runner the Runner was. This caused the server to panic as soon as the runner became unavailable. This commit fixes the panic by avoiding the runner from being set to nil by instead initializing an empty runner variable so that if a runner is not found the type can still be determined and the runner cleaned up. Fixes #3448
demophoon added a commit that referenced this issue on Aug 30, 2022
Before this commit when Waypoint was determining whether or not to remove a Runner from boltdb it was possible for runner to be nil at the time we attempted to determine what type of Runner the Runner was. This caused the server to panic as soon as the runner became unavailable. This commit fixes the panic by checking if we received a runner from the database before determining its type. Fixes #3448
Describe the bug
When a Waypoint runner is forgotten before it is stopped, the Waypoint server panics once that runner is stopped.
I believe this happens because when the runner is forgotten, the runner record is removed from BoltDB. Once the Waypoint runner is stopped, the server notices and attempts to mark the runner as offline. During that process, the server attempts to fetch the runner from BoltDB.
waypoint/internal/server/boltdbstate/runner.go
Lines 304 to 312 in 822d3ca
If either `r` is set to `nil` or `codes.NotFound` is the error that is returned, `r` is `nil` by the time we attempt to determine what `Kind` of runner we are dealing with.
waypoint/internal/server/boltdbstate/runner.go
Line 322 in 822d3ca
In this `r = nil` case the server panics and exits.
Steps to Reproduce
Steps to reproduce the behavior.
At this point the server will have panicked.
Expected behavior
The server should continue as if the runner had always been forgotten in the first place.
Waypoint Platform Versions