bosun: Better Error flow #1301

Merged
merged 1 commit into from
Sep 16, 2015

captncraig
Contributor

Currently, errors that occur while checking alerts get muddled with actual alerts in odd ways. Errors are not tagged, so they are often difficult to correlate with alerts and hard to clear. This is a complete overhaul of error handling.

  1. Errors are no longer a state. We simply track errors as they occur, separately from alert data.

  2. Dashboard will have a simple summary of active alerts that takes you to the error detail page.

  3. Active alerts on the dashboard will get a flame icon to indicate a current error on the specified alert (error happened more recently than a success). This does not affect your ability to close alerts if the last known state was normal.

    [Screenshot: 2015-09-04 at 3:59:58 PM]

  4. Errors must be "closed" before they will disappear from the system. The error page will list all errors that occurred, and the number of times they occurred. Once closed, they are forgotten entirely.

  5. Errors are coalesced into sets by message. A continual sequence of checks that result in the same error will coalesce into a single line item with a count of events. If a successful check completes, no further errors will coalesce into the same line item on the error page. An entire line item is closed as a unit.

@giganteous
Contributor

Seeing the nice fire icon with alerts that turned more emergent it'd be kinda cool to have a fire-extinguisher icon on alerts that turned into recoveries?

@captncraig changed the title from "WIP: Better Error flow" to "bosun: Better Error flow" on Sep 16, 2015
@captncraig pushed a commit that referenced this pull request on Sep 16, 2015
@captncraig merged commit 1dcad28 into master on Sep 16, 2015
@captncraig deleted the errorFlow branch on September 16, 2015 21:32