Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Alerta event property configurable #461

Merged
merged 2 commits into from
Apr 25, 2016
Merged

Make Alerta event property configurable #461

merged 2 commits into from
Apr 25, 2016

Conversation

satterly
Copy link
Contributor

@satterly satterly commented Apr 13, 2016

Make the Alerta event property configurable to ensure de-duplication and event correlation work as expected.

  • CHANGELOG.md updated
  • Rebased/mergable
  • Tests pass
  • Sign CLA (if not already signed)

@satterly satterly changed the title Alerta event property Make Alerta event property configurable Apr 13, 2016
@nathanielc
Copy link
Contributor

So I am a little confused. Kapacitor uses its ID field to correlate different alerts. Whats the value in allowing the event field to be defined separately from the ID?

As I understand the Alerta docs if I send these requests to Alerta:

event=kapacitor/host/example.com,resource=ping,severity=warning
event=kapacitor/host/example.com,resource=ping,severity=warning
event=kapacitor/host/example.com,resource=ping,severity=critical
event=kapacitor/host/example.com,resource=ping,severity=ok

The second warnings is de-duped and dropped.
Then the next critical request correlates with the existing warning and updates it to a critical.
Finally the last request clears the alert with OK.

If either event of resource were different than they would be separate alerts entirely.

Correct?

@satterly
Copy link
Contributor Author

satterly commented Apr 16, 2016

Your understanding of how Alerta de-duplicates and correlates is spot on and yes, the integration between Kapacitor and Alerta could be made to work this way. However, it would be quite limiting.

From your example above, it would make much more sense for the resource to be "kapacitor/host/example.com" and the event to be "ping" but that isn't possible because event is hardcoded to the Kapacitor ID.

Think of the combination of event-resource as the equivalent of the Kapitor ID so for the integration with Alerta to be fully realised you need to split the Kapacitor ID somewhere and assign one part to event and the other part to resource. That it is possible to assign values to the resource field but not possible to assign values to the event field seems wrong because they are equally as important to Alerta.

It probably makes sense to default the event to Kapacitor ID instead of TaskName though.

@nathanielc
Copy link
Contributor

@satterly Currently the event is the ID property and resource is configured independently. So both are configurable like so:

|alert()
    .id('custom event {{ index .Tags "key"}}')
    .resource('custom resource {{ index .Tags "someotherkey"}}')
    .alerta()

Would that not work? Is it just a mater of better documentation?

@satterly
Copy link
Contributor Author

Wouldn't that mean that alert generation in Kapacitor wouldn't work properly because someone wanting to integrate with Alerta would only want to put event name in the Kap ID so it would end up only generating one alert per alert definition for many different resources.

That is, the Kap ID needs to make reference to a resource doesn't it?

@nathanielc
Copy link
Contributor

@satterly You are right. It would break Kapacitor's concept of an ID if they were also using another alerting system in addition to Alerta. Probably not a common use case but it would definitely be an obscure/confusing behavior.

OK, so lets make this change as it make sense now. Just a few things...

  1. Like you said default event to the ID
  2. Provide only the idInfo struct when executing the event template so the user cannot create an event with data that changes depending on the state of the alert.

Thanks!

@satterly
Copy link
Contributor Author

@nathanielc will do. And thanks for the feedback. The discussion was very useful in clarifying the approach.

@@ -0,0 +1,22 @@
<?xml version="1.0" encoding="UTF-8"?>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove these .idea files?

@nathanielc
Copy link
Contributor

Here is the build failure output:

--- FAIL: TestStream_AlertAlerta (0.01s)
    streamer_test.go:3536: digraph TestStream_Alert {
        stream0 -> from1;
        from1 -> window2;
        window2 -> count3;
        count3 -> alert4;
        }
    streamer_test.go:2532: unexpected url got /alert?api-key=anothertesttoken exp /alert?api-key=testtoken1234567
    streamer_test.go:2535: unexpected resource got serverA exp cpu
    streamer_test.go:2538: unexpected event got event: TestStream_Alert exp TestStream_Alert
    streamer_test.go:2541: unexpected environment got Development exp Production
    streamer_test.go:2544: unexpected group got serverA exp host=serverA,
    streamer_test.go:2547: unexpected value got 10 exp 
    streamer_test.go:2550: unexpected service got [serviceA serviceB] exp [cpu]
    streamer_test.go:2553: unexpected origin got override exp Kapacitor
    streamer_test.go:2582: unexpected event got event: TestStream_Alert exp TestStream_Alert
    streamer_test.go:2635: unexpected requestCount got 1 exp 2

Have you been able to get the tests to run locally? You can use this comand to run only the Alerta tests.

go test -run TestStream_AlertAlerta -v ./integrations

@satterly
Copy link
Contributor Author

satterly commented Apr 22, 2016

When I run the tests I get ...

E! failed to evaluate Alerta Event template {{ .ID }}

This would be expected because we only provide the idInfo struct to the event template and the idInfo struct doesn't seem to have access to .ID. Confusingly we also made .ID the default.

@@ -907,6 +913,14 @@ func (a *AlertNode) handleAlerta(alerta alertaHandler, ad *AlertData) {
resource := buf.String()
buf.Reset()

err = alerta.eventTmpl.Execute(&buf, ad.info)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right here you can create a new type like:

type eventData struct {
      idInfo
      ID string
}
data := eventData{
     idInfo: ad.info.messageInfo.idInfo,
     ID: ad.ID,
}
err = alerta.eventTmpl.Execute(&buf, data)

This way on the ID Info data can be used in addition to the ID itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Updated.

This is so that the user cannot create an event with data that
changes depending on the state of the alert.
@nathanielc
Copy link
Contributor

@satterly Looks great, Thanks!

@nathanielc nathanielc merged commit f1a188b into influxdata:master Apr 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants