Amtool implementation #636
Changes from 27 commits
3ea3e8a
cd0d337
7578ffc
f33bac9
b2ab06b
be80a7e
02c4ac2
ba742c8
e61d360
605c403
fbd99d9
0d534c2
89869e7
a0bab45
1d8dbf0
513312f
8915d2f
a22ac3d
d9413d0
d1dec7e
360a631
2199e68
8f614b0
8e8f364
597a362
c7e603a
f5478c6
e523382
f536f83
eaf6249
fbdc789
0aa1b09
7326c32
576e643
70316d8
fba0793
32a8973
aa47c41
e3caaa8
7d6a860
5d0bba4
ac50b07
507b32b
dabbd06
7f3ce6d
0da7098
ffb6747
e56573b
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,6 @@ | ||
/data/ | ||
/alertmanager | ||
/amtool | ||
*.yml | ||
*.yaml | ||
/.build | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
# Generating amtool artifacts | ||
|
||
Amtool can generate a number of ease-of-use artifacts. | ||
|
||
go run generate_amtool_artifacts.go | ||
|
||
## Bash completion | ||
|
||
The bash completion file can be added to `/etc/bash_completion.d/`. | ||
|
||
## Man pages | ||
|
||
Man pages can be added to the man directory of your choice | ||
|
||
cp artifacts/*.1 /usr/local/share/man/man1/ | ||
sudo mandb | ||
|
||
Then you should be able to view the man pages as expected. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
package main | ||
|
||
import ( | ||
"github.com/spf13/cobra/doc" | ||
|
||
"github.com/prometheus/alertmanager/cli" | ||
) | ||
|
||
func main() { | ||
cli.RootCmd.GenBashCompletionFile("amtool_completion.sh") | ||
header := &doc.GenManHeader{ | ||
Title: "amtool", | ||
Section: "1", | ||
} | ||
|
||
doc.GenManTree(cli.RootCmd, header, ".") | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,180 @@ | ||
package cli | ||
|
||
import ( | ||
"encoding/json" | ||
"errors" | ||
"net/http" | ||
"path" | ||
"time" | ||
|
||
"github.com/prometheus/alertmanager/cli/format" | ||
"github.com/prometheus/alertmanager/types" | ||
"github.com/prometheus/common/model" | ||
"github.com/spf13/cobra" | ||
flag "github.com/spf13/pflag" | ||
"github.com/spf13/viper" | ||
) | ||
|
||
type alertmanagerAlertResponse struct { | ||
Status string `json:"status"` | ||
Data model.Alerts `json:"data,omitempty"` | ||
ErrorType string `json:"errorType,omitempty"` | ||
Error string `json:"error,omitempty"` | ||
} | ||
|
||
var alertFlags *flag.FlagSet | ||
|
||
// alertCmd represents the alert command | ||
var alertCmd = &cobra.Command{ | ||
Use: "alert", | ||
Short: "View and search through current alerts", | ||
Long: `View and search through current alerts. | ||
|
||
Amtool has a simplified prometheus query syntax, but contains robust support for | ||
bash variable expansions. The non-option section of arguments constructs a list | ||
of "Matcher Groups" that will be used to filter your query. The following | ||
examples will attempt to show this behaviour in action: | ||
|
||
amtool alert query alertname=foo node=bar | ||
|
||
This query will match all alerts with the alertname=foo and node=bar label | ||
value pairs set. | ||
|
||
amtool alert query foo node=bar | ||
|
||
If alertname is omitted and the first argument does not contain a '=' or a | ||
'=~' then it will be assumed to be the value of the alertname pair. | ||
|
||
amtool alert query 'alertname=~foo.*' | ||
|
||
As well as direct equality, regex matching is also supported. The '=~' syntax | ||
(similar to prometheus) is used to represent a regex match. Regex matching | ||
can be used in combination with a direct match. | ||
|
||
amtool alert query alertname=foo node={bar,baz} | ||
|
||
This query will match all alerts with the alertname=foo label value pair | ||
and EITHER node=bar or node=baz. | ||
|
||
amtool alert query alertname=foo{a,b} node={bar,baz} | ||
|
||
Similar to the previous example this query will match all alerts with any | ||
combination of alertname=fooa or alertname=foob AND node=bar or node=baz. | ||
|
||
`, | ||
RunE: queryAlerts, | ||
} | ||
|
||
func init() { | ||
RootCmd.AddCommand(alertCmd) | ||
alertCmd.Flags().Bool("expired", false, "Show expired alerts as well as active") | ||
alertCmd.Flags().BoolP("silenced", "s", false, "Show silenced alerts") | ||
alertFlags = alertCmd.Flags() | ||
} | ||
|
||
func fetchAlerts() (model.Alerts, error) { | ||
alertResponse := alertmanagerAlertResponse{} | ||
|
||
u, err := GetAlertmanagerURL() | ||
if err != nil { | ||
return model.Alerts{}, err | ||
} | ||
|
||
u.Path = path.Join(u.Path, "/api/v1/alerts") | ||
res, err := http.Get(u.String()) | ||
if err != nil { | ||
return model.Alerts{}, err | ||
} | ||
|
||
defer res.Body.Close() | ||
decoder := json.NewDecoder(res.Body) | ||
|
||
err = decoder.Decode(&alertResponse) | ||
if err != nil { | ||
return model.Alerts{}, errors.New("Unable to decode json response") | ||
} | ||
return alertResponse.Data, nil | ||
} | ||
|
||
func queryAlerts(cmd *cobra.Command, args []string) error { | ||
alerts, err := fetchAlerts() | ||
if err != nil { | ||
return err | ||
} | ||
|
||
silences, err := fetchSilences() | ||
if err != nil { | ||
return err | ||
} | ||
|
||
var groups []types.Matchers | ||
if len(args) > 0 { | ||
matchers, err := parseMatchers(args, RESOLVE_FUZZY) | ||
if err != nil { | ||
return err | ||
} | ||
groups = parseMatcherGroups(matchers) | ||
} | ||
|
||
expired, err := alertFlags.GetBool("expired") | ||
if err != nil { | ||
return err | ||
} | ||
|
||
showSilenced, err := alertFlags.GetBool("silenced") | ||
if err != nil { | ||
return err | ||
} | ||
|
||
displayAlerts := model.Alerts{} | ||
for _, alert := range alerts { | ||
// If we are only returning current alerts and this one has already expired skip it | ||
if !expired { | ||
if !alert.EndsAt.IsZero() && alert.EndsAt.Before(time.Now()) { | ||
continue | ||
} | ||
} | ||
|
||
if !showSilenced { | ||
This filtering is something I've also implemented in the new alertmanager-ui. Seems like we should add it as a query string param to the url and do the filtering on the initial query, instead of duplicating our work in different consumers of the api?
Agreed. I kinda felt like it was a pain doing all this by hand in the cli tool
The functionality is already available here: https://github.com/prometheus/alertmanager/blob/master/dispatch/dispatch.go#L132-L134 All that needs to be done is parsing a |
||
// If any silence mutes this alert don't show it | ||
silenced := false | ||
for _, silence := range silences { | ||
// Need to call Init before Mutes | ||
err = silence.Init() | ||
if err != nil { | ||
return err | ||
} | ||
|
||
if silence.Mutes(alert.Labels) { | ||
silenced = true | ||
break | ||
} | ||
} | ||
if silenced { | ||
continue | ||
} | ||
} | ||
|
||
// If the user hasn't specified any matcher groups then let it through | ||
if len(groups) < 1 { | ||
displayAlerts = append(displayAlerts, alert) | ||
continue | ||
} | ||
|
||
for _, matchers := range groups { | ||
if matchers.Match(alert.Labels) { | ||
displayAlerts = append(displayAlerts, alert) | ||
break | ||
} | ||
} | ||
} | ||
|
||
formatter, found := format.Formatters[viper.GetString("output")] | ||
if !found { | ||
return errors.New("Unknown output formatter") | ||
} | ||
return formatter.FormatAlerts(displayAlerts) | ||
} |
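The review comments on the silence filtering suggest moving it behind a query string parameter so that each API consumer does not have to repeat it. A minimal sketch of how the client side might build such a request follows. The `silenced` parameter name and the `alertsURL` helper are assumptions made for illustration; nothing in this PR confirms what the server would accept.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strconv"
)

// alertsURL builds the alerts endpoint URL and asks the server to apply the
// silence filter. The "silenced" parameter name is hypothetical.
func alertsURL(base string, showSilenced bool) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = path.Join(u.Path, "/api/v1/alerts")

	q := u.Query()
	q.Set("silenced", strconv.FormatBool(showSilenced))
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	s, err := alertsURL("http://localhost:9093", false)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

With something like this in place, the client-side loop over silences could shrink to the matcher-group check alone.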
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,86 @@ | ||
package cli | ||
|
||
import ( | ||
"encoding/json" | ||
"errors" | ||
"net/http" | ||
"path" | ||
"time" | ||
|
||
"github.com/prometheus/alertmanager/cli/format" | ||
"github.com/prometheus/alertmanager/config" | ||
"github.com/spf13/cobra" | ||
//flag "github.com/spf13/pflag" | ||
commented out import |
||
"github.com/spf13/viper" | ||
) | ||
|
||
// Config is the response type of alertmanager config endpoint | ||
// Duped in cli/format needs to be moved to common/model | ||
type Config struct { | ||
Config string `json:"config"` | ||
ConfigJSON config.Config `json:"configJSON"` | ||
VersionINFO map[string]string `json:"versionInfo"` | ||
I think the go naming convention for non-acronyms would have this be |
||
Uptime time.Time `json:"uptime"` | ||
} | ||
|
||
type alertmanagerStatusResponse struct { | ||
Status string `json:"status"` | ||
Data Config `json:"data,omitempty"` | ||
ErrorType string `json:"errorType,omitempty"` | ||
Error string `json:"error,omitempty"` | ||
} | ||
|
||
// configCmd represents the config command | ||
var configCmd = &cobra.Command{ | ||
Use: "config", | ||
Short: "View the running config", | ||
Long: `View current config | ||
|
||
The amount of output is controlled by the output selection flag: | ||
- Simple: Print just the running config | ||
- Extended: Print the running config as well as uptime and all version info | ||
- Json: Print entire config object as json`, | ||
RunE: queryConfig, | ||
} | ||
|
||
func init() { | ||
RootCmd.AddCommand(configCmd) | ||
} | ||
|
||
func fetchConfig() (Config, error) { | ||
configResponse := alertmanagerStatusResponse{} | ||
u, err := GetAlertmanagerURL() | ||
if err != nil { | ||
return Config{}, err | ||
} | ||
|
||
u.Path = path.Join(u.Path, "/api/v1/status") | ||
res, err := http.Get(u.String()) | ||
if err != nil { | ||
return Config{}, err | ||
} | ||
|
||
defer res.Body.Close() | ||
decoder := json.NewDecoder(res.Body) | ||
|
||
err = decoder.Decode(&configResponse) | ||
if err != nil { | ||
return Config{}, err | ||
} | ||
|
||
return configResponse.Data, nil | ||
} | ||
|
||
func queryConfig(cmd *cobra.Command, args []string) error { | ||
config, err := fetchConfig() | ||
if err != nil { | ||
return err | ||
} | ||
|
||
formatter, found := format.Formatters[viper.GetString("output")] | ||
if !found { | ||
return errors.New("Unknown output formatter") | ||
} | ||
|
||
return formatter.FormatConfig(format.Config(config)) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
package format | ||
|
||
import ( | ||
"io" | ||
"time" | ||
|
||
"github.com/prometheus/alertmanager/config" | ||
"github.com/prometheus/alertmanager/types" | ||
"github.com/prometheus/common/model" | ||
"github.com/spf13/viper" | ||
) | ||
|
||
const DefaultDateFormat = "2006-01-02 15:04:05 MST" | ||
|
||
// Config representation | ||
// Needs to be moved to the prometheus/common/model repo; having it duplicated here is smelly | ||
type Config struct { | ||
Config string `json:"config"` | ||
ConfigJSON config.Config `json:"configJSON"` | ||
VersionINFO map[string]string `json:"versionInfo"` | ||
Uptime time.Time `json:"uptime"` | ||
} | ||
|
||
// Formatter needs to be implemented for each new output formatter | ||
type Formatter interface { | ||
SetOutput(io.Writer) | ||
FormatSilences([]types.Silence) error | ||
FormatAlerts(model.Alerts) error | ||
FormatConfig(Config) error | ||
} | ||
|
||
// Formatters is a map of cli argument name to formatter interface object | ||
var Formatters map[string]Formatter = map[string]Formatter{} | ||
|
||
func FormatDate(input time.Time) string { | ||
dateformat := viper.GetString("date.format") | ||
return input.Format(dateformat) | ||
} |
This is a cool idea, but it's introducing an additional way to parse labels that isn't supported in prometheus (and is equivalent to `node=~"(bar|baz)"`). I would personally vote for maintaining the same label matching syntax as prometheus.
The key here is that creating multiple silences is awkward without being able to use shell expansions. I understand that it's not a big win for searching, but I think it is a huge help when creating silences.
That's a good point. @fabxc ?
Mh, how do you mean creating multiple silences? In the general case, there should be a single silence matching multiple labels if necessary rather than duplicated silences.
This comment is on alert querying, which is a different thing again.
I think I need a better example what's tried to be solved here.
In general, not growing representations for the same thing is preferable of course.
Let me try and make this example clearer. I very well could be over complicating this.
Let's say I have an alert named `Http_Response_Slow` and an alert named `Http_Response_5xx`, and I have some number of boxes reporting the metrics that compose this alert with their name in a `node` label.
If I want to silence both alerts for a box reporting as `node=www1` and `node=www2`, but not for the rest of the boxes, I would have to create the following four silences.
    amtool silence add alertname=Http_Response_Slow node=www1
    amtool silence add alertname=Http_Response_Slow node=www2
    amtool silence add alertname=Http_Response_5xx node=www1
    amtool silence add alertname=Http_Response_5xx node=www2
With the ability to use the shell expansions I could make the same four silences with one command:
    amtool silence add alertname=Http_Response_{Slow,5xx} node=www{1,2}
Does this make sense?
I guess the confusion is why doing something like
and creating a single silence to cover all four alerts isn't preferable.
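For comparison, a single silence with two regex matchers can cover the same four label combinations. The sketch below checks that claim with Go's `regexp` package, assuming fully anchored matching as in Prometheus. The pattern strings and the `silenced` helper are illustrative and are not taken from the thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// anchored compiles a pattern so that it must match the whole label value.
func anchored(pattern string) *regexp.Regexp {
	return regexp.MustCompile("^(?:" + pattern + ")$")
}

var (
	alertnameRE = anchored("Http_Response_(Slow|5xx)")
	nodeRE      = anchored("www(1|2)")
)

// silenced reports whether both label values match their patterns.
func silenced(alertname, node string) bool {
	return alertnameRE.MatchString(alertname) && nodeRE.MatchString(node)
}

func main() {
	for _, name := range []string{"Http_Response_Slow", "Http_Response_5xx"} {
		for _, node := range []string{"www1", "www2", "www3"} {
			fmt.Println(name, node, silenced(name, node))
		}
	}
}
```

The trade-off raised in the thread still applies: the alternation grows with the number of nodes, so the pattern becomes harder to read as the list gets longer.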
I see what you are saying. I guess it depends on the use case, with a large number of nodes specifying this regex will become very unwieldy. I would be willing to drop this if we don't think it should be part of this tool. I can always just make a bash for loop to create silences if needed.
I think right now we're leaning towards dropping this in favor of using a regex or having the user create their own e.g. bash loop
To expand:
I think the team tends to favor conservatively introducing new features. They're 100% reasonable and if this ends up being something that people request it'll happen, but to start out keeping it simple is the preference.
Sounds reasonable to me