Amtool implementation #636

Merged on Apr 20, 2017 (48 commits)
Changes from 27 commits

Commits
3ea3e8a
Implement alertmanager cli tool 'amtool'
Dec 31, 2016
cd0d337
Update gitignore
Jan 4, 2017
7578ffc
Implement config subcommand
Jan 6, 2017
f33bac9
Move output flag to root command. Also print config notification to s…
Jan 6, 2017
b2ab06b
Implement alert command
Jan 6, 2017
be80a7e
Revamp silence command
Jan 6, 2017
02c4ac2
Flesh out format stuff
Jan 6, 2017
ba742c8
Fix config format for extended output
Jan 6, 2017
e61d360
Allow alername default when no = specified
Jan 6, 2017
605c403
Vendor in cobra, viper, and pflag
Jan 7, 2017
fbd99d9
Fix argument ordering
Jan 9, 2017
0d534c2
added more comments:
Jan 9, 2017
89869e7
Remove unneeded copyright block
Jan 23, 2017
a0bab45
Fix capital A in Alertmanager
Jan 23, 2017
1d8dbf0
Exit 1 instead of -1
Jan 23, 2017
513312f
Remove un-used function
Jan 23, 2017
8915d2f
Fixup tests to simplified types
Jan 23, 2017
a22ac3d
Remove unneeded comment from code generation
Jan 23, 2017
d9413d0
Make alertmanager url flag consistent with alertmanager cli syntax
Jan 23, 2017
d1dec7e
Made the url validation better
Jan 23, 2017
360a631
Fixups from 1:1 session
Jan 30, 2017
2199e68
Vastly improve help commands
Feb 24, 2017
8f614b0
Add docs about config file
Feb 25, 2017
8e8f364
Add docs about how to generate man pages and bash completions
Feb 25, 2017
597a362
Merge pull request #1 from Kellel/amtool
Kellel Feb 25, 2017
c7e603a
Fix my vendors
Feb 25, 2017
f5478c6
Re-order things in tests so that they pass (things are sorted now)
Feb 25, 2017
e523382
Fix ordering of regex string formatter
Mar 3, 2017
f536f83
Fix fmt.Sprintf formatting to multiple lines so that it's easier to read
Mar 3, 2017
eaf6249
Refactor config file loading
Mar 3, 2017
fbdc789
Update docs for config file
Mar 3, 2017
0aa1b09
Small fixes to config system
Mar 6, 2017
7326c32
fixup name of VersionInfo
Mar 7, 2017
576e643
Merge changes from master
Mar 16, 2017
70316d8
Allow export of regex from label parser
Mar 21, 2017
fba0793
Merge in changes related to parser code
Mar 21, 2017
32a8973
Major refactor to use filters in #633
Apr 6, 2017
aa47c41
Add missing dependancy
Apr 6, 2017
e3caaa8
Add alert query alias
Apr 7, 2017
7d6a860
Re-implement `alertname=` prefix when the first argument isn't a matcher
Apr 11, 2017
5d0bba4
Fix bug with extra regex information leaking from cli created silences
Apr 11, 2017
ac50b07
Implement `-q` flag for only silence id output
Apr 11, 2017
507b32b
Revert "Fix bug with extra regex information leaking from cli created…
Apr 12, 2017
dabbd06
Revert change to vendored repo and apply parsing change patch
Apr 12, 2017
7f3ce6d
Make `-q` `--quiet` flags a silence global command
Apr 19, 2017
0da7098
Fixup config handling since things changed in alertmanager since initail
Apr 19, 2017
ffb6747
Remove double error notifications
Apr 19, 2017
e56573b
Merge branch 'master' into master
Kellel Apr 19, 2017
1 change: 1 addition & 0 deletions .gitignore
@@ -1,5 +1,6 @@
/data/
/alertmanager
/amtool
*.yml
*.yaml
/.build
2 changes: 2 additions & 0 deletions .promu.yml
@@ -4,6 +4,8 @@ build:
binaries:
- name: alertmanager
path: ./cmd/alertmanager
- name: amtool
path: ./cmd/amtool
flags: -a -tags netgo
ldflags: |
-X {{repoPath}}/vendor/github.com/prometheus/common/version.Version={{.Version}}
18 changes: 18 additions & 0 deletions artifacts/README.md
@@ -0,0 +1,18 @@
# Generating amtool artifacts

Amtool can generate a number of ease-of-use artifacts (a bash completion file and man pages):

go run generate_amtool_artifacts.go

## Bash completion

The bash completion file can be added to `/etc/bash_completion.d/`.

## Man pages

Man pages can be added to the man directory of your choice:

cp artifacts/*.1 /usr/local/share/man/man1/
sudo mandb

Then you should be able to view the man pages as expected.
17 changes: 17 additions & 0 deletions artifacts/generate_amtool_artifacts.go
@@ -0,0 +1,17 @@
package main

import (
"github.com/spf13/cobra/doc"

"github.com/prometheus/alertmanager/cli"
)

func main() {
cli.RootCmd.GenBashCompletionFile("amtool_completion.sh")
header := &doc.GenManHeader{
Title: "amtool",
Section: "1",
}

doc.GenManTree(cli.RootCmd, header, ".")
}
180 changes: 180 additions & 0 deletions cli/alert.go
@@ -0,0 +1,180 @@
package cli

import (
"encoding/json"
"errors"
"net/http"
"path"
"time"

"github.com/prometheus/alertmanager/cli/format"
"github.com/prometheus/alertmanager/types"
"github.com/prometheus/common/model"
"github.com/spf13/cobra"
flag "github.com/spf13/pflag"
"github.com/spf13/viper"
)

type alertmanagerAlertResponse struct {
Status string `json:"status"`
Data model.Alerts `json:"data,omitempty"`
ErrorType string `json:"errorType,omitempty"`
Error string `json:"error,omitempty"`
}

var alertFlags *flag.FlagSet

// alertCmd represents the alert command
var alertCmd = &cobra.Command{
Use: "alert",
Short: "View and search through current alerts",
Long: `View and search through current alerts.

Amtool has a simplified Prometheus query syntax, but contains robust support for
bash variable expansions. The non-option section of arguments constructs a list
of "Matcher Groups" that will be used to filter your query. The following
examples will attempt to show this behaviour in action:

amtool alert query alertname=foo node=bar

This query will match all alerts with the alertname=foo and node=bar label
value pairs set.

amtool alert query foo node=bar

If alertname is omitted and the first argument does not contain a '=' or a
'=~' then it will be assumed to be the value of the alertname pair.

amtool alert query 'alertname=~foo.*'

As well as direct equality, regex matching is also supported. The '=~' syntax
(similar to Prometheus) is used to represent a regex match. Regex matching
can be used in combination with a direct match.

amtool alert query alertname=foo node={bar,baz}
Contributor

This is a cool idea, but it's introducing an additional way to parse labels that isn't supported in prometheus (and is equivalent to node=~"(bar|baz)"). I would personally vote for maintaining the same label matching syntax as prometheus.

Contributor Author

The key here is that creating multiple silences is awkward without being able to use shell expansions. I understand that it's not a big win for searching, but I think it is a huge help when creating silences.

Contributor

That's a good point. @fabxc ?

Contributor

Mh, how do you mean creating multiple silences? In the general case, there should be a single silence matching multiple labels if necessary rather than duplicated silences.

This comment is on alert querying, which is a different thing again.

I think I need a better example of what this is trying to solve.

In general, not growing representations for the same thing is preferable of course.

Contributor Author

Let me try to make this example clearer. I could very well be overcomplicating this.

Let's say I have an alert named Http_Response_Slow and an alert named Http_Response_5xx, and I have some number of boxes reporting the metrics that compose these alerts, with their name in a node label.

If I want to silence both alerts for the boxes reporting as node=www1 and node=www2, but not for the rest of the boxes, I would have to create the following four silences.

amtool silence add alertname=Http_Response_Slow node=www1
amtool silence add alertname=Http_Response_Slow node=www2
amtool silence add alertname=Http_Response_5xx node=www1
amtool silence add alertname=Http_Response_5xx node=www2

With the ability to use the shell expansions I could make the same four silences with one command:

amtool silence add alertname=Http_Response_{Slow,5xx} node=www{1,2}

Does this make sense?

Contributor

I guess the confusion is why doing something like

amtool silence add alertname=~"Http_Response_.*" node=~"www(1|2)"

and creating a single silence to cover all four alerts isn't preferable.

Contributor Author (@Kellel, Mar 20, 2017)

I see what you are saying. I guess it depends on the use case: with a large number of nodes, specifying this regex becomes very unwieldy. I would be willing to drop this if we don't think it should be part of this tool. I can always write a bash for loop to create silences if needed.

Contributor

I think right now we're leaning towards dropping this in favor of using a regex or having the user create their own e.g. bash loop

Contributor

To expand:

I think the team tends to favor conservatively introducing new features. They're 100% reasonable and if this ends up being something that people request it'll happen, but to start out keeping it simple is the preference.

Contributor Author

Sounds reasonable to me
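
The regex alternative raised in this thread can be sketched in Go. This is only an illustration, assuming that regex matchers are fully anchored (as they are in Prometheus); the helpers `anchored` and `covered` are hypothetical and are not part of amtool. It shows that a single silence with two regex matchers covers the same four alertname/node combinations as the four separate silences, and that a looser pattern such as `Http_Response_.*` also covers alerts that were never listed.

```go
package main

import (
	"fmt"
	"regexp"
)

// anchored compiles a pattern so that it must match the whole label value.
// Assumption: regex matchers are fully anchored, as in Prometheus.
func anchored(pattern string) *regexp.Regexp {
	return regexp.MustCompile("^(?:" + pattern + ")$")
}

// covered reports whether one silence made of two regex matchers
// (one on alertname, one on node) would apply to the given label values.
func covered(alertRe, nodeRe *regexp.Regexp, alertname, node string) bool {
	return alertRe.MatchString(alertname) && nodeRe.MatchString(node)
}

func main() {
	alertRe := anchored("Http_Response_(Slow|5xx)")
	nodeRe := anchored("www(1|2)")

	alerts := []string{"Http_Response_Slow", "Http_Response_5xx", "Http_Response_Timeout"}
	nodes := []string{"www1", "www2", "www3"}
	for _, a := range alerts {
		for _, n := range nodes {
			fmt.Printf("%-22s %-5s covered=%t\n", a, n, covered(alertRe, nodeRe, a, n))
		}
	}

	// A looser pattern covers more than the two alerts named in the example.
	loose := anchored("Http_Response_.*")
	fmt.Println("loose pattern covers Http_Response_Timeout:", loose.MatchString("Http_Response_Timeout"))
}
```

The trade-off discussed above is visible here: the explicit alternation stays precise but grows with every node added, while the wildcard stays short but widens the silence.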


This query will match all alerts with the alertname=foo label value pair
and EITHER node=bar or node=baz.

amtool alert query alertname=foo{a,b} node={bar,baz}

Similar to the previous example this query will match all alerts with any
combination of alertname=fooa or alertname=foob AND node=bar or node=baz.

`,
RunE: queryAlerts,
}

func init() {
RootCmd.AddCommand(alertCmd)
alertCmd.Flags().Bool("expired", false, "Show expired alerts as well as active")
alertCmd.Flags().BoolP("silenced", "s", false, "Show silenced alerts")
alertFlags = alertCmd.Flags()
}

func fetchAlerts() (model.Alerts, error) {
alertResponse := alertmanagerAlertResponse{}

u, err := GetAlertmanagerURL()
if err != nil {
return model.Alerts{}, err
}

u.Path = path.Join(u.Path, "/api/v1/alerts")
res, err := http.Get(u.String())
if err != nil {
return model.Alerts{}, err
}

defer res.Body.Close()
decoder := json.NewDecoder(res.Body)

err = decoder.Decode(&alertResponse)
if err != nil {
return model.Alerts{}, errors.New("Unable to decode json response")
}
return alertResponse.Data, nil
}

func queryAlerts(cmd *cobra.Command, args []string) error {
alerts, err := fetchAlerts()
if err != nil {
return err
}

silences, err := fetchSilences()
if err != nil {
return err
}

var groups []types.Matchers
if len(args) > 0 {
matchers, err := parseMatchers(args, RESOLVE_FUZZY)
if err != nil {
return err
}
groups = parseMatcherGroups(matchers)
if err != nil {
return err
}
}

expired, err := alertFlags.GetBool("expired")
if err != nil {
return err
}

showSilenced, err := alertFlags.GetBool("silenced")
if err != nil {
return err
}

displayAlerts := model.Alerts{}
for _, alert := range alerts {
// If we are only returning current alerts and this one has already expired skip it
if !expired {
if !alert.EndsAt.IsZero() && alert.EndsAt.Before(time.Now()) {
continue
}
}

if !showSilenced {
Contributor

This filtering is something I've also implemented in the new alertmanager-ui. Seems like we should add it as a query string param to the url and do the filtering on the initial query, instead of duplicating our work in different consumers of the api?

Contributor Author

Agreed. I kinda felt like it was a pain doing all this by hand in the cli tool

Contributor

The functionality is already available here: https://github.com/prometheus/alertmanager/blob/master/dispatch/dispatch.go#L132-L134

All that needs to be done is parsing a ?silenced=true to show silenced alerts, and by default don't return them in the response.
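
On the client side, the suggestion in this thread could look roughly like the sketch below. It assumes the server accepts a `silenced=true` query parameter, which is only a proposal at this point in the review; the function name `alertsURL` and the base URL are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// alertsURL builds the alerts endpoint URL from a base Alertmanager URL.
// When showSilenced is true it adds a "silenced=true" query parameter.
// Assumption: "silenced" is the parameter proposed in the review thread,
// not a confirmed part of the API.
func alertsURL(base string, showSilenced bool) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = path.Join(u.Path, "/api/v1/alerts")
	if showSilenced {
		q := u.Query()
		q.Set("silenced", "true")
		u.RawQuery = q.Encode()
	}
	return u.String(), nil
}

func main() {
	s, err := alertsURL("http://localhost:9093", true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```

With server-side filtering in place, the client would no longer need to fetch silences and call `Mutes` for every alert.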

// If any silence mutes this alert don't show it
silenced := false
for _, silence := range silences {
// Need to call Init before Mutes
err = silence.Init()
if err != nil {
return err
}

if silence.Mutes(alert.Labels) {
silenced = true
break
}
}
if silenced {
continue
}
}

// If the user hasn't specified any match groups then let it through
if len(groups) < 1 {
displayAlerts = append(displayAlerts, alert)
continue
}

for _, matchers := range groups {
if matchers.Match(alert.Labels) {
displayAlerts = append(displayAlerts, alert)
break
}
}
}

formatter, found := format.Formatters[viper.GetString("output")]
if !found {
return errors.New("Unknown output formatter")
}
return formatter.FormatAlerts(displayAlerts)
}
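
The help text in this file says that a first argument without a `=` or `=~` is treated as the value of the `alertname` label. A minimal sketch of that rule follows; it is not the PR's `parseMatchers` implementation, and the helper name `withDefaultAlertname` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// withDefaultAlertname returns a copy of args in which the first argument is
// prefixed with "alertname=" when it is not already a matcher. Checking for
// '=' is enough, because a "=~" matcher also contains '='.
func withDefaultAlertname(args []string) []string {
	if len(args) == 0 || strings.Contains(args[0], "=") {
		return args
	}
	out := make([]string, len(args))
	copy(out, args)
	out[0] = "alertname=" + out[0]
	return out
}

func main() {
	fmt.Println(withDefaultAlertname([]string{"foo", "node=bar"}))
	fmt.Println(withDefaultAlertname([]string{"alertname=~foo.*", "node=bar"}))
}
```

Only the first argument is rewritten, so a bare word in a later position is left for the matcher parser to reject.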
86 changes: 86 additions & 0 deletions cli/config.go
@@ -0,0 +1,86 @@
package cli

import (
"encoding/json"
"errors"
"net/http"
"path"
"time"

"github.com/prometheus/alertmanager/cli/format"
"github.com/prometheus/alertmanager/config"
"github.com/spf13/cobra"
//flag "github.com/spf13/pflag"
Contributor

commented out import

"github.com/spf13/viper"
)

// Config is the response type of the Alertmanager config endpoint.
// Duplicated in cli/format; needs to be moved to common/model.
type Config struct {
Config string `json:"config"`
ConfigJSON config.Config `json:"configJSON"`
VersionINFO map[string]string `json:"versionInfo"`
Contributor

I think the go naming convention for non-acronyms would have this be VersionInfo

Uptime time.Time `json:"uptime"`
}

type alertmanagerStatusResponse struct {
Status string `json:"status"`
Data Config `json:"data,omitempty"`
ErrorType string `json:"errorType,omitempty"`
Error string `json:"error,omitempty"`
}

// configCmd represents the config command
var configCmd = &cobra.Command{
Use: "config",
Short: "View the running config",
Long: `View current config

The amount of output is controlled by the output selection flag:
- Simple: Print just the running config
- Extended: Print the running config as well as uptime and all version info
- Json: Print entire config object as json`,
RunE: queryConfig,
}

func init() {
RootCmd.AddCommand(configCmd)
}

func fetchConfig() (Config, error) {
configResponse := alertmanagerStatusResponse{}
u, err := GetAlertmanagerURL()
if err != nil {
return Config{}, err
}

u.Path = path.Join(u.Path, "/api/v1/status")
res, err := http.Get(u.String())
if err != nil {
return Config{}, err
}

defer res.Body.Close()
decoder := json.NewDecoder(res.Body)

err = decoder.Decode(&configResponse)
if err != nil {
return Config{}, err
}

return configResponse.Data, nil
}

func queryConfig(cmd *cobra.Command, args []string) error {
config, err := fetchConfig()
if err != nil {
return err
}

formatter, found := format.Formatters[viper.GetString("output")]
if !found {
return errors.New("Unknown output formatter")
}

return formatter.FormatConfig(format.Config(config))
}
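
The `json` struct tags on `Config` only take effect when the tag value is quoted. An unquoted form such as `json:configJSON` is not parsed as a key/value pair, so `encoding/json` falls back to the Go field name. The sketch below uses `reflect.StructTag` to show how each form is parsed; the helper `jsonKey` is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// jsonKey returns the value stored under the "json" key of a raw struct tag
// and whether the key could be found at all.
func jsonKey(rawTag string) (string, bool) {
	return reflect.StructTag(rawTag).Lookup("json")
}

func main() {
	for _, tag := range []string{`json:"configJSON"`, `json:configJSON`} {
		value, found := jsonKey(tag)
		fmt.Printf("%-20s value=%q found=%t\n", tag, value, found)
	}
}
```

Decoding still tends to work with the unquoted form because field matching is case-insensitive, but encoding would emit `ConfigJSON` rather than `configJSON`.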
38 changes: 38 additions & 0 deletions cli/format/format.go
@@ -0,0 +1,38 @@
package format

import (
"io"
"time"

"github.com/prometheus/alertmanager/config"
"github.com/prometheus/alertmanager/types"
"github.com/prometheus/common/model"
"github.com/spf13/viper"
)

const DefaultDateFormat = "2006-01-02 15:04:05 MST"

// Config representation.
// This needs to be moved to the prometheus/common/model repo; having it duplicated here is a code smell.
type Config struct {
Config string `json:"config"`
ConfigJSON config.Config `json:"configJSON"`
VersionINFO map[string]string `json:"versionInfo"`
Uptime time.Time `json:"uptime"`
}

// Formatter needs to be implemented for each new output formatter
type Formatter interface {
SetOutput(io.Writer)
FormatSilences([]types.Silence) error
FormatAlerts(model.Alerts) error
FormatConfig(Config) error
}

// Formatters is a map of cli argument name to formatter interface object
var Formatters map[string]Formatter = map[string]Formatter{}

func FormatDate(input time.Time) string {
dateformat := viper.GetString("date.format")
return input.Format(dateformat)
}
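
The registry pattern behind `Formatters` can be sketched as follows. This is a reduced illustration rather than the PR's code: the interface here has a single hypothetical `FormatMessage` method instead of the silence, alert and config methods, and registering the formatter from an `init` function is an assumption about how implementations add themselves to the map.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
)

// Formatter is a reduced, hypothetical version of the interface in this file.
type Formatter interface {
	SetOutput(io.Writer)
	FormatMessage(string) error
}

// simpleFormatter writes each message on its own line.
type simpleFormatter struct{ w io.Writer }

func (f *simpleFormatter) SetOutput(w io.Writer) { f.w = w }

func (f *simpleFormatter) FormatMessage(msg string) error {
	_, err := fmt.Fprintln(f.w, msg)
	return err
}

// formatters maps the value of the output flag to a formatter.
var formatters = map[string]Formatter{}

// Assumption: each formatter registers itself under its flag value at init time.
func init() {
	formatters["simple"] = &simpleFormatter{w: os.Stdout}
}

// lookup resolves an output name and reports unknown names as an error.
func lookup(name string) (Formatter, error) {
	f, found := formatters[name]
	if !found {
		return nil, errors.New("unknown output formatter: " + name)
	}
	return f, nil
}

func main() {
	f, err := lookup("simple")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	f.SetOutput(os.Stdout)
	if err := f.FormatMessage("hello from the simple formatter"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	if _, err := lookup("yaml"); err != nil {
		fmt.Println("lookup failed:", err)
	}
}
```

Keeping the lookup in one helper would avoid repeating the "unknown output formatter" check in each subcommand.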