Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Amtool implementation #636

Merged
merged 48 commits into from
Apr 20, 2017
Merged

Amtool implementation #636

merged 48 commits into from
Apr 20, 2017

Conversation

Kellel
Copy link
Contributor

@Kellel Kellel commented Feb 25, 2017

As discussed in #567

Kellen Fox and others added 25 commits December 31, 2016 15:58
The primary goal of an alertmanager tool is to provide a cli interface
for the prometheus alertmanager.

My vision for this tool has two parts:
  - Silence management (query, add, delete)

  - Alert management (query, maybe more in future?)

Resolves: prometheus#567
Error messages are maybe better?
 * also allow for man page generation
 * update vendors to include cobra/doc
 * split silence commands into multiple files
 * move some ommon gits into utils.go
Amtool initial implementation
@Kellel
Copy link
Contributor Author

Kellel commented Feb 25, 2017

Oops. Looks like I have some vendoring issues

@beorn7
Copy link
Member

beorn7 commented Feb 27, 2017

Looks quite cool. But also very involved.

Before we dive into code-level review, it would be great if @fabxc and/or @brancz could comment if this is fundamentally sane and meeting their expectations. @stuartnelson3 might also have an opinion as he is getting into the AM API quite a bit recently.

cli/utils.go Outdated
}

// Parse a list of labels (cli arguments)
func parseMatchers(labels []string, resolve int) (types.Matchers, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is re-implementing functionality that already exists in PromQL, which seems like another vote for moving PromQL to a standalone package as mentioned in this PR #633

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I looked at this, but it didn't immediately strike me as a fitting what I was trying to do. I'll have to take another look.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my hope to #633 merged soon, so then the tool could accept a promql label matcher string ({foo="bar", baz=~"qu.*"}) and pass that on to the api.

this would be changing your api, however, so it's worth a discussion, and looping in @fabxc to see what he thinks. It looks like this might be expanding into generating a swagger api. Let's drop the discussion down to the main flow of the PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other issue I had was that I needed to filter silences. Will that be covered in #633 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filtering silences is available in that pr, in 0954959

I'm going to try to get that PR pushed through, then it should just be a few easy changes here and we can get this merged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Looking at that.

Copy link
Contributor Author

@Kellel Kellel Mar 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick question.
I implemented a feature where if a user doing a query (silence or alert) does something like the following:
amtool alert query Host

With no matcher name specified then I attempt to match the alertname using a prefix match. Is this something we want to support? Moving to promql should do everything I want, but i'm curious if I should spend the time to re-implement this functionality?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, @fabxc ?


buf := bytes.NewBuffer([]byte{})
enc := json.NewEncoder(buf)
err = enc.Encode(silence)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

json.NewEncoder(buf).Encode(silence)

The encode and decode stuff doesn't get re-used, so no need to assign it to a var.

decoder := json.NewDecoder(res.Body)

response := addResponse{}
err = decoder.Decode(&response)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

response := addResponse{}
err = decoder.Decode(&response)
if err != nil {
return errors.New(fmt.Sprintf("Unable to parse silence json response from %s", u.String()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errors.New(fmt.Sprintf("..", thing1, thing2)) == fmt.Errorf("..", thing1, thing2)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL

@Kellel
Copy link
Contributor Author

Kellel commented Apr 3, 2017

Sorry for not having an update in quite a few days. Been busy with other things at work.

Kellen Fox added 2 commits April 6, 2017 11:30
 * refactor alert fetch to pull filters
 * discard all multiple query logic in favor of simpler regex
 * discard all matching logic
 * cleanup a number of small things that made the code odd
 * relax the regex in pkg/parse/parse.go to allow no quote characters
@stuartnelson3
Copy link
Contributor

stuartnelson3 commented Apr 7, 2017

Running the top level amtool alert is returning all alerts, but if I try a regex matcher I get an error:

# ./amtool alert query 'alertname=~MySQL.*'
Error: [bad_data] bad matcher format
Usage:
  amtool alert [flags]

Flags:
      --expired    Show expired alerts as well as active
  -s, --silenced   Show silenced alerts

Global Flags:
      --alertmanager.url string   Alertmanager to talk to
      --config string             config file (default is $HOME/.amtool.yml)
  -o, --output string             Output formatter (simple, extended, json) (default "simple")
  -v, --verbose                   Verbose running information

[bad_data] bad matcher format

The documentation seems to say this is valid though:

amtool alert query 'alertname=~foo.*'

Actually, it seems like every query I'm running is resulting in a bad matcher error?

# ./amtool alert query alertname=NodeHasMemoryErrors
Error: [bad_data] bad matcher format

Same deal with silences: Showing all silences works, but when I try to query for them it says bad matcher.

@Kellel
Copy link
Contributor Author

Kellel commented Apr 7, 2017

Looks like I forgot to register the query alias for the alert command. 🤣

Silence seems to be working in my testing environment. Did you rebuild your alertmanager and docker container?

@stuartnelson3
Copy link
Contributor

Ahhhh yeah I didn't rebuild the actual alertmanager binary :)

This is super cool!

Querying seems to be working, except for the bare alertname case:

# ./amtool --alertmanager.url $url alert query alertname=K8sCPURequestTooLow
Alertname            Starts At                Summary                 
K8sCPURequestTooLow  2017-04-10 06:27:06 UTC  Pod CPU use is too low 

# ./amtool --alertmanager.url $url alert query K8sCPURequestTooLow
Error: [bad_data] bad matcher format
Usage:
  amtool alert query [flags]

Flags:
      --expired    Show expired alerts as well as active
  -s, --silenced   Show silenced alerts

Global Flags:
      --alertmanager.url string   Alertmanager to talk to
      --config string             config file (default is $HOME/.amtool.yml)
  -o, --output string             Output formatter (simple, extended, json) (default "simple")
  -v, --verbose                   Verbose running information

[bad_data] bad matcher format

Other than that this looks pretty cool to me.

One thing that would be nice is also filtering by receiver as a sort of special key-value pair, but I think we can wait on that because it's also a desired feature in the UI, so doing that via a query param in the API is probably what will happen.

@Kellel
Copy link
Contributor Author

Kellel commented Apr 10, 2017

Oh I don't think I re-implemented the bare alertname functionality now that I'm not parsing any of the arguments.

Should be reasonably simple to re-add

@stuartnelson3
Copy link
Contributor

seems to also be an error when using the config subcommand:

# ./amtool config -v
Using config file: /Users/stuartnelson/.amtool.yml
Error: json: cannot unmarshal string into Go struct field Route.match_re of type config.Regexp

expiring an already expired silence:

# ./amtool silence expire a6c18c6b-9cb9-4826-a657-f054bef0a121
Error: end time must not be before start time
Usage:
  amtool silence expire [flags]

Global Flags:
      --alertmanager.url string   Alertmanager to talk to
      --config string             config file (default is $HOME/.amtool.yml)
  -o, --output string             Output formatter (simple, extended, json) (default "simple")
  -v, --verbose                   Verbose running information

end time must not be before start time

it also seems that any error output is duplicated? I noticed this before but forgot to mention it.

One idea that I had while interacting with this:

It's awesome that I can e.g.:

# ./amtool silence expire $uuid1 $uuid2 $uuid3

But this requires me to do some bashery to grab the IDs:

# ./amtool silence expire (./amtool silence query 'alertname=~TestAmTool.*' | awk 'NR>1{print $1}')

Taking a cue from the docker cli, what do you think about adding a -q flag to show just IDs? So this would then collapse into

# ./amtool silence query foo=bar -q
c0c6cb44-cac5-4c26-a797-a7f688009890
c776bc72-7d88-438b-bed7-0485d1ab975b
8fa7a0e6-5723-4c9f-892e-8b8d99a93eb3
1795274c-555e-46dd-b861-da5da459bc22
76441da9-f583-4188-a7f7-e674f24092e0
6c4f4425-62f3-487a-bd83-92462662895b
d31b41fe-c1da-47d7-8dba-2b33438b87ef
e91409e4-e356-43c7-bd2f-422234d5b15a
82af6174-f098-47a9-b4e3-c6700b1ff5e7
0ce9c4c3-5001-4cf2-8d7d-67702a02a0d9
a368b6ce-5e27-4191-a315-18c649e3cce5

Seems to be -q and --quiet. What do you think about this?

@stuartnelson3
Copy link
Contributor

There's also an inconsistency between the UI and the cli tool when making a silence with a regex:

# ./amtool silence add 'alertname=~TestAmTool.*' --comment='test out creation'
dfbacf0a-a4ed-4bf2-80d5-26dad57f8b65

The created regex ends up being set as the name:
image

But that's only supposed to happen internally when creating the regex. Creating the same thing through the ui results in alertname=~"TestAmTool.*" without the starting and ending anchors.

@Kellel
Copy link
Contributor Author

Kellel commented Apr 11, 2017

I think the -q --quiet would be a great addition.

After a little bit of debugging it seems that the "(?:TestAmTool.*)$" bug seems to be coming directly from the way that pkg/parse.go works I was hoping to use that code to do the parsing of matchers.

@@ -46,8 +46,7 @@ func NewMatcher(t MatchType, n, v string) (*Matcher, error) {
Value: v,
}
if t == MatchRegexp || t == MatchNotRegexp {
m.Value = "^(?:" + v + ")$"
re, err := regexp.Compile(m.Value)
re, err := regexp.Compile("^(?:" + m.Value + ")$")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the cause of the regex information showing up in the ui since I'm using the internal parser to parse silence matchers.

Is this change acceptable? All tests are still passing

Copy link
Contributor

@stuartnelson3 stuartnelson3 Apr 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to diverge from vendored packages.

I went through the code and made a couple changes, but I'll leave it to you to apply them to your branch.

First, reverting the change to the vendored package:

git revert 5d0bba4b3b3e682b4eae10381ce96660d5c3979a

Then here's a patch file that you can apply. Read through it to see what it does, but basically it separates out the actual parsing code in pkg/parse so that there's an exported function for parsing the raw input string, which the CLI then uses to send to the API.

parse_regex.txt

Note: Apparently github won't let me upload a .patch file, so it says text, but you should be able to apply it all the same.

Edit:
In case you have to google it (which I always do 🤷‍♂️ ), I'll save you the 30sec:

git apply parse_regex.patch

@Kellel
Copy link
Contributor Author

Kellel commented Apr 17, 2017

What are our next steps to get this merged?

@stuartnelson3
Copy link
Contributor

There are a few things that need to be fixed:

  1. Config output is failing to marshal my config:
# ./amtool config -v
Using config file: /Users/stuartnelson/.amtool.yml
Error: json: cannot unmarshal string into Go struct field Route.match_re of type config.Regexp
Usage:
  amtool config [flags]

Global Flags:
      --alertmanager.url string   Alertmanager to talk to
      --config string             config file (default is $HOME/.amtool.yml)
  -o, --output string             Output formatter (simple, extended, json) (default "simple")
  -v, --verbose                   Verbose running information

json: cannot unmarshal string into Go struct field Route.match_re of type config.Regexp
# cat ~/.amtool.yml 
alertmanager.url: http://redacted.net:9093
author: stn@soundcloud.com
comment_required: true
output: simple
  1. Output errors are printed twice (see above)
  2. Can you move the --quiet flag to the top-level silence subcommand? So regardless of if I'm querying or looking at all silences, this will output the ids (currently it's just working for query)

After that this should be ready to merge. Thanks for you patience!

@stuartnelson3
Copy link
Contributor

Awesome. I'm going to go ahead and merge this because it's been so long in a branch. There's an issue I would like to address still, but I'll put that in issue #721

@stuartnelson3 stuartnelson3 merged commit 3aab66e into prometheus:master Apr 20, 2017
@mjtrangoni mjtrangoni mentioned this pull request Apr 26, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants