-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[amtool] - Add a new silence import
command
#1082
Conversation
cli/silence_import.go
Outdated
for _, silence := range silences { | ||
// reset the ID, otherwise alertmanager API will try to replace an existing silence | ||
// this *might* create duplicated silences but that should generally be OK | ||
silence.ID = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we wan't to avoid duplicate silences if possible. Does the alertmanager API accept posting a new silence with the ID field present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will accept if the silence ID exists and will replace it with the new one. But if the ID doesn't exist then it will return an error, not found. We can check and only remove ID after receive "not found" error but that will create a lot of extra round trip in the backup / restore use case (as every add will have to check and wait for a not found error first).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, don't think we should have duplicated silences as well. Maybe skip the import if silence already exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is for disaster recovery/data migration, I'm assuming maintaining the IDs wouldn't be important (and wouldn't expect there to be any pre-existing silences in the new instance of AM).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add a flag to force
this, it is need as we are running Alertmanager 0.6 and thus cannot update an existing silences. I will keep the silence ID by default as it appear Alertmanager will only update existing one without creating any new silence in that case.
Hi @stuartnelson3, @Kellel can you guys help take a look and see if we can merge this PR? |
Hey sorry for the long delay, I've been away for the last few weeks and have been behind on reviewing things. I'll take a look at this in the next few days, thanks for your patience! |
silence import
command to amtoolsilence import
command to amtool
Sure @stuartnelson3 no problem, thank you for maintaining this. |
silence import
command to amtoolsilence import
command
cli/silence_import.go
Outdated
var err error | ||
|
||
if len(args) == 1 { | ||
input, err = os.Open(args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to close the opened fd. defer input.Close()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be added after the err
is checked below
cli/silence_import.go
Outdated
var err error | ||
|
||
if len(args) == 1 { | ||
input, err = os.Open(args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be added after the err
is checked below
cli/silence_import.go
Outdated
} | ||
} | ||
|
||
data, err := ioutil.ReadAll(input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ioutil.ReadAll
will load the entire input into memory. It would be better to process the information as a stream.
See https://golang.org/pkg/encoding/json/#example_Decoder_Decode_stream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it ended up being a super easy change:
dec := json.NewDecoder(input)
t, err := dec.Token()
if err != nil {
return err
}
for dec.More() {
var s types.Silence
err := dec.Decode(&s)
if err != nil {
log.Fatal(err)
}
s.ID = ""
err = addSilence(s)
if err != nil {
msg := fmt.Sprintf("couldn't add silence: %s", s)
return errors.Wrap(err, msg)
}
}
t, err = dec.Token()
if err != nil {
return err
}
return nil
cli/silence_import.go
Outdated
// this *might* create duplicated silences but that should generally be OK | ||
silence.ID = "" | ||
|
||
err = addSilence(silence) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is being done in serial, it will be relatively slow. We can always create a small worker pool (https://gobyexample.com/worker-pools) without much trouble. I'm fine adding it later, but if you want to add it now, feel free.
cli/silence_import.go
Outdated
} | ||
|
||
func bulkImport(cmd *cobra.Command, args []string) error { | ||
input := os.Stdin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also accepting a json stream on stdin is 👌
Sorry for the close / re-open as I pushed the wrong commit 😞 There is also a new flag |
This command read silences data from a query JSON output and import to alertmanager. It allows `amtool` to be used as a backup/restore tool for silences, i.e. prometheus#1000 Backup / export: ``` amtool silence -o json > silences.json ``` Restore / import: ``` amtool silence import silences.json ```
Probably worth mention that the
The end results (with or without |
Awesome work! I'll give this a try tomorrow morning. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the worker pool! Unfortunately, the way it's currently constructed, it will deadlock if you try to import more than 100 silences. The error channel isn't drained until AFTER all silences have been created, so the send to errc
in addSilenceWorker
will block.
I've added some code comments inline which fixes the problem.
cli/silence_import.go
Outdated
} | ||
|
||
silences := make(chan *types.Silence, 100) | ||
errs := make(chan error, 100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rename this to errc
?
cli/silence_import.go
Outdated
return errors.Wrap(err, "couldn't unmarshal input data, is it JSON?") | ||
} | ||
|
||
silences := make(chan *types.Silence, 100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go convention for naming channels is typically to name something and then put c
after it (to indicate channel). could you change this to silencec
?
cli/silence_import.go
Outdated
} | ||
for w := 0; w < workers; w++ { | ||
go addSilenceWorker(silences, errs) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These three lines can be updated to help prevent a deadlock:
var wg sync.WaitGroup
for w := 0; w < workers; w++ {
go func(w int) {
wg.Add(1)
addSilenceWorker(silc, errc)
wg.Done()
}(w)
}
go func() {
for err := range errc {
if err != nil {
errCount++
}
}
}()
Check out down below where we wg.Wait()
, which indicates that all the silence workers are finished, and then we can close(errc)
.
cli/silence_import.go
Outdated
silences <- &s | ||
count++ | ||
} | ||
close(silences) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The final piece is here:
close(silencec)
wg.Wait()
close(errc)
once we're done sending all the parsed silences, we can close the channel we're ranging over (anything still in the channel waiting to be processed will be processed). from there, we wg.Wait()
to know all the silences have been created and their errors sent on the error channel, and then we can close(errc)
.
cli/silence_import.go
Outdated
if err != nil { | ||
errCount++ | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code should be removed, it has been placed into the goroutine further up.
Move error channel reading to a goroutine to prevent deadlock and thus add a WaitGroup to synchronize.
Thanks @stuartnelson3 these feedbacks are very helpful as I only started with Go recently. I've updated the PR as suggested. |
cli/silence_import.go
Outdated
// read closing bracket | ||
_, err = dec.Token() | ||
if err != nil { | ||
return errors.Wrap(err, "invalid JSON") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this as I think it doesn't really make sense to throw an error over JSON format here, the silences are added already at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would actually prefer to keep it in. Even though all the silences have been added, a user should still be informed if they are attempting to use invalid json.
I'll take a look at this tomorrow to confirm it's working, thanks for the work on this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, works awesome! Took 3sec to insert 10,000 silences (and half that with --force
)
Great! Thanks @stuartnelson3 |
This command read silences data from a query JSON output and import to
alertmanager. It allows
amtool
to be used as a backup/restore tool forsilences, i.e. fix #1000
Backup / export:
Restore / import:
This adds new silence one by one and will be slow if you have thousands of silences but alertmanager API doesn't allow multiple silences in POST so ¯\(ツ)/¯