
Alert missing for bucket operation failures on the sidecar #3005

Closed
SerialVelocity opened this issue Aug 8, 2020 · 7 comments · Fixed by #3567

Comments

@SerialVelocity

Thanos, Prometheus and Golang version used:
quay.io/prometheus/prometheus:v2.20.0
quay.io/thanos/thanos:v0.14.0

Object Storage Provider:
Scaleway S3 compatible object store

What happened:
The sidecar was failing to upload to S3 (my fault), but no alerts were triggered.

What you expected to happen:
An alert to be triggered.

How to reproduce it (as minimally and precisely as possible):
Set up a network policy that blocks the sidecar's uploads to the object store.
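
A minimal sketch of such a policy, assuming the Prometheus pods (with the Thanos sidecar container) run in a `monitoring` namespace and carry an `app: prometheus` label; the namespace, name, and labels are placeholders, not taken from this issue:

```yaml
# Hypothetical NetworkPolicy for reproducing the failure: it denies all egress
# from the selected pods, so the sidecar's uploads to the object store are refused.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: block-sidecar-egress
  namespace: monitoring
spec:
  podSelector:
    matchLabels:
      app: prometheus
  policyTypes:
    - Egress
  egress: []   # no egress rules listed, so all outbound traffic from these pods is denied
```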

Full logs to relevant components:

level=warn ts=2020-08-08T11:50:28.85471248Z caller=sidecar.go:285 err="check exists: stat s3 object: Get \"https://s3.nl-ams.scw.cloud/<bucket-name>/?location=<location>": dial tcp 163.172.208.8:443: connect: connection refused" uploaded=0

Anything else we need to know:
It looks like this used to work by having an alert on thanos_objstore_bucket_operation_failures_total. I actually found the issue when I looked at the Thanos sidecar dashboard.

It looks like #2002 removed the alert. I think only the ones with _s3_/_gcs_ in their names were meant to be removed?
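
For reference, a minimal sketch of a Prometheus alerting rule on that metric; the alert name, job matcher, threshold, and durations below are illustrative and not the exact rule that was removed or later restored:

```yaml
groups:
  - name: thanos-sidecar
    rules:
      - alert: ThanosSidecarBucketOperationsFailed   # illustrative name
        expr: |
          # fire if any bucket operation against object storage is failing
          sum by (job, instance) (
            rate(thanos_objstore_bucket_operation_failures_total{job=~".*thanos-sidecar.*"}[5m])
          ) > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          description: 'Thanos Sidecar {{ $labels.instance }} bucket operations are failing.'
```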

@stale

stale bot commented Sep 8, 2020

Hello 👋 Looks like there was no activity on this issue for the last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity in the next week, this issue will be closed (we can always reopen an issue if we need to!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

@stale stale bot added the stale label Sep 8, 2020
@SerialVelocity
Author

I haven't tried 0.15.0, but I don't see anything in the changelog.

@stale stale bot removed the stale label Sep 8, 2020
@brokencode64

Seeing this issue too, is it intentional?

@kakkoyun
Member

kakkoyun commented Sep 9, 2020

@brokencode64 I don't think it is intentional. We can always ask @daixiang0 whether the removal was intentional. @daixiang0 WDYT?

@SerialVelocity In any case, if you think this alert is needed or we are missing any others, please feel free to contribute. I'd be happy to review it and include it in the next release.

@daixiang0
Member

It was not intentional but a mistake. @SerialVelocity, thanks for catching it!

@stale

stale bot commented Nov 23, 2020

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need to!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

@stale stale bot added the stale label Nov 23, 2020
@kakkoyun kakkoyun removed the stale label Nov 24, 2020
@kakkoyun
Member

Still valid.
