Compactor process is killed after some time. #5924
Comments
It looks like a networking issue to me...
Have your network settings changed since the beginning?
Hello @yeya24, I saw one of your comments suggesting a network issue, but this issue is not network-related. For example, in my last compactor run I got this error when the compactor crashed.
Whenever it logs errors with these keys, it crashes, and I don't know why.
Is it terminated because of resources? If you are running it in Kubernetes, did the pod get terminated somehow, e.g. OOM-killed?
While it was compacting, the bucket size increased up to 200G, then dropped to 27G and stayed at that size for a while, but then it suddenly crashed. No, it's not in Kubernetes; it's running on a server.
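For a process running directly on a server, one general way to check whether the Linux OOM killer terminated it (a diagnostic sketch, not specific to Thanos) is to grep the kernel log:

```shell
# Check the kernel ring buffer for OOM-killer activity (may require root on some distros).
dmesg -T 2>/dev/null | grep -iE "out of memory|oom-killer|killed process" \
  || echo "no OOM events in dmesg"

# On systemd hosts, the journal also keeps kernel messages across reboots:
journalctl -k --since "2 days ago" 2>/dev/null | grep -i "killed process" \
  || echo "no OOM events in the journal"
```

If the compactor's PID appears in a "Killed process" line, the crash is a memory problem rather than a bug or a network issue.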
After changing the S3 bucket provider, the crashes stopped.
Thanos, Prometheus and Golang version used:
Thanos version: 0.29.0
Prometheus version: 2.19.2
Go version: go1.14.4
Object Storage Provider:
S3 - minio
What happened:
The compactor had been working for over 24 hours; since I used the --wait argument, it kept running, but after some time it crashed for an unexpected reason. When I run it again, it keeps working and then crashes again after some time (the interval varies). I checked the blocks in S3: before crashing it had deleted some blocks due to compaction/downsampling, which suggests there is no network issue.
What you expected to happen:
I expect it to keep running.
How to reproduce it (as minimally and precisely as possible):
I am testing it on my test host.
Full logs to relevant components:
The logs are the same throughout its lifetime.
Anything else we need to know:
I used the flag --block-viewer.global.sync-block-timeout=1h, which was contributed by @ianwoolf in #4764.
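For reference, a minimal sketch of a setup like the one described above: a MinIO-backed S3 objstore config plus a long-running compact invocation with the flags mentioned in this issue. The bucket name, endpoint, credentials, and paths are illustrative placeholders, not values from this issue:

```shell
# Illustrative objstore config for an S3-compatible MinIO bucket (placeholder values).
cat > bucket.yml <<'EOF'
type: S3
config:
  bucket: thanos-blocks            # placeholder bucket name
  endpoint: minio.example.com:9000 # placeholder MinIO endpoint
  access_key: ACCESS_KEY
  secret_key: SECRET_KEY
  insecure: true                   # plain HTTP; set to false behind TLS
EOF

# Run the compactor as a long-lived service: --wait keeps it running between
# compaction cycles instead of exiting after one pass.
thanos compact \
  --data-dir=/var/thanos/compact \
  --objstore.config-file=bucket.yml \
  --wait \
  --block-viewer.global.sync-block-timeout=1h
```

With --wait the process is expected to run indefinitely, which is why an unexplained exit (as reported here) points at an external cause such as the object storage backend or the OOM killer.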