The plugin does not retry on 502 errors from the server #67
Comments
I meant re-transmission... |
According to the docs, the plugin retries after a number of seconds, as described in the Digging deeper, I found this line of code:
which would mean the plugin retries on 429, 503 and 504 errors from the server, but not on the 502 errors that you have been hitting.
Honestly, I don't see why the plugin shouldn't retry on 502 as well. What do you think? |
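To make the proposal concrete, here is a minimal sketch of a retry check that treats 502 like the other codes. It is not the plugin's actual code: the constant `RETRYABLE_CODES` and the method `retryable?` are hypothetical names chosen for illustration.

```ruby
# Hypothetical sketch, not the plugin's actual implementation.
# 429, 503 and 504 are the codes the plugin is said to retry on;
# 502 is the proposed addition.
RETRYABLE_CODES = [429, 502, 503, 504].freeze

# Returns true when a response with this status code should be re-sent.
def retryable?(status_code)
  RETRYABLE_CODES.include?(status_code)
end

[200, 429, 502, 503, 504].each do |code|
  puts "#{code}: #{retryable?(code) ? 'retry' : 'do not retry'}"
end
```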
I'm going back and forth with Sumo Logic support... [2021-09-15T08:26:03,373][INFO ][logstash.outputs.elasticsearch][elastiflow_out] retrying failed action with response code: 429 ({"type"=>"circuit_breaking_exception", "reason"=>"[parent] Data too large, data for [<transport_request>] would be [4058089810/3.7gb], which is larger than the limit of [4047097036/3.7gb], real usage: [4057899904/3.7gb], new bytes reserved: [189906/185.4kb], usages [request=72/72b, fielddata=11489703/10.9mb, in_flight_requests=764276/746.3kb, accounting=108807531/103.7mb]", "bytes_wanted"=>4058089810, "bytes_limit"=>4047097036, "durability"=>"PERMANENT"}) |
Hey @im-dim, I've just released a new version, v1.4.0, which retries on 502 errors. I've reached out internally at Sumo to the team responsible for log ingestion about the 502 errors. They replied that they monitor 502 errors and their alerting system wasn't triggered, which I think means the problem was intermittent. In that case the retry logic should help here. |
Thank you, Andrzej. We'll try the new version. Would there be a specific log message if a retry fails? In other words, we'd like to know how much data we are missing... As for intermittent: on average, we get between 10,000 and 20,000 502 codes per day, meaning that we miss thousands of records, so I'm not sure that can be called intermittent... |
Looking at the code, it looks like the plugin retries infinitely, so there is no such thing as a failed retry. A failed retry will result in another retry, and so on. Thanks for the numbers on |
If there is no configurable limit on retries, then there is a chance of memory consumption growing indefinitely in cases where the Sumo server has a problem and responds with 502. Right? Can a retry count be added, with a separate error such as "X re-tries failed" or something like that? |
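A rough sketch of what such a retry limit could look like, purely for illustration. Nothing here exists in the plugin: `send_with_retries`, its `max_retries` and `sleep_seconds` parameters, and the `RetriesExhausted` error are all hypothetical names, and the block passed in stands for whatever performs the HTTP request and returns its status code.

```ruby
# Hypothetical sketch of a bounded retry; not part of the plugin.
class RetriesExhausted < StandardError; end

# Calls the given block (assumed to send the request and return the HTTP
# status code) once, then up to max_retries more times while the status
# is one of the retryable codes.
def send_with_retries(max_retries:, sleep_seconds: 0)
  attempts = 0
  loop do
    status = yield
    return status unless [429, 502, 503, 504].include?(status)

    if attempts >= max_retries
      raise RetriesExhausted,
            "#{max_retries} retries failed, last response code: #{status}"
    end

    attempts += 1
    sleep(sleep_seconds)
  end
end

# Usage: a fake server that answers 502 once and then 200.
responses = [502, 200]
puts send_with_retries(max_retries: 3) { responses.shift }
```

With a limit like this, the caller has to decide what to do with the data once the error is raised (drop it, log it, or hand it to some other store), which is the trade-off against retrying indefinitely.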
I haven't run any tests, but again - looking at the code :) - there's a Ruby class SizedQueue used internally by the plugin, and it is size-capped, meaning that when the queue fills up, it stops accepting new data, and enqueue operations block until space is freed (see docs for the push method). |
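The blocking behaviour described above can be seen with Ruby's standard `SizedQueue` on its own. This is only a sketch of the class, not of the plugin: the capacity of 2 and the "batch" strings are arbitrary values picked for the example.

```ruby
# Standalone demonstration of SizedQueue; capacity 2 is an arbitrary
# example value, not the plugin's setting.
queue = SizedQueue.new(2)
queue.push("batch 1")
queue.push("batch 2")

# A non-blocking push on a full queue raises ThreadError instead of waiting.
rejected = false
begin
  queue.push("batch 3", true)
rescue ThreadError
  rejected = true
end
puts "non-blocking push rejected: #{rejected}"

# A regular push on a full queue blocks the calling thread until a
# consumer frees a slot.
producer = Thread.new { queue.push("batch 3") }
sleep 0.1                      # give the producer time to block
puts "items queued while producer waits: #{queue.size}"

queue.pop                      # consumer frees one slot
producer.join                  # the blocked push can now complete
puts "items queued after pop: #{queue.size}"
```

Because the queue never holds more than its capacity, memory used by queued items stays bounded; the cost is that producers stall while the queue is full.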
Thank you for the reply. |
Does plugin do transmission on failure?