[Filebeat] Instrument aws-s3 with metrics #25711
Pinging @elastic/integrations (Team:Integrations)
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
Diagnosing performance issues with the aws-s3 input is difficult, so this instruments it with some metrics to make this easier. These are the metrics that are added.

- Number of SQS messages received (not necessarily processed fully).
- Number of SQS visibility timeout extensions.
- Number of SQS messages inflight (gauge).
- Number of SQS messages returned to queue (happens on errors implicitly after visibility timeout passes).
- Number of SQS messages deleted.
- Histogram of the elapsed SQS processing times in nanoseconds (time of receipt to time of delete/return).
- Number of S3 objects downloaded.
- Number of S3 bytes processed.
- Number of events created from processing S3 data.
- Number of S3 objects inflight (gauge).
- Histogram of the elapsed S3 object processing times in nanoseconds (start of download to completion of parsing).

The metrics are structured as:

    dataset.<input-id>:
        id=<input id>
        input=aws-s3
        sqs_messages_received_total
        sqs_visibility_timeout_extensions_total
        sqs_messages_inflight_gauge
        sqs_messages_returned_total
        sqs_messages_deleted_total
        sqs_message_processing_time.histogram
        s3_objects_requested_total
        s3_bytes_processed_total
        s3_events_created_total
        s3_objects_inflight_gauge
        s3_object_processing_time.histogram

The v2 input logger was updated to include the input ID to make correlation with metrics possible when an explicit `id` is not set in the input config.
Awesome.
What do you think about adding a histogram for `forwardEvent`? I was thinking it might be helpful to see whether increases in `s3_object_processing_time` are because downloading the S3 object is slow or because `forwardEvent` is slow.
Initially I had only timed the body processing side, but then went back to including the full request time because I figured the value correlated more closely to […]. I could easily time the download, but one thing I'm not entirely sure of is whether the download is complete when […].
We can discuss this a little more and add it separately. I'm going to fix a few config issues, and I also found a deadlock issue.
* Instrument aws-s3 with metrics (cherry picked from commit d3a03b0)
# Conflicts:
# x-pack/filebeat/docs/inputs/input-aws-s3.asciidoc
# x-pack/filebeat/input/awss3/collector.go
Hi @andrewkroh, thank you so much for adding these metrics!! This will definitely make debugging the aws-s3 input a lot easier next time!! One nit: what do you think about using a dot (`.`) in these metric names? For example: `sqs_messages_received_total` -> `sqs_messages.received` or `sqs_messages_received.total`?
Ahh sorry, I didn't see this is merged already. Please ignore my question then. Thank you again!!!
* Instrument aws-s3 with metrics
Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
Hi @andrewkroh and @kaiyan-sheng, I have Metricbeat monitoring set up like below but can't seem to view the S3-related metrics in Kibana.
Thanks in advance!
What does this PR do?
Diagnosing performance issues with the aws-s3 input is difficult so this instruments it with some metrics to make this easier. These are the metrics that are added.
The metrics are structured as:
The v2 input logger was updated to include the input ID to make correlation with metrics possible when an explicit `id` is not set in the input config.
Why is it important?
These metrics will make it easier to operate and tune the aws-s3 input.
Checklist
`CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.
Logs
curl http://<http.host>:<http.port>/dataset?pretty
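For reading the same endpoint programmatically, a rough Go equivalent of the `curl` call is sketched below. The function names are hypothetical, and the response shape is an assumption: the sketch expects a JSON object keyed by input ID whose values are objects of metric fields, which matches the `dataset.<input-id>` structure described above but should be checked against real output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// decodeDataset parses a /dataset response body.
// Assumption: the body is a JSON object keyed by input ID, and each value
// is itself an object of metric fields (id, input, counters, histograms).
func decodeDataset(body []byte) (map[string]map[string]interface{}, error) {
	var out map[string]map[string]interface{}
	if err := json.Unmarshal(body, &out); err != nil {
		return nil, fmt.Errorf("decoding dataset response: %w", err)
	}
	return out, nil
}

// fetchDataset requests <baseURL>/dataset and decodes the response.
// baseURL is expected to look like "http://<http.host>:<http.port>".
func fetchDataset(baseURL string) (map[string]map[string]interface{}, error) {
	resp, err := http.Get(baseURL + "/dataset")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return decodeDataset(body)
}
```

A caller could then loop over the returned map and look up fields such as `sqs_messages_inflight_gauge` per input ID. JSON numbers decode as `float64` with this approach.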