-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(new sink): aws sqs #4675
feat(new sink): aws sqs #4675
Conversation
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to see this resurrected!
In general this looks pretty good. I have a few questions:
With regards to our use of the SendMessageBatch
API here:
- For
SendMessageBatch
, AWS can return that it only ingested some of the messages (https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_SendMessageBatch.html). It looks like we are just taking any successful response as an indication that all of the messages in the batch were processed, but it could be the case that 1 or more of them errored and should be reprocessed. - I realize the risk of collision is low, but I think it'd be better to ensure that the id of each message in the batch is unique. You could just number 0-9 since the ids only need to be unique within a batch.
If we don't want to take on the complexity of handling partially processed batches right now, we could opt to just use SendMessage
. This will result in slower processing, of course.
For FIFO queues, I think it is a bit odd to use a unique message_group_id for each batch of messages as this could result in out-of-order processing of messages by consumers of the SQS queue. I might make this option configurable by the user so that they could set it to something like the file name (if they are using the file
source) to ensure that they process the messages for a given file in-order. We could default it to just vector
or maybe the hostname
.
Can you explain how this handles failed SQS messages that are not retried? Will it leave the messages in the buffer for that sink to be retried later?
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! This is much simpler as an initial implementation.
I left a few more comments, but nothing major.
src/internal_events/aws_sqs.rs
Outdated
use metrics::counter; | ||
|
||
#[derive(Debug)] | ||
pub struct AwsSqsEventSent { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need an event for failed sends too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How you think, should we log only RusotoError::Service
in the event?
} | ||
|
||
configuration: { | ||
queue_url: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It occurred to me that I have different SQS configuration in the aws_s3
source #4779 where, instead of queue_url
, I have name
and owner_id
and then use GetQueueUrl
. We should probably do one way or the other consistently. I'll change mine since it results in one less configuration option.
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
42aca6d
to
5ec3e0c
Compare
3afdb1b
to
7be3e1f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making all of the updates! Just a few last comments, but this is close.
match error { | ||
RusotoError::HttpDispatch(_) => true, | ||
RusotoError::Unknown(res) if res.status.is_server_error() => true, | ||
_ => false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we handle AWS error codes here too? Throttling and service unavailable.
It looks like rusoto does not handle this transparently like some other AWS SDKs do: rusoto/rusoto#234
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think that we can do more that other aws sinks do...
I also noted that only aws_cloudwatch_metrics
handle http::StatusCode::TOO_MANY_REQUESTS
. I think we should fix other aws sinks (in another PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 You are right. I didn't realize that the "common" AWS errors are not propagated: rusoto/rusoto#605
We'll need to wait for that issue to do more.
Noting for posterity that I did try this out locally with a test SQS queue and it looks good. |
Signed-off-by: Kirill Fomichev <fanatid@ya.ru>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thanks for all of your work on this.
Nice work @fanatid |
Signed-off-by: Kirill Fomichev <fanatid@ya.ru> Signed-off-by: casserni <nicholascassera@gmail.com>
Signed-off-by: Kirill Fomichev <fanatid@ya.ru> Signed-off-by: Brian Menges <brian.menges@anaplan.com>
The successor of #2755.
Sorry, @francescop it was hard to rebase/merge, so I copied your code from PR (hope that's ok?).