-
Notifications
You must be signed in to change notification settings - Fork 372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add monitor for event-based reconciliation #2048
base: master
Are you sure you want to change the base?
Conversation
3923e8b
to
6579c22
Compare
Signed-off-by: Max Melentyev <max.melentyev@reddit.com>
6579c22
to
5eb060e
Compare
Crossplane does not currently have enough maintainers to address every issue and pull request. This pull request has been automatically marked as |
/fresh |
|
Crossplane does not currently have enough maintainers to address every issue and pull request. This pull request has been automatically marked as |
|
/fresh |
While I understand the issue with AWS request throttling, especially regarding Route53, I am very reluctant on merging this as it feels like a very specialized implementation tailored to a very specific use case. With this we would make the provider depend on external infrastructure, such as SQS queues to do its job. Given that a provider can potentially target multiple AWS accounts this solution might not work in every scenario. Additionally, the connection to the queue relies on the IRSA credentials injected into the pod. However, there is no guarantee that this will work since a provider / controller receives its credentials through the provider configs referenced in each individual managed resource which also is not necessarily IRSA but can be a file source as well. Overall I don't we are gonna merge this. Again, I do think that throttling is a relevant issue but I think it is better to try to solve this with K8s tooling or at least something that is kinda agnostic to the underlying technology (i.e. SQS). |
Hey @MisterMX, let me address your concerns. But first I'd add that we have provider-aws managing 10k instances and 13k resourcerecordsets in multiple AWS accounts for more than 6 months. Before switching to event-based reconciliation we started seeing issues with 2k resources, queue waiting times were more than 10m.
Event-based reconciliation is optional. It's enabled only If SQS queue address is provided.
It's possible to have a multi-account setup: managed accounts stream stream EventBridge events to an account where provider-aws runs, and SQS is configured only in one account. I can add more details to readme or terraform examples/modules.
Provider config is not used to access SQS. As there's a single SQS, IAM policies can be configured to allow access from provider-aws pod. |
This is a preview-PR, it's not supposed to be merged now.
It's to show a feature that requires crossplane/crossplane#5557 and #2030
Related to #2029
Description of your changes
This PR introduces a monitor of EventBridge events that are published to SQS queue. This allows to keep provider-aws responsive to external changes even when it manages large number of resources that takes long time to reconcile using polling approach. With event based reconciliation polling interval can be increased but resources will still be reconciled when they have updates.
This PR only enables support for ec2.Instance and route53.ResourceRecordSet events. Event based reconciliation is optional and works along with polling.
I have:
make reviewable test
to ensure this PR is ready for review.How has this code been tested