Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix aws rds discovery invalid engine filter #18328

Merged

Conversation

GavinFrazar
Copy link
Contributor

Closes #17533

What

This PR changes AWS RDS Discovery to handle unrecognized engine names in filters gracefully.

How

When AWS returns an error like Unrecognized engine name: aurora, we detect this as an engine name filter error and fallback to querying for all DB engine versions in the region. Then we filter Teleport-supported engine names that are not found in the results, and retry the describe instances/clusters RDS API calls with the reduced filter engine names. If this stilll fails with unrecognized engine name (never seen this/don't know if it can actually happen), then the fetcher returns an error and we skip that fetcher - this "unrecognized engine" error has been observed to be region-specific and intermittent - we should not stop the watcher just because a single fetcher in one region failed.

Tests

A great deal of this PR is just tests and mocking changes. I had to mock the DescribeDBEngineVersionsPagesWithContext API and I also needed to make our Describe* mock APIs respect engine filters - this was needed to verify behavior in tests.

I fixed up the watcher tests and added two new tests: one for handling unrecognized engines gracefully with a retry, and one for verifying that when the retry fails we just skip the fetcher instead of quitting.

@github-actions github-actions bot requested review from mdwn and r0mant November 10, 2022 03:05
@github-actions github-actions bot added the database-access Database access related issues and PRs label Nov 10, 2022
@smallinsky smallinsky requested a review from greedy52 November 10, 2022 14:15
lib/srv/db/cloud/watchers/rds.go Outdated Show resolved Hide resolved
lib/srv/db/cloud/watchers/rds.go Outdated Show resolved Hide resolved
lib/srv/db/cloud/watchers/rds.go Outdated Show resolved Hide resolved
lib/srv/db/cloud/watchers/rds.go Outdated Show resolved Hide resolved
lib/srv/db/cloud/watchers/rds.go Outdated Show resolved Hide resolved
lib/srv/db/common/errors.go Outdated Show resolved Hide resolved
lib/srv/db/cloud/mocks.go Outdated Show resolved Hide resolved
lib/srv/db/cloud/mocks.go Outdated Show resolved Hide resolved
lib/srv/db/cloud/mocks.go Outdated Show resolved Hide resolved
@greedy52
Copy link
Contributor

greedy52 commented Nov 10, 2022

A new API also means we have to update IAM permissions for db configurator tool and we have to ask existing users to add the new permission. I think in this case it's not that bad since only ppl with this problem need to update their agent's IAM permissions.

One alternative i think we can just call the current API, but with one engine type per call. We triple the number of calls but it's not that many compared to some other discovery we have to make calls per db instance. What do you think?

I am also ok with the current approach if we decided to do the fallback by checking supported engines. Make sure to update the configurator.

(-- update, or maybe the fallback can just do the separate calls with the same API, without calling the get engine version)

@GavinFrazar
Copy link
Contributor Author

A new API also means we have to update IAM permissions for db configurator tool and we have to ask existing users to add the new permission. I think in this case it's not that bad since only ppl with this problem need to update their agent's IAM permissions.

One alternative i think we can just call the current API, but with one engine type per call. We triple the number of calls but it's not that many compared to some other discovery we have to make calls per db instance. What do you think?

I am also ok with the current approach if we decided to do the fallback by checking supported engines. Make sure to update the configurator.

(-- update, or maybe the fallback can just do the separate calls with the same API, without calling the get engine version)

Good point about the permissions. I think this solution is better for the happy-path in RDS discovery since the alternative triples the number of API calls always, but does requires the extra permissions... I think I'd prefer we just add this permission - I can't think of a reason why any customer would be opposed to allow it

@greedy52
Copy link
Contributor

greedy52 commented Nov 10, 2022

I can't think of a reason why any customer would be opposed to allow it

No, i don't think it's a big deal either for customer to grant us permission for that. Just for existing users they probably don't know they are missing a permission. well, i guess if they happen to run into this problem and observe the log, they will know eventually =p.

Copy link
Contributor

@greedy52 greedy52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets also add the permission to the database configurator

// rdsInstancesActions list of actions used when giving RDS instances permissions.
rdsInstancesActions = []string{"rds:DescribeDBInstances", "rds:ModifyDBInstance"}
// auroraActions list of actions used when giving RDS Aurora permissions.
auroraActions = []string{"rds:DescribeDBClusters", "rds:ModifyDBCluster"}

lib/srv/db/cloud/watchers/rds.go Outdated Show resolved Hide resolved
lib/srv/db/cloud/watchers/rds.go Outdated Show resolved Hide resolved
lib/srv/db/cloud/watchers/watcher.go Outdated Show resolved Hide resolved
@GavinFrazar
Copy link
Contributor Author

GavinFrazar commented Nov 15, 2022

average fetcher time when it has to fallback to describe db engine versions: ~7 seconds on my computer. Querying each filter individually: ~0.5 seconds average fetcher time (granted, we only have a few db instances in our aws dev account).

@greedy52 I think based on this huge disparity you're probably right - we should just query one filter at a time. 1 API call -> 3 API calls in the "happy-path" of the code. We could try all of them and then just do each filter individually as a fallback if that fails, to keep the happy-path just as fast as it is now: 1 API call in happy-path, 4 API calls in unhappy-path. That seems like a more reasonable approach, and a lot simpler.

edit: and no extra permissions needed, as you also pointed out. Feels bad scrapping so much of this work though :(

@GavinFrazar GavinFrazar force-pushed the gavinfrazar/fix-aws-rds-discovery-invalid-engine-filter branch from f75cf82 to 2b55f14 Compare November 16, 2022 01:06
@GavinFrazar
Copy link
Contributor Author

GavinFrazar commented Nov 16, 2022

@smallinsky @greedy52 I've refactored to remove the DescribeDBEngineVersions fallback, in favor of Steve's suggestion to just try each engine filter one at a time. To preserve prior performance, I first try all the engine filters and then fallback to trying them one at a time if that fails. This means 1 API call, same as before.

In the (very unlikely, apparently) event that a filter name is not recognized, we fall back to trying each engine filter one a time which means for the least likely case we do 4 API calls (first attempt + attempt engine1 + attempt engine2 + attempt engine3), whereas with the DescribeDBEngineVersions fallback we were doing 3 (first attempt + query engine versions + reattempt).
From my measurements, the 4 API calls are still much faster than querying hundreds of DB engine versions just to check the 3 we care about :)

Copy link
Contributor

@greedy52 greedy52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating this 😄. I don't think time is wasted as now we know how the engine version thing works it will likely be handy in the future.

lib/srv/db/cloud/watchers/rds.go Show resolved Hide resolved
lib/srv/db/common/errors.go Outdated Show resolved Hide resolved
lib/srv/db/cloud/mocks.go Outdated Show resolved Hide resolved
lib/srv/db/cloud/watchers/rds.go Show resolved Hide resolved
@github-actions github-actions bot removed request for r0mant and mdwn November 16, 2022 15:46
GavinFrazar and others added 2 commits November 16, 2022 10:46
Co-authored-by: Marek Smoliński <marek@goteleport.com>
@GavinFrazar GavinFrazar enabled auto-merge (squash) November 17, 2022 19:39
@GavinFrazar GavinFrazar merged commit 65e3512 into master Nov 17, 2022
@github-actions
Copy link

@GavinFrazar See the table below for backport results.

Branch Result
branch/v10 Failed
branch/v11 Create PR
branch/v9 Failed

@GavinFrazar GavinFrazar deleted the gavinfrazar/fix-aws-rds-discovery-invalid-engine-filter branch November 17, 2022 22:09
GavinFrazar added a commit that referenced this pull request Nov 18, 2022
* Add helper func to check for AWS unrecognized engine error

* Retry RDS queries with individual filters if engine name is unrecognized

* godoc

* pass engine names instead of filters, to make retry logic simpler

* Update mocks to check/apply RDS engine filters

* Update watcher tests

* Add watcher test for unrecognized RDS engines

* Update lib/srv/db/cloud/mocks.go

Co-authored-by: Marek Smoliński <marek@goteleport.com>

Co-authored-by: Marek Smoliński <marek@goteleport.com>
GavinFrazar added a commit that referenced this pull request Nov 18, 2022
* Add helper func to check for AWS unrecognized engine error

* Retry RDS queries with individual filters if engine name is unrecognized

* godoc

* pass engine names instead of filters, to make retry logic simpler

* Update mocks to check/apply RDS engine filters

* Update watcher tests

* Add watcher test for unrecognized RDS engines

* Update lib/srv/db/cloud/mocks.go

Co-authored-by: Marek Smoliński <marek@goteleport.com>

Co-authored-by: Marek Smoliński <marek@goteleport.com>
GavinFrazar added a commit that referenced this pull request Nov 22, 2022
* fix aws rds discovery invalid engine filter (#18328)

* Add helper func to check for AWS unrecognized engine error

* Retry RDS queries with individual filters if engine name is unrecognized

* godoc

* pass engine names instead of filters, to make retry logic simpler

* Update mocks to check/apply RDS engine filters

* Update watcher tests

* Add watcher test for unrecognized RDS engines

* Update lib/srv/db/cloud/mocks.go

Co-authored-by: Marek Smoliński <marek@goteleport.com>

Co-authored-by: Marek Smoliński <marek@goteleport.com>

* Fix goimports

Co-authored-by: Marek Smoliński <marek@goteleport.com>
GavinFrazar added a commit that referenced this pull request Dec 30, 2022
* fix aws rds discovery invalid engine filter (#18328)

* Add helper func to check for AWS unrecognized engine error

* Retry RDS queries with individual filters if engine name is unrecognized

* godoc

* pass engine names instead of filters, to make retry logic simpler

* Update mocks to check/apply RDS engine filters

* Update watcher tests

* Add watcher test for unrecognized RDS engines

* Update lib/srv/db/cloud/mocks.go

Co-authored-by: Marek Smoliński <marek@goteleport.com>

Co-authored-by: Marek Smoliński <marek@goteleport.com>

* Fix lint errors

Co-authored-by: Marek Smoliński <marek@goteleport.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
database-access Database access related issues and PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AWS RDS Discovery fails with invalid parameter in AWS API Request
3 participants