-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(routes): many requests for outdated routes by rate limiting #675
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #675 +/- ##
==========================================
+ Coverage 72.10% 72.26% +0.16%
==========================================
Files 32 32
Lines 2667 2668 +1
==========================================
+ Hits 1923 1928 +5
+ Misses 555 552 -3
+ Partials 189 188 -1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit afraid of the unknown side effects of this workaround. I think we can merge this, but we might need to come up with a better fix in the future.
Fixes #673 In `routes.ListRoutes()` we have to find the matching server/node for every route in the network. We find the server by utilizing a cache that maps every private IP to the corresponding server. This cache has a feature that refreshes the list of servers if an entry can not be found. This is sensible, as the server might have been just created. This is also fatal, as this refresh happens for every single cache access. If there are a lot of routes in the network that do not belong to any server, we refresh the cache many times for each `ListRoutes()`. This is even more serious, as `ListRoutes()` is being called every 10-30 seconds (see #395). This commit introduces a `rate.Limiter` in the `AllServersCache` which only allows the refresh to happen every 30 seconds.
I rebased onto main to resolve the go.mod conflict. |
The primary side effect here is that new nodes might need additional 30s until routes are setup, if the project already had wonky routes. |
🤖 I have created a release *beep* *boop* --- ## [1.20.0](v1.19.0...v1.20.0) (2024-07-08) ### Features * add support & tests for Kubernetes 1.29 ([#600](#600)) ([e8fabda](e8fabda)) * add support & tests for Kubernetes 1.30 ([#679](#679)) ([0748b6e](0748b6e)) * drop tests for kubernetes v1.25 ([#597](#597)) ([58261ec](58261ec)) * drop tests for kubernetes v1.26 ([#680](#680)) ([9c4be01](9c4be01)) * emit Kubernetes events for error conditions ([#598](#598)) ([e8f9199](e8f9199)) * **helm,manifests:** only specify container args instead of command ([#691](#691)) ([2ba4058](2ba4058)) * **helm:** allow setting affinity for deployment ([#686](#686)) ([1a8ea95](1a8ea95)) * read HCLOUD_TOKEN from file ([#652](#652)) ([a4343b8](a4343b8)) ### Bug Fixes * **routes:** many requests for outdated routes by rate limiting ([#675](#675)) ([e283b7d](e283b7d)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Fixes #673
In
routes.ListRoutes()
we have to find the matching server/node for every route in the network. We find the server by utilizing a cache that maps every private IP to the corresponding server.This cache has a feature that refreshes the list of servers if an entry can not be found. This is sensible, as the server might have been just created. This is also fatal, as this refresh happens for every single cache access. If there are a lot of routes in the network that do not belong to any server, we refresh the cache many times for each
ListRoutes()
. This is even more serious, asListRoutes()
is being called every 10-30 seconds (see #395).This commit introduces a
rate.Limiter
in theAllServersCache
which only allows the refresh to happen every 30 seconds.