
Remove refetching from resourceWatcher #14262

Merged
merged 3 commits into master from tross/persistent_resource_watcher on Jul 11, 2022

Conversation

rosstimothy
Contributor

The resourceWatcher is meant to be a long-lived way for a component
to receive events about a particular resource from an upstream cache.
However, there was a refetching mechanism that would cause a healthy
and subscribed watcher to be closed, the resourceWatcher to fetch all
the resource types it is watching from the upstream cache, and a
new watcher to be created every 10 minutes. This caused unneeded load on
the upstream cache and also consumed network bandwidth.

This removes the refetching behavior entirely to ensure watchers
aren't unnecessarily closed. The change should be transparent to
users of the resourceWatcher, but should noticeably reduce both
the number of init events being emitted throughout a cluster
and the number of cache reads.

Fixes #14234

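To make the shape of the change concrete, here is a minimal Go sketch of a long-lived watcher loop that only re-fetches and re-subscribes when its event stream actually fails, rather than on a fixed 10-minute timer. Every type and function name below (Event, EventStream, upstream, watchResources, the 5-second backoff) is a hypothetical stand-in, not the actual lib/services/watcher.go code.

```go
// Package watcher: a minimal sketch of a long-lived resource watcher loop.
package watcher

import (
	"context"
	"log"
	"time"
)

// Event and EventStream are hypothetical stand-ins for the upstream
// cache's watch API; they are not Teleport types.
type Event struct{ Kind, Name string }

type EventStream interface {
	Events() <-chan Event  // resource events from the cache
	Done() <-chan struct{} // closed when the stream terminates
	Error() error          // reason the stream terminated
	Close() error
}

type upstream interface {
	NewWatch(ctx context.Context, kinds []string) (EventStream, error)
	FetchAll(ctx context.Context, kinds []string) ([]Event, error)
}

// watchResources stays subscribed for as long as the stream is healthy.
// It only re-fetches and re-subscribes after the stream fails; there is
// deliberately no periodic refetch ticker.
func watchResources(ctx context.Context, up upstream, kinds []string, out chan<- Event) error {
	const retryAfter = 5 * time.Second // hypothetical backoff interval
	for {
		err := watchOnce(ctx, up, kinds, out)
		if ctx.Err() != nil {
			return ctx.Err()
		}
		log.Printf("watch failed, retrying in %v: %v", retryAfter, err)
		select {
		case <-time.After(retryAfter):
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

func watchOnce(ctx context.Context, up upstream, kinds []string, out chan<- Event) error {
	stream, err := up.NewWatch(ctx, kinds)
	if err != nil {
		return err
	}
	defer stream.Close()

	// One initial fetch to establish current state for subscribers.
	initial, err := up.FetchAll(ctx, kinds)
	if err != nil {
		return err
	}
	for _, ev := range initial {
		out <- ev
	}

	// Forward events until the stream dies or the context is cancelled.
	for {
		select {
		case ev := <-stream.Events():
			out <- ev
		case <-stream.Done():
			return stream.Error()
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}
```

The point of the sketch is that the only path back to a full fetch is a failed stream; a healthy, subscribed watcher is never torn down on a timer.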
@rosstimothy
Contributor Author

Metrics from running a 10k cluster off this branch. Compare against the snapshot from #14234 to see the reduction in init and cache reads.
(Screenshot: Grafana dashboard for a Teleport tenant, captured 2022-07-08 17:27)

Contributor

@fspmarshall fspmarshall left a comment


red diffs ❤️

@rosstimothy rosstimothy marked this pull request as ready for review July 8, 2022 21:33
@github-actions github-actions bot requested review from tcsc and zmb3 July 8, 2022 21:34
@rosstimothy rosstimothy enabled auto-merge (squash) July 11, 2022 12:20
@rosstimothy rosstimothy merged commit dea633f into master Jul 11, 2022
@github-actions

@rosstimothy See the table below for backport results.

| Branch | Result |
| --- | --- |
| branch/v10 | Create PR |
| branch/v7 | Failed |
| branch/v8 | Create PR |
| branch/v9 | Create PR |

rosstimothy added a commit that referenced this pull request Jul 11, 2022
The resourceWatcher is meant to be a long-lived way for a component
to receive events about a particular resource from an upstream cache.
However, there was a refetching mechanism that would cause a healthy
and subscribed watcher to be closed, the resourceWatcher to fetch all
the resource types it is watching from the upstream cache and to create a
new watcher **every 10 minutes**. This causes unneeded load on
the upstream cache and also eats up network bandwidth.

This removes the refetching behavior entirely to ensure watchers
aren't unnecessarily closed. The change should be transparent to
users of the resourceWatcher, but should noticeably reduce both
the number of init events being emitted throughout a cluster
and the number of cache reads.

Fixes #14234

(cherry picked from commit dea633f)

# Conflicts:
#	lib/services/watcher.go
@rosstimothy
Contributor Author

💚 All backports created successfully

| Status | Branch | Result |
| --- | --- | --- |
| | branch/v7 | |

Questions?

Please refer to the Backport tool documentation

rosstimothy added a commit that referenced this pull request Jul 12, 2022
Remove refetching from resourceWatcher (#14262)

The resourceWatcher is meant to be a long-lived way for a component
to receive events about a particular resource from an upstream cache.
However, there was a refetching mechanism that would cause a healthy
and subscribed watcher to be closed, the resourceWatcher to fetch all
the resource types it is watching from the upstream cache and to create a
new watcher **every 10 minutes**. This causes unneeded load on
the upstream cache and also eats up network bandwidth.

This removes the refetching behavior entirely to ensure watchers
aren't unnecessarily closed. The change should be transparent to
users of the resourceWatcher, but should noticeably reduce both
the number of init events being emitted throughout a cluster
and the number of cache reads.

Fixes #14234

(cherry picked from commit dea633f)

# Conflicts:
#	lib/services/watcher.go
@zmb3 zmb3 deleted the tross/persistent_resource_watcher branch September 9, 2022 18:55
rosstimothy added a commit that referenced this pull request Nov 2, 2022
Prior to #14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The ticker is set to fire before the tracker
would expire the proxy, so that if a proxy exists in the cluster the
agent will continually try to connect to it.
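For context, a hedged Go sketch of what such a resync ticker could look like. The names here (proxyWatcher, agentConn, SendDiscoveryRequest, trackerExpiry, and the choice to fire at half the expiry interval) are hypothetical stand-ins, not the actual localsite implementation.

```go
// Package localsite: a sketch of periodically re-advertising proxies to an agent.
package localsite

import (
	"context"
	"time"
)

// Proxy and the interfaces below are hypothetical stand-ins, not the real
// Teleport types.
type Proxy struct{ ID, Addr string }

type proxyWatcher interface {
	// CurrentProxies returns the latest proxy set known to the watcher.
	CurrentProxies() []Proxy
}

type agentConn interface {
	// SendDiscoveryRequest pushes the proxy list to the connected agent.
	SendDiscoveryRequest(ctx context.Context, proxies []Proxy) error
}

// periodicProxySync re-sends the current proxy set on an interval shorter
// than the agent-side tracker expiry (halving it here is an assumption),
// so a proxy that still exists in the cluster is never allowed to expire
// out of the agent's tracker.
func periodicProxySync(ctx context.Context, w proxyWatcher, conn agentConn, trackerExpiry time.Duration) {
	ticker := time.NewTicker(trackerExpiry / 2)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if proxies := w.CurrentProxies(); len(proxies) > 0 {
				// In this sketch errors are dropped; real code would log
				// them and rely on the next tick to retry.
				_ = conn.SendDiscoveryRequest(ctx, proxies)
			}
		case <-ctx.Done():
			return
		}
	}
}
```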
rosstimothy added a commit that referenced this pull request Nov 3, 2022
Prior to #14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The ticker is set to fire before the tracker
would expire the proxy, so that if a proxy exists in the cluster the
agent will continually try to connect to it.
rosstimothy added a commit that referenced this pull request Nov 4, 2022
* Periodically resync proxies to agents

Prior to #14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The ticker is set to fire before the tracker
would expire the proxy, so that if a proxy exists in the cluster the
agent will continually try to connect to it.
github-actions bot pushed a commit that referenced this pull request Nov 4, 2022
Prior to #14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The ticker is set to fire before the tracker
would expire the proxy, so that if a proxy exists in the cluster the
agent will continually try to connect to it.
rosstimothy added a commit that referenced this pull request Nov 4, 2022
Prior to #14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The ticker is set to fire before the tracker
would expire the proxy, so that if a proxy exists in the cluster the
agent will continually try to connect to it.
rosstimothy added a commit that referenced this pull request Nov 4, 2022
Prior to #14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The ticker is set to fire before the tracker
would expire the proxy, so that if a proxy exists in the cluster the
agent will continually try to connect to it.
rosstimothy added a commit that referenced this pull request Nov 4, 2022
Prior to #14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The ticker is set to fire before the tracker
would expire the proxy, so that if a proxy exists in the cluster the
agent will continually try to connect to it.
rosstimothy added a commit that referenced this pull request Nov 7, 2022
* Periodically resync proxies to agents (#18050)

Prior to #14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The ticker is set to fire before the tracker
would expire the proxy, so that if a proxy exists in the cluster the
agent will continually try to connect to it.
rosstimothy added a commit that referenced this pull request Nov 7, 2022
* Periodically resync proxies to agents

Prior to #14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The ticker is set to fire before the tracker
would expire the proxy, so that if a proxy exists in the cluster the
agent will continually try to connect to it.
rosstimothy added a commit that referenced this pull request Nov 7, 2022
Prior to #14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The ticker is set to fire before the tracker
would expire the proxy, so that if a proxy exists in the cluster the
agent will continually try to connect to it.
rosstimothy added a commit that referenced this pull request Nov 7, 2022
Prior to #14262, resource watchers would periodically close their watcher,
create a new one and refetch the current set of resources. It turns out
that the reverse tunnel subsystem relied on this behavior to periodically
broadcast the list of proxies to agents during steady state. Now that
watchers are persistent and no longer perform a refetch, agents that are
unable to connect to a proxy expire them after a period of time, and
since they never receive the periodic refresh, they never attempt to
connect to said proxy again.

To remedy this, a new ticker is added to the `localsite` that grabs
the current set of proxies from its proxy watcher and sends a discovery
request to the agent. The ticker is set to fire before the tracker
would expire the proxy, so that if a proxy exists in the cluster the
agent will continually try to connect to it.

Successfully merging this pull request may close these issues.

Periodic cache init events in healthy cluster