Periodically resync proxies to agents #18050
How much data is this, given the relatively inefficient marshaling of the mostly-empty `ServerV2`?
The table below shows the size of a marshaled `ServerV2`.
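For context on the question above, a minimal sketch of how one might measure a message's marshaled size with the standard protobuf Go library. `structpb.Struct` stands in for Teleport's `ServerV2` (a gogo/protobuf type) purely so the example stays self-contained; it is not the actual type under discussion.

```go
package main

import (
	"fmt"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/types/known/structpb"
)

func main() {
	// Stand-in message: a mostly-empty proto with a single populated field,
	// analogous to measuring a mostly-empty ServerV2.
	msg, err := structpb.NewStruct(map[string]any{"name": "proxy-1"})
	if err != nil {
		panic(err)
	}

	// Marshal and report the wire size two ways: the encoded byte length
	// and proto.Size, which computes the size without allocating.
	b, err := proto.Marshal(msg)
	if err != nil {
		panic(err)
	}
	fmt.Printf("marshaled size: %d bytes (proto.Size: %d)\n", len(b), proto.Size(msg))
}
```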
Prior to #14262, resource watchers would periodically close their watcher, create a new one, and refetch the current set of resources. It turns out that the reverse tunnel subsystem relied on this behavior to periodically broadcast the list of proxies to agents during steady state. Now that watchers are persistent and no longer perform a refetch, an agent that is unable to connect to a proxy expires it after a period of time, and since it never receives the periodic refresh, it never attempts to connect to that proxy again. To remedy this, a new ticker is added to the `localsite` that grabs the current set of proxies from its proxy watcher and sends a discovery request to the agent. The ticker is set to fire before the tracker would expire the proxy, so that if a proxy exists in the cluster, the agent will continually try to connect to it.
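As an illustration of the mechanism, here is a minimal, self-contained Go sketch of a ticker-driven resync loop. The names (`proxyGetter`, `staticProxies`, `resyncProxies`, the `send` callback) are hypothetical stand-ins, not Teleport's actual API; the real logic lives in `localsite` and sends discovery requests over the reverse tunnel.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// proxyGetter abstracts the proxy watcher: it returns the current set of
// proxy names known to the cluster. Hypothetical interface for illustration.
type proxyGetter interface {
	CurrentProxies() []string
}

// staticProxies is a stand-in for the real proxy watcher.
type staticProxies []string

func (s staticProxies) CurrentProxies() []string { return s }

// resyncProxies periodically reads the current proxy set and "broadcasts"
// it via send, mirroring the refresh that the old close-and-refetch watcher
// behavior provided for free. The period must be shorter than the tracker's
// proxy expiry, so an agent never expires a proxy merely because no refresh
// arrived.
func resyncProxies(ctx context.Context, watcher proxyGetter, period time.Duration, send func(proxies []string)) {
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// Grab the current proxies and notify agents so their
			// trackers keep trying to connect to every live proxy.
			send(watcher.CurrentProxies())
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 650*time.Millisecond)
	defer cancel()

	watcher := staticProxies{"proxy-1", "proxy-2"}
	resyncProxies(ctx, watcher, 200*time.Millisecond, func(proxies []string) {
		fmt.Println("discovery request with proxies:", proxies)
	})
}
```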
@rosstimothy See the table below for backport results.
Improve site and trusted cluster logging and availability

* Moves `UpdateTrustedCluster` logging from debug to info so that the default logging level captures when admin operations are performed to establish or remove trust. Alters `remoteSite` so that it logs in the same manner as `localSite`. Cherry-picks some of the availability changes made in #18050 to ensure that agents spawned for trusted clusters are more robust to connection issues.
* Ensure the metric `remote_cluster` reflects current state. The metric wasn't properly updated when remote sites went offline or when remote cluster resources were removed. Any change to the `remoteSite` state or the `remoteCluster` resource is now accurately reflected in the metric.
* Add tracking of outbound connections to remote clusters. The metric `trust_clusters` existed and was exported, but was never used anywhere. Now, when the `RemoteClusterTunnelManager` starts and stops agent pools, it creates and deletes a counter for the cluster. Within the `AgentPool`, the metric is set to the number of connected proxies in `updateConnectedProxies`.
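To illustrate the per-cluster metric lifecycle described in the last bullet, here is a minimal sketch using the standard Prometheus Go client. The metric name, label, and lifecycle shown are assumptions for illustration; Teleport's actual registration and naming may differ.

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// trustedClusters mirrors the per-cluster gauge lifecycle: a label set is
// created when an agent pool starts and deleted when it stops, so the
// metric only reports clusters that actually have a running pool.
var trustedClusters = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "trusted_clusters", // hypothetical name for illustration
		Help: "Number of connected proxies per remote cluster",
	},
	[]string{"cluster"},
)

func main() {
	prometheus.MustRegister(trustedClusters)

	// Agent pool for cluster "leaf" starts: create the series at zero.
	trustedClusters.WithLabelValues("leaf").Set(0)

	// updateConnectedProxies: set the gauge to the connected proxy count.
	trustedClusters.WithLabelValues("leaf").Set(3)

	// Agent pool stops: delete the label set so the series disappears
	// instead of lingering at a stale value.
	trustedClusters.DeleteLabelValues("leaf")

	fmt.Println("gauge lifecycle demo complete")
}
```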