Multiple fixes for k8s forwarder #5038

Merged 5 commits on Dec 8, 2020

@awly awly commented Dec 2, 2020

  1. Init the session uploader in the kubernetes service startup code.
    Without this, the session upload directory is never created, which causes all recordings to fail.

  2. Move the caching layer from the entire clusterSession to just the ephemeral user certs.
    Generating user certs on the fly is the expensive part that we must cache.
    The rest of clusterSession has fields that require extra handling for eviction: for example, remote cluster or kubernetes_service tunnels can disappear. Caching the entire clusterSession has little performance benefit.

  3. Use the process context for emitting audit events, not the request context.
    The request context can get cancelled when the client disconnects, which loses us session.end events.

  4. Clean up the code a bit, mostly by getting rid of the embedding.
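
To illustrate item 2, here is a minimal sketch of a cache that holds only user certificates and nothing request-specific. It is not the forwarder's actual code: `certKey`, `certCache`, `newCertCache`, `defaultTTL`, and the `issue` callback (standing in for the round-trip to the auth server) are all hypothetical names, and the one-hour lifetime is an assumption.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultTTL is an assumed certificate lifetime, chosen for illustration.
const defaultTTL = time.Hour

// certKey identifies whose credentials are cached (hypothetical fields).
type certKey struct {
	user    string
	cluster string
}

type cachedCert struct {
	pem     []byte
	expires time.Time
}

// certCache caches only the expensive-to-create user certificates.
// Per-request state, such as the chosen tunnel, is deliberately left out.
type certCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[certKey]cachedCert
	// issue stands in for the round-trip to the auth server.
	issue func(certKey) ([]byte, error)
}

func newCertCache(issue func(certKey) ([]byte, error), ttl time.Duration) *certCache {
	return &certCache{ttl: ttl, entries: make(map[certKey]cachedCert), issue: issue}
}

// get returns a cached certificate, or issues and stores a new one when
// the entry is missing or expired. Holding the lock while issuing keeps
// the sketch simple at the cost of serializing issuance.
func (c *certCache) get(k certKey) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[k]; ok && time.Now().Before(e.expires) {
		return e.pem, nil
	}
	pem, err := c.issue(k)
	if err != nil {
		return nil, err
	}
	c.entries[k] = cachedCert{pem: pem, expires: time.Now().Add(c.ttl)}
	return pem, nil
}

func main() {
	calls := 0
	cache := newCertCache(func(k certKey) ([]byte, error) {
		calls++
		return []byte("cert for " + k.user), nil
	}, defaultTTL)
	k := certKey{user: "alice", cluster: "prod"}
	for i := 0; i < 3; i++ {
		if _, err := cache.get(k); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("issuer calls:", calls) // three requests, one issuance
}
```

Because the cached value carries no reference to a tunnel or remote cluster, eviction only has to consider certificate expiry.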

Updates #5014

@@ -81,6 +81,12 @@ func (process *TeleportProcess) initKubernetesService(log *logrus.Entry, conn *C
return trace.Wrap(err)
}

// Start uploader that will scan a path on disk and upload completed
// sessions to the Auth Server.
if err := process.initUploaderService(accessPoint, conn.Client); err != nil {

Has this been moved here from somewhere or was it just missing and it was an error/omission?

It was missing here due to an omission.
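
As a rough illustration of why the missing startup step breaks recordings, the sketch below writes a recording file with and without first creating the target directory. It does not reproduce `initUploaderService`: the helper names (`ensureUploadDir`, `writeRecording`, `demo`) and the `log/upload/sessions` layout are assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureUploadDir creates the directory that recordings are written to
// before upload. The "log/upload/sessions" layout is an assumption.
func ensureUploadDir(dataDir string) (string, error) {
	dir := filepath.Join(dataDir, "log", "upload", "sessions")
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", fmt.Errorf("creating upload dir: %w", err)
	}
	return dir, nil
}

// writeRecording mimics a session recorder: it expects the directory
// to exist already and fails otherwise.
func writeRecording(dir, sessionID string) error {
	path := filepath.Join(dir, sessionID+".tar")
	return os.WriteFile(path, []byte("recording"), 0o600)
}

// demo returns the error seen when the startup step is skipped and the
// error seen when it runs first.
func demo() (withoutInit, withInit error) {
	dataDir, err := os.MkdirTemp("", "kube-service-sketch")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(dataDir)

	// Startup step skipped: the directory does not exist yet.
	missing := filepath.Join(dataDir, "log", "upload", "sessions")
	withoutInit = writeRecording(missing, "session-1")

	// Startup step performed: the same write now succeeds.
	dir, err := ensureUploadDir(dataDir)
	if err != nil {
		return withoutInit, err
	}
	withInit = writeRecording(dir, "session-1")
	return withoutInit, withInit
}

func main() {
	withoutInit, withInit := demo()
	fmt.Println("without init:", withoutInit)
	fmt.Println("with init:", withInit)
}
```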

@awly awly force-pushed the andrew/kube-forwarder-fixes branch from dc8ee86 to ea847ed Compare December 4, 2020 22:25
Andrew Lytvynov added 4 commits December 8, 2020 09:19
* kube: emit audit events using process context

Using the request context can prevent audit events from getting emitted,
if client disconnected and request context got closed.
We shouldn't be losing audit events like that.

Also, log all response errors from exec handler.

* kube: cleanup forwarder code

Rename a few config fields to be more descriptive.
Avoid embedding unless necessary, to keep the package API clean.

* kube: cache only user certificates, not the entire session

The expensive part that we need to cache is the client certificate.
Making a new one requires a round-trip to the auth server, plus entropy
for crypto operations.

The rest of clusterSession contains request-specific state, and only
adds problems if cached.
For example: clusterSession stores a reference to a remote teleport
cluster (if needed); caching requires extra logic to invalidate the
session when that cluster disappears (or tunnels drop out). Same problem
happens with kubernetes_service tunnels.

Instead, the forwarder now picks a new target for each request from the
same user, providing a kind of "load-balancing".

* Init session uploader in kubernetes service

It's started in all other services that upload sessions (app/proxy/ssh),
but was missing here. Because of this, the session storage directory for
async uploads wasn't created on disk and caused interactive sessions to
fail.
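
The audit-event fix described above (emit with the process context rather than the request context) can be sketched as follows. The `emitter` type and `demo` function are hypothetical stand-ins, not the forwarder's real audit client; the only assumption about the real client is that, like most network clients, it refuses to send once its context is cancelled.

```go
package main

import (
	"context"
	"fmt"
)

// emitter is a stand-in for an audit log client. It refuses to record
// an event once the supplied context is done.
type emitter struct{ events []string }

func (e *emitter) emit(ctx context.Context, event string) error {
	if err := ctx.Err(); err != nil {
		return fmt.Errorf("emitting %q: %w", event, err)
	}
	e.events = append(e.events, event)
	return nil
}

// demo simulates a client disconnect and reports what happens when
// session.end is emitted with each kind of context.
func demo() (withRequestCtx, withProcessCtx error) {
	// processCtx lives as long as the service itself.
	processCtx, stop := context.WithCancel(context.Background())
	defer stop()

	// requestCtx is cancelled as soon as the client goes away.
	requestCtx, disconnect := context.WithCancel(processCtx)
	disconnect()

	log := &emitter{}
	withRequestCtx = log.emit(requestCtx, "session.end")
	withProcessCtx = log.emit(processCtx, "session.end")
	return withRequestCtx, withProcessCtx
}

func main() {
	withRequestCtx, withProcessCtx := demo()
	fmt.Println("request context:", withRequestCtx)
	fmt.Println("process context:", withProcessCtx)
}
```

With the request context the event is dropped after the disconnect, which is exactly the moment session.end needs to be written; the process context stays valid until the service shuts down.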
@awly awly force-pushed the andrew/kube-forwarder-fixes branch from ea847ed to 9635274 Compare December 8, 2020 17:24
@awly awly merged commit 3fa6904 into master Dec 8, 2020
@awly awly deleted the andrew/kube-forwarder-fixes branch December 8, 2020 19:12
awly pushed a commit that referenced this pull request Dec 8, 2020
* kube: emit audit events using process context

Using the request context can prevent audit events from getting emitted,
if client disconnected and request context got closed.
We shouldn't be losing audit events like that.

Also, log all response errors from exec handler.

* kube: cleanup forwarder code

Rename a few config fields to be more descriptive.
Avoid embedding unless necessary, to keep the package API clean.

* kube: cache only user certificates, not the entire session

The expensive part that we need to cache is the client certificate.
Making a new one requires a round-trip to the auth server, plus entropy
for crypto operations.

The rest of clusterSession contains request-specific state, and only
adds problems if cached.
For example: clusterSession stores a reference to a remote teleport
cluster (if needed); caching requires extra logic to invalidate the
session when that cluster disappears (or tunnels drop out). Same problem
happens with kubernetes_service tunnels.

Instead, the forwarder now picks a new target for each request from the
same user, providing a kind of "load-balancing".

* Init session uploader in kubernetes service

It's started in all other services that upload sessions (app/proxy/ssh),
but was missing here. Because of this, the session storage directory for
async uploads wasn't created on disk and caused interactive sessions to
fail.
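
The per-request target selection mentioned in the caching commit could look roughly like the sketch below. The commit message does not say how a target is chosen, so the uniform random choice, the `pickTarget` helper, and the target names are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// pickTarget chooses one target per request. intn is injected so the
// choice can be made deterministic in tests; rand.Intn is used in main.
func pickTarget(targets []string, intn func(n int) int) (string, error) {
	if len(targets) == 0 {
		return "", errors.New("no targets available")
	}
	return targets[intn(len(targets))], nil
}

func main() {
	// Hypothetical endpoints serving the same kubernetes cluster.
	targets := []string{"kube-service-a", "kube-service-b", "kube-service-c"}
	for i := 0; i < 3; i++ {
		target, err := pickTarget(targets, rand.Intn)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("request %d -> %s\n", i, target)
	}
}
```

Since nothing about the chosen target is cached, a target that disappears simply stops being a candidate on the next request and needs no invalidation logic.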
awly pushed a commit that referenced this pull request Dec 10, 2020