-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement a functional prototype #2
Milestone
Comments
Merged
awly
pushed a commit
that referenced
this issue
Jul 17, 2020
When running `tsh ssh foo@bar cmd` we end up dialing `bar` twice - once to (maybe) start port forwarding and a second time to execute `cmd`. Instead, reuse the first connection to run `cmd` and only fall back to re-dialing if we're matching multiple nodes by label. This gives ~20-30% speedup for non-interactive commands (useful for tools like ansible): ``` > hyperfine 'tsh ssh localhost true' '~/src/teleport/build/tsh ssh localhost true' Benchmark #1: tsh ssh localhost true Time (mean ± σ): 65.5 ms ± 5.0 ms [User: 12.9 ms, System: 6.1 ms] Range (min … max): 57.0 ms … 74.2 ms 41 runs Benchmark #2: ~/src/teleport/build/tsh ssh localhost true Time (mean ± σ): 51.7 ms ± 3.2 ms [User: 9.0 ms, System: 5.0 ms] Range (min … max): 48.5 ms … 68.5 ms 57 runs Summary '~/src/teleport/build/tsh ssh localhost true' ran 1.27 ± 0.12 times faster than 'tsh ssh localhost true' ```
awly
pushed a commit
that referenced
this issue
Jul 20, 2020
When running `tsh ssh foo@bar cmd` we end up dialing `bar` twice - once to (maybe) start port forwarding and a second time to execute `cmd`. Instead, reuse the first connection to run `cmd` and only fall back to re-dialing if we're matching multiple nodes by label. This gives ~20-30% speedup for non-interactive commands (useful for tools like ansible): ``` > hyperfine 'tsh ssh localhost true' '~/src/teleport/build/tsh ssh localhost true' Benchmark #1: tsh ssh localhost true Time (mean ± σ): 65.5 ms ± 5.0 ms [User: 12.9 ms, System: 6.1 ms] Range (min … max): 57.0 ms … 74.2 ms 41 runs Benchmark #2: ~/src/teleport/build/tsh ssh localhost true Time (mean ± σ): 51.7 ms ± 3.2 ms [User: 9.0 ms, System: 5.0 ms] Range (min … max): 48.5 ms … 68.5 ms 57 runs Summary '~/src/teleport/build/tsh ssh localhost true' ran 1.27 ± 0.12 times faster than 'tsh ssh localhost true' ```
awly
pushed a commit
that referenced
this issue
Feb 18, 2021
This is client-side support for requesting single-use certs with an MFA check. The client doesn't know whether they need MFA check when accessing a resource, this is decided during an RBAC check on the server. So a client will always try to get a single-use cert, and the server will respond with NotNeeded if MFA is not required. This is an extra round-trip for every session which causes ~20% slowdown in SSH logins: ``` $ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date' Benchmark #1: /tmp/tsh-old ssh talos date Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms] Range (min … max): 48.4 ms … 54.1 ms 59 runs Benchmark #2: /tmp/tsh-new ssh talos date Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms] Range (min … max): 59.0 ms … 69.7 ms 50 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Summary '/tmp/tsh-old ssh talos date' ran 1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date' ``` Another few other internal changes: - client.LocalKeyAgent will now always have a non-nil LocalKeyStore. Previously, it would be nil (e.g. in a web UI handler or when using an identity file) which easily causes panics. I added a noLocalKeyStore type instead that returns errors from all methods. - requesting a user cert with a TTL < 1min will now succeed and return a 1min cert instead of failing
awly
pushed a commit
that referenced
this issue
Feb 22, 2021
This is client-side support for requesting single-use certs with an MFA check. The client doesn't know whether they need MFA check when accessing a resource, this is decided during an RBAC check on the server. So a client will always try to get a single-use cert, and the server will respond with NotNeeded if MFA is not required. This is an extra round-trip for every session which causes ~20% slowdown in SSH logins: ``` $ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date' Benchmark #1: /tmp/tsh-old ssh talos date Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms] Range (min … max): 48.4 ms … 54.1 ms 59 runs Benchmark #2: /tmp/tsh-new ssh talos date Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms] Range (min … max): 59.0 ms … 69.7 ms 50 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Summary '/tmp/tsh-old ssh talos date' ran 1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date' ``` Another few other internal changes: - client.LocalKeyAgent will now always have a non-nil LocalKeyStore. Previously, it would be nil (e.g. in a web UI handler or when using an identity file) which easily causes panics. I added a noLocalKeyStore type instead that returns errors from all methods. - requesting a user cert with a TTL < 1min will now succeed and return a 1min cert instead of failing
awly
pushed a commit
that referenced
this issue
Feb 26, 2021
This is client-side support for requesting single-use certs with an MFA check. The client doesn't know whether they need MFA check when accessing a resource, this is decided during an RBAC check on the server. So a client will always try to get a single-use cert, and the server will respond with NotNeeded if MFA is not required. This is an extra round-trip for every session which causes ~20% slowdown in SSH logins: ``` $ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date' Benchmark #1: /tmp/tsh-old ssh talos date Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms] Range (min … max): 48.4 ms … 54.1 ms 59 runs Benchmark #2: /tmp/tsh-new ssh talos date Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms] Range (min … max): 59.0 ms … 69.7 ms 50 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Summary '/tmp/tsh-old ssh talos date' ran 1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date' ``` Another few other internal changes: - client.LocalKeyAgent will now always have a non-nil LocalKeyStore. Previously, it would be nil (e.g. in a web UI handler or when using an identity file) which easily causes panics. I added a noLocalKeyStore type instead that returns errors from all methods. - requesting a user cert with a TTL < 1min will now succeed and return a 1min cert instead of failing
awly
pushed a commit
that referenced
this issue
Mar 3, 2021
This is client-side support for requesting single-use certs with an MFA check. The client doesn't know whether they need MFA check when accessing a resource, this is decided during an RBAC check on the server. So a client will always try to get a single-use cert, and the server will respond with NotNeeded if MFA is not required. This is an extra round-trip for every session which causes ~20% slowdown in SSH logins: ``` $ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date' Benchmark #1: /tmp/tsh-old ssh talos date Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms] Range (min … max): 48.4 ms … 54.1 ms 59 runs Benchmark #2: /tmp/tsh-new ssh talos date Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms] Range (min … max): 59.0 ms … 69.7 ms 50 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Summary '/tmp/tsh-old ssh talos date' ran 1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date' ``` Another few other internal changes: - client.LocalKeyAgent will now always have a non-nil LocalKeyStore. Previously, it would be nil (e.g. in a web UI handler or when using an identity file) which easily causes panics. I added a noLocalKeyStore type instead that returns errors from all methods. - requesting a user cert with a TTL < 1min will now succeed and return a 1min cert instead of failing
awly
pushed a commit
that referenced
this issue
Mar 5, 2021
This is client-side support for requesting single-use certs with an MFA check. The client doesn't know whether they need MFA check when accessing a resource, this is decided during an RBAC check on the server. So a client will always try to get a single-use cert, and the server will respond with NotNeeded if MFA is not required. This is an extra round-trip for every session which causes ~20% slowdown in SSH logins: ``` $ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date' Benchmark #1: /tmp/tsh-old ssh talos date Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms] Range (min … max): 48.4 ms … 54.1 ms 59 runs Benchmark #2: /tmp/tsh-new ssh talos date Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms] Range (min … max): 59.0 ms … 69.7 ms 50 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Summary '/tmp/tsh-old ssh talos date' ran 1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date' ``` Another few other internal changes: - client.LocalKeyAgent will now always have a non-nil LocalKeyStore. Previously, it would be nil (e.g. in a web UI handler or when using an identity file) which easily causes panics. I added a noLocalKeyStore type instead that returns errors from all methods. - requesting a user cert with a TTL < 1min will now succeed and return a 1min cert instead of failing
awly
pushed a commit
that referenced
this issue
Mar 10, 2021
This is client-side support for requesting single-use certs with an MFA check. The client doesn't know whether they need MFA check when accessing a resource, this is decided during an RBAC check on the server. So a client will always try to get a single-use cert, and the server will respond with NotNeeded if MFA is not required. This is an extra round-trip for every session which causes ~20% slowdown in SSH logins: ``` $ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date' Benchmark #1: /tmp/tsh-old ssh talos date Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms] Range (min … max): 48.4 ms … 54.1 ms 59 runs Benchmark #2: /tmp/tsh-new ssh talos date Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms] Range (min … max): 59.0 ms … 69.7 ms 50 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Summary '/tmp/tsh-old ssh talos date' ran 1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date' ``` Another few other internal changes: - client.LocalKeyAgent will now always have a non-nil LocalKeyStore. Previously, it would be nil (e.g. in a web UI handler or when using an identity file) which easily causes panics. I added a noLocalKeyStore type instead that returns errors from all methods. - requesting a user cert with a TTL < 1min will now succeed and return a 1min cert instead of failing
awly
pushed a commit
that referenced
this issue
Mar 10, 2021
* mfa: per-session MFA certs for SSH and Kubernetes This is client-side support for requesting single-use certs with an MFA check. The client doesn't know whether they need MFA check when accessing a resource, this is decided during an RBAC check on the server. So a client will always try to get a single-use cert, and the server will respond with NotNeeded if MFA is not required. This is an extra round-trip for every session which causes ~20% slowdown in SSH logins: ``` $ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date' Benchmark #1: /tmp/tsh-old ssh talos date Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms] Range (min … max): 48.4 ms … 54.1 ms 59 runs Benchmark #2: /tmp/tsh-new ssh talos date Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms] Range (min … max): 59.0 ms … 69.7 ms 50 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Summary '/tmp/tsh-old ssh talos date' ran 1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date' ``` Another few other internal changes: - client.LocalKeyAgent will now always have a non-nil LocalKeyStore. Previously, it would be nil (e.g. in a web UI handler or when using an identity file) which easily causes panics. I added a noLocalKeyStore type instead that returns errors from all methods. - requesting a user cert with a TTL < 1min will now succeed and return a 1min cert instead of failing * Capture access approvals on MFA-issued certs * Address review feedback * Address review feedback * mfa: accept unknown nodes during short-term MFA cert creation An unknown node could be an OpenSSH node set up via https://goteleport.com/teleport/docs/openssh-teleport/ In this case, we shouldn't prevent the user from connecting. There's a small risk of authz bypass - an attacker might know a different name/IP for a registered node which Teleport doesn't know about. But a Teleport node will still check RBAC and reject the connection. * Validate username against unmapped user identity IssueUserCertsWithMFA is called on the leaf auth server in case of trusted clusters. Username in the request object will be that of the original unmapped caller. * mfa: add IsMFARequired RPC This RPC is ran before every connection to check whether MFA is required. If a connection is against the leaf cluster, this request is forwarded from root to leaf for evaluation. * Fix integration tests * Correctly treat "Username" as login name in IsMFARequired Also, move the logic into auth.Server out of ServerWithRoles. * Fix TestHA * Address review feedback
awly
pushed a commit
that referenced
this issue
Mar 25, 2021
* mfa: per-session MFA certs for SSH and Kubernetes This is client-side support for requesting single-use certs with an MFA check. The client doesn't know whether they need MFA check when accessing a resource, this is decided during an RBAC check on the server. So a client will always try to get a single-use cert, and the server will respond with NotNeeded if MFA is not required. This is an extra round-trip for every session which causes ~20% slowdown in SSH logins: ``` $ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date' Benchmark #1: /tmp/tsh-old ssh talos date Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms] Range (min … max): 48.4 ms … 54.1 ms 59 runs Benchmark #2: /tmp/tsh-new ssh talos date Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms] Range (min … max): 59.0 ms … 69.7 ms 50 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Summary '/tmp/tsh-old ssh talos date' ran 1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date' ``` Another few other internal changes: - client.LocalKeyAgent will now always have a non-nil LocalKeyStore. Previously, it would be nil (e.g. in a web UI handler or when using an identity file) which easily causes panics. I added a noLocalKeyStore type instead that returns errors from all methods. - requesting a user cert with a TTL < 1min will now succeed and return a 1min cert instead of failing * Capture access approvals on MFA-issued certs * Address review feedback * Address review feedback * mfa: accept unknown nodes during short-term MFA cert creation An unknown node could be an OpenSSH node set up via https://goteleport.com/teleport/docs/openssh-teleport/ In this case, we shouldn't prevent the user from connecting. There's a small risk of authz bypass - an attacker might know a different name/IP for a registered node which Teleport doesn't know about. But a Teleport node will still check RBAC and reject the connection. * Validate username against unmapped user identity IssueUserCertsWithMFA is called on the leaf auth server in case of trusted clusters. Username in the request object will be that of the original unmapped caller. * mfa: add IsMFARequired RPC This RPC is ran before every connection to check whether MFA is required. If a connection is against the leaf cluster, this request is forwarded from root to leaf for evaluation. * Fix integration tests * Correctly treat "Username" as login name in IsMFARequired Also, move the logic into auth.Server out of ServerWithRoles. * Fix TestHA * Address review feedback
awly
pushed a commit
that referenced
this issue
Mar 29, 2021
* mfa: per-session MFA certs for SSH and Kubernetes This is client-side support for requesting single-use certs with an MFA check. The client doesn't know whether they need MFA check when accessing a resource, this is decided during an RBAC check on the server. So a client will always try to get a single-use cert, and the server will respond with NotNeeded if MFA is not required. This is an extra round-trip for every session which causes ~20% slowdown in SSH logins: ``` $ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date' Benchmark #1: /tmp/tsh-old ssh talos date Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms] Range (min … max): 48.4 ms … 54.1 ms 59 runs Benchmark #2: /tmp/tsh-new ssh talos date Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms] Range (min … max): 59.0 ms … 69.7 ms 50 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Summary '/tmp/tsh-old ssh talos date' ran 1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date' ``` Another few other internal changes: - client.LocalKeyAgent will now always have a non-nil LocalKeyStore. Previously, it would be nil (e.g. in a web UI handler or when using an identity file) which easily causes panics. I added a noLocalKeyStore type instead that returns errors from all methods. - requesting a user cert with a TTL < 1min will now succeed and return a 1min cert instead of failing * Capture access approvals on MFA-issued certs * Address review feedback * Address review feedback * mfa: accept unknown nodes during short-term MFA cert creation An unknown node could be an OpenSSH node set up via https://goteleport.com/teleport/docs/openssh-teleport/ In this case, we shouldn't prevent the user from connecting. There's a small risk of authz bypass - an attacker might know a different name/IP for a registered node which Teleport doesn't know about. But a Teleport node will still check RBAC and reject the connection. * Validate username against unmapped user identity IssueUserCertsWithMFA is called on the leaf auth server in case of trusted clusters. Username in the request object will be that of the original unmapped caller. * mfa: add IsMFARequired RPC This RPC is ran before every connection to check whether MFA is required. If a connection is against the leaf cluster, this request is forwarded from root to leaf for evaluation. * Fix integration tests * Correctly treat "Username" as login name in IsMFARequired Also, move the logic into auth.Server out of ServerWithRoles. * Fix TestHA * Address review feedback
awly
pushed a commit
that referenced
this issue
Mar 29, 2021
* mfa: per-session MFA certs for SSH and Kubernetes This is client-side support for requesting single-use certs with an MFA check. The client doesn't know whether they need MFA check when accessing a resource, this is decided during an RBAC check on the server. So a client will always try to get a single-use cert, and the server will respond with NotNeeded if MFA is not required. This is an extra round-trip for every session which causes ~20% slowdown in SSH logins: ``` $ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date' Benchmark #1: /tmp/tsh-old ssh talos date Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms] Range (min … max): 48.4 ms … 54.1 ms 59 runs Benchmark #2: /tmp/tsh-new ssh talos date Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms] Range (min … max): 59.0 ms … 69.7 ms 50 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Summary '/tmp/tsh-old ssh talos date' ran 1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date' ``` Another few other internal changes: - client.LocalKeyAgent will now always have a non-nil LocalKeyStore. Previously, it would be nil (e.g. in a web UI handler or when using an identity file) which easily causes panics. I added a noLocalKeyStore type instead that returns errors from all methods. - requesting a user cert with a TTL < 1min will now succeed and return a 1min cert instead of failing * Capture access approvals on MFA-issued certs * Address review feedback * Address review feedback * mfa: accept unknown nodes during short-term MFA cert creation An unknown node could be an OpenSSH node set up via https://goteleport.com/teleport/docs/openssh-teleport/ In this case, we shouldn't prevent the user from connecting. There's a small risk of authz bypass - an attacker might know a different name/IP for a registered node which Teleport doesn't know about. But a Teleport node will still check RBAC and reject the connection. * Validate username against unmapped user identity IssueUserCertsWithMFA is called on the leaf auth server in case of trusted clusters. Username in the request object will be that of the original unmapped caller. * mfa: add IsMFARequired RPC This RPC is ran before every connection to check whether MFA is required. If a connection is against the leaf cluster, this request is forwarded from root to leaf for evaluation. * Fix integration tests * Correctly treat "Username" as login name in IsMFARequired Also, move the logic into auth.Server out of ServerWithRoles. * Fix TestHA * Address review feedback
Merged
inertial-frame
added a commit
that referenced
this issue
Jun 3, 2021
* docs: home section except for admin guide * docs: adminguide, ttl, oss * docs: corrected * docs: requested changes * docs: update product name for 6.2+ * docs: make ha references uniform * docs: make ha references uniform * docs: most of the reference section * docs: improve bash snippets * docs: add tiles to api landing * docs: typo * docs: bash corrections * docs: corrections * docs: update cli-docs
hatched
pushed a commit
to hatched/teleport-merge
that referenced
this issue
Nov 30, 2022
xacrimon
added a commit
that referenced
this issue
Mar 1, 2023
xacrimon
added a commit
that referenced
this issue
Mar 3, 2023
Closed
jentfoo
added a commit
that referenced
this issue
Apr 14, 2023
This commit is an extension to what was done in #172. And is designed to fix #4330 and https://github.com/gravitational/teleport-private/issues/403. Rather than audit endpoints and choose what endpoints should be rate limited, this commit proposes that for safety and reduced cognitive load, all unauthenticated endpoints become rate limited. The primary concern in this type of change would be if our rate limit becomes too aggressive for general use. There are two considered strategies to make sure this does not become impacting: 1. Adjust the rate limiter so the rate limit becomes endpoint specific. This would avoid the need to consider how activity on one endpoint effects another. 2. Accept that rate limit interactions are possible and instead ensure rate limits are high enough to avoid this concern. This commit chooses option #2. While #1 has advantages, particularly as endpoints and new use cases are added. #2 provides the strictest and safest rate limits. Our rate limits were configured to: period: 1 min avg rate: 10 burst rate: 20 In order to build a safety buffer with option #2 those allowed rates were doubled.
jentfoo
added a commit
that referenced
this issue
Apr 18, 2023
This commit is an extension to what was done in #172. And is designed to fix #4330 and https://github.com/gravitational/teleport-private/issues/403. Rather than audit endpoints and choose what endpoints should be rate limited, this commit proposes that for safety and reduced cognitive load, all unauthenticated endpoints become rate limited. The primary concern in this type of change would be if our rate limit becomes too aggressive for general use. There are two considered strategies to make sure this does not become impacting: 1. Adjust the rate limiter so the rate limit becomes endpoint specific. This would avoid the need to consider how activity on one endpoint effects another. 2. Accept that rate limit interactions are possible and instead ensure rate limits are high enough to avoid this concern. This commit chooses option #2. While #1 has advantages, particularly as endpoints and new use cases are added. #2 provides the strictest and safest rate limits. Our rate limits were configured to: period: 1 min avg rate: 10 burst rate: 20 In order to build a safety buffer with option #2 those allowed rates were doubled.
jentfoo
added a commit
that referenced
this issue
Apr 25, 2023
This commit is an extension to what was done in #172. And is designed to fix #4330 and https://github.com/gravitational/teleport-private/issues/403. Rather than audit endpoints and choose what endpoints should be rate limited, this commit proposes that for safety and reduced cognitive load, all unauthenticated endpoints become rate limited. The primary concern in this type of change would be if our rate limit becomes too aggressive for general use. There are two considered strategies to make sure this does not become impacting: 1. Adjust the rate limiter so the rate limit becomes endpoint specific. This would avoid the need to consider how activity on one endpoint effects another. 2. Accept that rate limit interactions are possible and instead ensure rate limits are high enough to avoid this concern. This commit chooses option #2. While #1 has advantages, particularly as endpoints and new use cases are added. #2 provides the strictest and safest rate limits. Our rate limits were configured to: period: 1 min avg rate: 10 burst rate: 20 In order to build a safety buffer with option #2 those allowed rates were doubled. Additionally the ability to avoid rate limits by authenticating your request (even if the endpoint is otherwise unauthenticated) was added. This is particularly useful for the `ping` endpoint which may have high levels of activity on large clusters, but which has a portion of that activity over authenticated requests.
jentfoo
added a commit
that referenced
this issue
Apr 27, 2023
This commit is an extension to what was done in #172. And is designed to fix #4330 and https://github.com/gravitational/teleport-private/issues/403. Rather than audit endpoints and choose what endpoints should be rate limited, this commit proposes that for safety and reduced cognitive load, all unauthenticated endpoints become rate limited. The primary concern in this type of change would be if our rate limit becomes too aggressive for general use. There are two considered strategies to make sure this does not become impacting: 1. Adjust the rate limiter so the rate limit becomes endpoint specific. This would avoid the need to consider how activity on one endpoint effects another. 2. Accept that rate limit interactions are possible and instead ensure rate limits are high enough to avoid this concern. This commit chooses option #2. While #1 has advantages, particularly as endpoints and new use cases are added. #2 provides the strictest and safest rate limits. Our rate limits were configured to: period: 1 min avg rate: 10 burst rate: 20 In order to build a safety buffer with option #2 those allowed rates were doubled. Additionally the ability to avoid rate limits by authenticating your request (even if the endpoint is otherwise unauthenticated) was added. This is particularly useful for the `ping` endpoint which may have high levels of activity on large clusters, but which has a portion of that activity over authenticated requests.
jentfoo
added a commit
that referenced
this issue
May 1, 2023
* Rate limit all unauthenticated HTTP endpoints This commit is an extension to what was done in #172. And is designed to fix #4330 and https://github.com/gravitational/teleport-private/issues/403. Rather than audit endpoints and choose what endpoints should be rate limited, this commit proposes that for safety and reduced cognitive load, all unauthenticated endpoints become rate limited. The primary concern in this type of change would be if our rate limit becomes too aggressive for general use. There are two considered strategies to make sure this does not become impacting: 1. Adjust the rate limiter so the rate limit becomes endpoint specific. This would avoid the need to consider how activity on one endpoint effects another. 2. Accept that rate limit interactions are possible and instead ensure rate limits are high enough to avoid this concern. This commit chooses option #2. While #1 has advantages, particularly as endpoints and new use cases are added. #2 provides the strictest and safest rate limits. Our rate limits were configured to: period: 1 min avg rate: 10 burst rate: 20 In order to build a safety buffer with option #2 those allowed rates were doubled. Additionally the ability to avoid rate limits by authenticating your request (even if the endpoint is otherwise unauthenticated) was added. This is particularly useful for the `ping` endpoint which may have high levels of activity on large clusters, but which has a portion of that activity over authenticated requests. * Add additional `High` Rate Limiting This new `High` rate limit is designed for endpoints which are only CPU bound (and thus don't have as significant of DoS risks). Initially this was motivated for `ping` and `find` due to the concern that these endpoints are used unauthenticated at login, and potential NAT's may result in very high rates from single egress IP's. In my testing on my laptop, all of these endpoints can easily get 640/req/sec on a single core within a VM. Setting the maximum of 480 burst and 120 continuous should both ensure that no single source utilizes all the CPU, as well as build in additional safety margins while providing a layer of protection. * Fix for missing error check
nick-inkeep
pushed a commit
to nick-inkeep/teleport-docs
that referenced
this issue
Jun 20, 2023
Closed
zmb3
added a commit
that referenced
this issue
Jan 6, 2024
If Teleport loses it's connection to the LDAP server, it will attempt to initiate a new condition when: 1. The user tries to connect to a desktop and Teleport fails to obtain the user's SID. 2. The periodic desktop discovery routine attempts to search LDAP for desktops. In some circumstances, #2 never gets the chance to apply, since discovery is skipped when LDAP is not ready. Additionally, if LDAP is not ready, then you can't connect to a desktop, so #1 can't happen either, which means Teleport won't connect again until it is restarted.
Closed
13 tasks
zmb3
added a commit
that referenced
this issue
Oct 28, 2024
If Teleport loses it's connection to the LDAP server, it will attempt to initiate a new condition when: 1. The user tries to connect to a desktop and Teleport fails to obtain the user's SID. 2. The periodic desktop discovery routine attempts to search LDAP for desktops. In some circumstances, #2 never gets the chance to apply, since discovery is skipped when LDAP is not ready. Additionally, if LDAP is not ready, then you can't connect to a desktop, so #1 can't happen either, which means Teleport won't connect again until it is restarted.
github-merge-queue bot
pushed a commit
that referenced
this issue
Oct 28, 2024
* Always attempt desktop discovery, even if LDAP is not ready If Teleport loses it's connection to the LDAP server, it will attempt to initiate a new condition when: 1. The user tries to connect to a desktop and Teleport fails to obtain the user's SID. 2. The periodic desktop discovery routine attempts to search LDAP for desktops. In some circumstances, #2 never gets the chance to apply, since discovery is skipped when LDAP is not ready. Additionally, if LDAP is not ready, then you can't connect to a desktop, so #1 can't happen either, which means Teleport won't connect again until it is restarted. * Periodically use the LDAP connection when discovery is not enabled If LDAP-based discovery is not enabled then we may go long periods of time without trying to use the LDAP connection, which prevents us from detecting disconnects (and restoring the connection) in a timely manner. When discovery is disabled, perform a read every 5 minutes and reconnect if we detect a connection problem. * Address review comments
github-actions bot
pushed a commit
that referenced
this issue
Oct 28, 2024
If Teleport loses it's connection to the LDAP server, it will attempt to initiate a new condition when: 1. The user tries to connect to a desktop and Teleport fails to obtain the user's SID. 2. The periodic desktop discovery routine attempts to search LDAP for desktops. In some circumstances, #2 never gets the chance to apply, since discovery is skipped when LDAP is not ready. Additionally, if LDAP is not ready, then you can't connect to a desktop, so #1 can't happen either, which means Teleport won't connect again until it is restarted.
github-actions bot
pushed a commit
that referenced
this issue
Oct 28, 2024
If Teleport loses it's connection to the LDAP server, it will attempt to initiate a new condition when: 1. The user tries to connect to a desktop and Teleport fails to obtain the user's SID. 2. The periodic desktop discovery routine attempts to search LDAP for desktops. In some circumstances, #2 never gets the chance to apply, since discovery is skipped when LDAP is not ready. Additionally, if LDAP is not ready, then you can't connect to a desktop, so #1 can't happen either, which means Teleport won't connect again until it is restarted.
github-merge-queue bot
pushed a commit
that referenced
this issue
Oct 29, 2024
* Always attempt desktop discovery, even if LDAP is not ready If Teleport loses it's connection to the LDAP server, it will attempt to initiate a new condition when: 1. The user tries to connect to a desktop and Teleport fails to obtain the user's SID. 2. The periodic desktop discovery routine attempts to search LDAP for desktops. In some circumstances, #2 never gets the chance to apply, since discovery is skipped when LDAP is not ready. Additionally, if LDAP is not ready, then you can't connect to a desktop, so #1 can't happen either, which means Teleport won't connect again until it is restarted. * Periodically use the LDAP connection when discovery is not enabled If LDAP-based discovery is not enabled then we may go long periods of time without trying to use the LDAP connection, which prevents us from detecting disconnects (and restoring the connection) in a timely manner. When discovery is disabled, perform a read every 5 minutes and reconnect if we detect a connection problem. * Address review comments * Fix some LDAP connection bugs In #36281 we made some improvements to the LDAP reconnect behavior. These changes considered the case where we had a connection to the LDAP server but then got disconnected. They did not consider the case where we never succesfully established a connection at all.
github-merge-queue bot
pushed a commit
that referenced
this issue
Nov 18, 2024
* Always attempt desktop discovery, even if LDAP is not ready If Teleport loses it's connection to the LDAP server, it will attempt to initiate a new condition when: 1. The user tries to connect to a desktop and Teleport fails to obtain the user's SID. 2. The periodic desktop discovery routine attempts to search LDAP for desktops. In some circumstances, #2 never gets the chance to apply, since discovery is skipped when LDAP is not ready. Additionally, if LDAP is not ready, then you can't connect to a desktop, so #1 can't happen either, which means Teleport won't connect again until it is restarted. * Periodically use the LDAP connection when discovery is not enabled If LDAP-based discovery is not enabled then we may go long periods of time without trying to use the LDAP connection, which prevents us from detecting disconnects (and restoring the connection) in a timely manner. When discovery is disabled, perform a read every 5 minutes and reconnect if we detect a connection problem. * Address review comments * Fix some LDAP connection bugs In #36281 we made some improvements to the LDAP reconnect behavior. These changes considered the case where we had a connection to the LDAP server but then got disconnected. They did not consider the case where we never succesfully established a connection at all. * Fix typo --------- Co-authored-by: Gus Luxton <gus@goteleport.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The prototype should have the following capabilities:
The text was updated successfully, but these errors were encountered: