Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Teleport does not return usable errors #172

Closed
kontsevoy opened this issue Feb 27, 2016 · 3 comments
Closed

Teleport does not return usable errors #172

kontsevoy opened this issue Feb 27, 2016 · 3 comments
Assignees

Comments

@kontsevoy
Copy link
Contributor

We need a better error story. Something quick, cheap and user-friendly. Currently our error messages are basically concatenated log entries.

Example

My teleport server already has a user named ekontsevoy. When I try to invite myself again:

> tctl users invite ekontsevoy

I get this back printed into console:

edsger ~/go/src/github.com/gravitational/teleport: out/tctl users invite ekontsevoy
ERROR: bad parameter 'errorcode', unrecognized http error: <nil> 500 {"path":"/home/ekontsevoy/go/src/github.com/gravitational/teleport/lib/auth/srv.go","func":"github.com/gravitational/teleport/lib/auth.(*APIServer).createSignupToken","line":644,"Message":""}

A number of issues here:

  • No need to show source files and line numbers. End users do not care.
  • Only high level error needs to be shown, not the "guts" of the problem.
  • In this case something else entirely is going on though

Desired outcome

Clean error messages like:

  • user foo already exists
  • user foo not found
@kontsevoy
Copy link
Contributor Author

Also, something has changed in trace implementation and unwrap() function no longer works in lib/utils/cli.go. (trace.Error.OrigError() points to itself (!) leading to endless loop potential) I am leaving this as-is since changing it should probably conform to whatever error reporting mechanism you decide to use.

My guess it's going to be something similar you're doing with Web UI

@kontsevoy
Copy link
Contributor Author

One more example. I am logging in via tsh using a fresh database (no users) and I expect a typical nice "unrecognized username/password error" without any references to the source code.

Instead, on a server I see:

ERRO[0542] Password auth error: bucket web not found     file=auth/tun.go:350

On the client:

ERRO[0004] bad parameter 'errorcode', unrecognized http error: <nil> 500 {"path":"/home/ekontsevoy/go/src/github.com/gravitational/teleport/lib/web/web.go","func":"github.com/gravitational/teleport/lib/web.(*Handler).createSSHCert","line":373,"Message":""}  file=client/api.go:372

@kontsevoy
Copy link
Contributor Author

Another example. I am connecting to a non-existing server. Expecting a nice message "cannot locate remote host XXX" (actually, why wouldn't proxy try to connect to it anyway? this is a subject of another ticket)

Instead I see this in the server logs:

INFO[1151] close session request: sess(127.0.0.1:48118->127.0.0.1:3023, user=ekontsevoy, id=6) failed to execute request, err: [srv/proxy.go:74]  server XXX:3022 not found  component=proxy fields=map[local:127.0.0.1:3023 remote:127.0.0.1:48118 user:ekontsevoy id:6] file=srv/srv.go:498

On the client:

ERROR: [client/api.go:157]  [client/client.go:227]  ssh: handshake failed: EOF

klizhentas added a commit that referenced this issue Mar 8, 2016
kimlisa pushed a commit that referenced this issue Oct 12, 2020
435fea6 [teleport] Add session.reject, trusted_cluster.create/delete events (#172) gravitational/webapps@435fea6
kimlisa pushed a commit that referenced this issue Oct 12, 2020
435fea6 [teleport] Add session.reject, trusted_cluster.create/delete events (#172) gravitational/webapps@435fea6
alex-kovoy pushed a commit that referenced this issue Oct 12, 2020
435fea6 [teleport] Add session.reject, trusted_cluster.create/delete events (#172) gravitational/webapps@435fea6
kimlisa pushed a commit that referenced this issue Oct 12, 2020
760f8fd [teleport] Add session.reject, trusted_cluster.create/delete events (#172) gravitational/webapps@760f8fd
kimlisa pushed a commit that referenced this issue Oct 12, 2020
435fea6 [teleport] Add session.reject, trusted_cluster.create/delete events (#172) gravitational/webapps@435fea6 \n -w master -t lisa/test
kimlisa pushed a commit that referenced this issue Oct 12, 2020
435fea6 [teleport] Add session.reject, trusted_cluster.create/delete events (#172) gravitational/webapps@435fea6 <br><br>
kimlisa pushed a commit that referenced this issue Oct 12, 2020
435fea6 [teleport] Add session.reject, trusted_cluster.create/delete events (#172) gravitational/webapps@435fea6 <br><br> -w master -t lisa/test
alex-kovoy pushed a commit that referenced this issue Oct 12, 2020
760f8fd [teleport] Add session.reject, trusted_cluster.create/delete events (#172) gravitational/webapps@760f8fd
jentfoo added a commit that referenced this issue Apr 14, 2023
This commit is an extension to what was done in #172. And is designed to fix #4330 and https://github.com/gravitational/teleport-private/issues/403.

Rather than audit endpoints and choose what endpoints should be rate limited, this commit proposes that for safety and reduced cognitive load, all unauthenticated endpoints become rate limited.

The primary concern in this type of change would be if our rate limit becomes too aggressive for general use.  There are two considered strategies to make sure this does not become impacting:
1. Adjust the rate limiter so the rate limit becomes endpoint specific.  This would avoid the need to consider how activity on one endpoint effects another.
2. Accept that rate limit interactions are possible and instead ensure rate limits are high enough to avoid this concern.

This commit chooses option #2.  While #1 has advantages, particularly as endpoints and new use cases are added.  #2 provides the strictest and safest rate limits.

Our rate limits were configured to:
period: 1 min
avg rate: 10
burst rate: 20

In order to build a safety buffer with option #2 those allowed rates were doubled.
jentfoo added a commit that referenced this issue Apr 18, 2023
This commit is an extension to what was done in #172. And is designed to fix #4330 and https://github.com/gravitational/teleport-private/issues/403.

Rather than audit endpoints and choose what endpoints should be rate limited, this commit proposes that for safety and reduced cognitive load, all unauthenticated endpoints become rate limited.

The primary concern in this type of change would be if our rate limit becomes too aggressive for general use.  There are two considered strategies to make sure this does not become impacting:
1. Adjust the rate limiter so the rate limit becomes endpoint specific.  This would avoid the need to consider how activity on one endpoint effects another.
2. Accept that rate limit interactions are possible and instead ensure rate limits are high enough to avoid this concern.

This commit chooses option #2.  While #1 has advantages, particularly as endpoints and new use cases are added.  #2 provides the strictest and safest rate limits.

Our rate limits were configured to:
period: 1 min
avg rate: 10
burst rate: 20

In order to build a safety buffer with option #2 those allowed rates were doubled.
jentfoo added a commit that referenced this issue Apr 25, 2023
This commit is an extension to what was done in #172. And is designed to fix #4330 and https://github.com/gravitational/teleport-private/issues/403.

Rather than audit endpoints and choose what endpoints should be rate limited, this commit proposes that for safety and reduced cognitive load, all unauthenticated endpoints become rate limited.

The primary concern in this type of change would be if our rate limit becomes too aggressive for general use.  There are two considered strategies to make sure this does not become impacting:
1. Adjust the rate limiter so the rate limit becomes endpoint specific.  This would avoid the need to consider how activity on one endpoint effects another.
2. Accept that rate limit interactions are possible and instead ensure rate limits are high enough to avoid this concern.

This commit chooses option #2.  While #1 has advantages, particularly as endpoints and new use cases are added.  #2 provides the strictest and safest rate limits.

Our rate limits were configured to:
period: 1 min
avg rate: 10
burst rate: 20

In order to build a safety buffer with option #2 those allowed rates were doubled.

Additionally the ability to avoid rate limits by authenticating your request (even if the endpoint is otherwise unauthenticated) was added.  This is particularly useful for the `ping` endpoint which may have high levels of activity on large clusters, but which has a portion of that activity over authenticated requests.
jentfoo added a commit that referenced this issue Apr 27, 2023
This commit is an extension to what was done in #172. And is designed to fix #4330 and https://github.com/gravitational/teleport-private/issues/403.

Rather than audit endpoints and choose what endpoints should be rate limited, this commit proposes that for safety and reduced cognitive load, all unauthenticated endpoints become rate limited.

The primary concern in this type of change would be if our rate limit becomes too aggressive for general use.  There are two considered strategies to make sure this does not become impacting:
1. Adjust the rate limiter so the rate limit becomes endpoint specific.  This would avoid the need to consider how activity on one endpoint effects another.
2. Accept that rate limit interactions are possible and instead ensure rate limits are high enough to avoid this concern.

This commit chooses option #2.  While #1 has advantages, particularly as endpoints and new use cases are added.  #2 provides the strictest and safest rate limits.

Our rate limits were configured to:
period: 1 min
avg rate: 10
burst rate: 20

In order to build a safety buffer with option #2 those allowed rates were doubled.

Additionally the ability to avoid rate limits by authenticating your request (even if the endpoint is otherwise unauthenticated) was added.  This is particularly useful for the `ping` endpoint which may have high levels of activity on large clusters, but which has a portion of that activity over authenticated requests.
jentfoo added a commit that referenced this issue May 1, 2023
* Rate limit all unauthenticated HTTP endpoints

This commit is an extension to what was done in #172. And is designed to fix #4330 and https://github.com/gravitational/teleport-private/issues/403.

Rather than audit endpoints and choose what endpoints should be rate limited, this commit proposes that for safety and reduced cognitive load, all unauthenticated endpoints become rate limited.

The primary concern in this type of change would be if our rate limit becomes too aggressive for general use.  There are two considered strategies to make sure this does not become impacting:
1. Adjust the rate limiter so the rate limit becomes endpoint specific.  This would avoid the need to consider how activity on one endpoint effects another.
2. Accept that rate limit interactions are possible and instead ensure rate limits are high enough to avoid this concern.

This commit chooses option #2.  While #1 has advantages, particularly as endpoints and new use cases are added.  #2 provides the strictest and safest rate limits.

Our rate limits were configured to:
period: 1 min
avg rate: 10
burst rate: 20

In order to build a safety buffer with option #2 those allowed rates were doubled.

Additionally the ability to avoid rate limits by authenticating your request (even if the endpoint is otherwise unauthenticated) was added.  This is particularly useful for the `ping` endpoint which may have high levels of activity on large clusters, but which has a portion of that activity over authenticated requests.

* Add additional `High` Rate Limiting

This new `High` rate limit is designed for endpoints which are only CPU bound (and thus don't have as significant of DoS risks).
Initially this was motivated for `ping` and `find` due to the concern that these endpoints are used unauthenticated at login, and potential NAT's may result in very high rates from single egress IP's.
In my testing on my laptop, all of these endpoints can easily get 640/req/sec on a single core within a VM.  Setting the maximum of 480 burst and 120 continuous should both ensure that no single source utilizes all the CPU, as well as build in additional safety margins while providing a layer of protection.

* Fix for missing error check
nick-inkeep pushed a commit to nick-inkeep/teleport-docs that referenced this issue Jun 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants