
Exceeding request_body's max_size causes unexpected rejection behavior #4558

Closed
kellytk opened this issue Feb 3, 2022 · 16 comments
Labels
documentation 📚 Improvements or additions to documentation

Comments

@kellytk

kellytk commented Feb 3, 2022

Caddy version: 2.4.6

The documentation does not state what happens when a request exceeds max_size. I would expect 413 Payload Too Large, but Caddy responds with 502 Bad Gateway.

May I ask, if that is expected behavior, what the rationale is for using that code?

@francislavoie
Member

Hmm. Do you have evidence in your logs that that's the response code it writes? I'm not convinced that Caddy itself sends 502.

Basically, request_body max_size uses Go's built-in http.MaxBytesReader wrapper, which just returns a plain errors.New("http: request body too large") error when the limit is exceeded.
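
For context, here's a minimal sketch of how that wrapper behaves in a plain net/http handler (illustrative only, not Caddy's internals):

```
// Sketch: http.MaxBytesReader wraps the request body; reading past the
// limit yields the "http: request body too large" error mentioned above.
package main

import (
	"io"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Cap the body at 1 KB; the wrapper also tells the server to
		// stop reading and close the connection once the cap is exceeded.
		r.Body = http.MaxBytesReader(w, r.Body, 1024)

		if _, err := io.ReadAll(r.Body); err != nil {
			// Before Go 1.19 there is no exported error type, so callers
			// are stuck comparing the error string.
			log.Printf("read error: %v", err)
			http.Error(w, "payload too large", http.StatusRequestEntityTooLarge)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```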

Caddy doesn't handle this error specifically (which is tricky to do, because it's not a distinct error type, so checking it means comparing strings), so it just triggers Caddy's default error handling behaviour. You can use handle_errors to customize what happens, and the expression matcher to match on the error message via the {http.error.message} placeholder. Maybe something like this:

handle_errors {
	@tooLarge expression `{http.error.message} == "http: request body too large"`
	handle @tooLarge {
		respond "Payload too large" 413
	}
}

@mholt added the needs info 📭 Requires more information label Feb 3, 2022
@mholt
Member

mholt commented Feb 3, 2022

Thanks for opening an issue! We'll look into this.

It's not immediately clear to me what is going on, so I'll need your help to understand it better.

Ideally, we need to be able to reproduce the bug in the most minimal way possible. This allows us to write regression tests to verify the fix is working. If we can't reproduce it, then you'll have to test our changes for us until it's fixed -- and then we can't add test cases, either.

I've attached a template below that will help make this easier and faster! This will require some effort on your part -- please understand that we will be dedicating time to fix the bug you are reporting if you can just help us understand it and reproduce it easily.

This template will ask for some information you've already provided; that's OK, just fill it out the best you can. 👍 I've also included some helpful tips below the template. Feel free to let me know if you have any questions!

Thank you again for your report, we look forward to resolving it!

Template

## 1. Environment

### 1a. Operating system and version

```
paste here
```


### 1b. Caddy version (run `caddy version` or paste commit SHA)

```
paste here
```


### 1c. Go version (if building Caddy from source; run `go version`)

```
paste here
```


## 2. Description

### 2a. What happens (briefly explain what is wrong)




### 2b. Why it's a bug (if it's not obvious)




### 2c. Log output

```
paste terminal output or logs here
```



### 2d. Workaround(s)




### 2e. Relevant links




## 3. Tutorial (minimal steps to reproduce the bug)




Instructions -- please heed otherwise we cannot help you (help us help you!)

  1. Environment: Please fill out your OS and Caddy versions, even if you don't think they are relevant. (They are always relevant.) If you built Caddy from source, provide the commit SHA and specify your exact Go version.

  2. Description: Describe at a high level what the bug is. What happens? Why is it a bug? Not all bugs are obvious, so convince readers that it's actually a bug.

    • 2c) Log output: Paste terminal output and/or complete logs in a code block. DO NOT REDACT INFORMATION except for credentials.
    • 2d) Workaround: What are you doing to work around the problem in the meantime? This can help others who encounter the same problem, until we implement a fix.
    • 2e) Relevant links: Please link to any related issues, pull requests, docs, and/or discussion. This can add crucial context to your report.
  3. Tutorial: What are the minimum required specific steps someone needs to take in order to experience the same bug? Your goal here is to make sure that anyone else can have the same experience with the bug as you do. You are writing a tutorial, so make sure to carry it out yourself before posting it. Please:

    • Start with an empty config. Add only the lines/parameters that are absolutely required to reproduce the bug.
    • Do not run Caddy inside containers.
    • Run Caddy manually in your terminal; do not use systemd or other init systems.
    • If making HTTP requests, avoid web browsers. Use a simpler HTTP client instead, like curl.
    • Do not redact any information from your config (except credentials). Domain names are public knowledge and often necessary for quick resolution of an issue!
    • Note that ignoring this advice may result in delays, or even in your issue being closed. 😞 Only actionable issues are kept open, and if there is not enough information or clarity to reproduce the bug, then the report is not actionable.

Example of a tutorial:

Create a config file:
{ ... }

Open terminal and run Caddy:

$ caddy ...

Make an HTTP request:

$ curl ...

Notice that the result is ___ but it should be ___.

@kellytk
Author

kellytk commented Feb 3, 2022

@francislavoie That's what's reported in the browser console. I'll investigate further. Is the expected behavior to return 413?

@francislavoie
Member

That's what's reported in the browser console. I'll investigate further.

Please do. What's in Caddy's logs? Turn on the debug global option to get more details. Turn on the log directive in your site to enable access logging.

Is the expected behavior to return 413?

No, there's no "expected behaviour", really. Like I said, there is no specific HTTP status code attached to MaxBytesReader; it just emits a Go error, which you can then do whatever you like with in handle_errors.

@kellytk
Author

kellytk commented Feb 3, 2022

@francislavoie Testing with curl, I continue to receive a 502 response. That's also the code used in the error Caddy logs. The request is handled by an app via reverse proxying over a Unix socket.

@kellytk closed this as completed Feb 3, 2022
@francislavoie
Member

Ah, okay, I understand what's going on then. Since it's a reader wrapper, whatever actually reads the request body is what receives the error. Here, the reverse_proxy module reads the body so it can copy it to the upstream, and that's where it hits the error. And I think the reverse_proxy code assumes any error it encounters is a problem with the upstream, so it writes a 502.

I don't think we can really do much about this unless we either get the Go stdlib to change the error it emits to something we can type-assert on (so we can change the response code), or we re-implement the wrapper in Caddy itself to do that. It's a really simple wrapper, so we could just do that. 🤷‍♂️
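
For what it's worth, a wrapper like that could look roughly like this; the names (LimitBody, ErrBodyTooLarge) are hypothetical and this is only a sketch, not actual Caddy code. The point is that a sentinel error lets callers such as reverse_proxy check it with errors.Is and map it to 413 instead of 502:

```
// Hypothetical sketch of a body-limiting wrapper with a typed sentinel
// error; not actual Caddy code.
package requestbody

import (
	"errors"
	"io"
)

// ErrBodyTooLarge could be checked by callers with errors.Is to map the
// condition to a 413 instead of a generic 502.
var ErrBodyTooLarge = errors.New("request body exceeds configured max_size")

type limitedBody struct {
	rc        io.ReadCloser
	remaining int64 // bytes still allowed to be read
}

// LimitBody wraps rc so that reading more than max bytes returns ErrBodyTooLarge.
func LimitBody(rc io.ReadCloser, max int64) io.ReadCloser {
	return &limitedBody{rc: rc, remaining: max}
}

func (l *limitedBody) Read(p []byte) (int, error) {
	if l.remaining < 0 {
		return 0, ErrBodyTooLarge
	}
	// Read at most one byte past the limit so "exactly at the limit"
	// can be told apart from "over the limit".
	if int64(len(p)) > l.remaining+1 {
		p = p[:l.remaining+1]
	}
	n, err := l.rc.Read(p)
	l.remaining -= int64(n)
	if l.remaining < 0 {
		n += int(l.remaining) // hand back only the in-limit bytes
		l.remaining = -1      // keep failing on subsequent reads
		return n, ErrBodyTooLarge
	}
	return n, err
}

func (l *limitedBody) Close() error { return l.rc.Close() }
```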

@kellytk
Author

kellytk commented Feb 3, 2022

@francislavoie Thank you for the insight. It would be great for Caddy's wrapper to be enhanced. If it's added, I'd be available to test if needed.

In the meantime, I'll handle the request size limit in the app.

@kellytk
Author

kellytk commented Feb 4, 2022

@francislavoie Would the wrapper enhancement use Content-Length to determine the size, or hopefully a better approach?

@francislavoie
Member

francislavoie commented Feb 4, 2022

Digging into the code a bit more, it's not possible to yoink out the code from stdlib since it uses private/unexported functions to manipulate the HTTP response state.

There is an open issue, golang/go#30715, which is relevant here. If it were implemented (and it seems it will be, since the proposal was accepted), we could add a check for this specific error and write a specific status code.

So I'll keep this issue open and mark it as a feature request that requires an upstream change.

In the meantime, I'll handle the request size limit in the app.

Have you tried my suggestion of handling it with handle_errors? Wouldn't that be easier?

Would the wrapper enhancement use Content-Length to discern size or a superior approach hopefully?

No, that's not a reliable way to do it. Clients (bad actors) can lie and send an invalid Content-Length. The right way is to actually prevent reading past a certain limit.
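
For illustration, once an exported error type exists (the accepted proposal later shipped as http.MaxBytesError in Go 1.19), a check along these lines becomes possible; this is a sketch of the idea, not Caddy's actual error handling:

```
// Sketch only: maps the body-too-large condition to 413 using the
// exported error type from the accepted proposal (http.MaxBytesError,
// available since Go 1.19). Not Caddy's actual error handling.
package main

import (
	"errors"
	"fmt"
	"net/http"
)

func statusForBodyError(err error) int {
	var mbe *http.MaxBytesError
	if errors.As(err, &mbe) {
		// The request body exceeded the configured limit.
		return http.StatusRequestEntityTooLarge // 413
	}
	// Anything else keeps whatever status the handler would otherwise use.
	return http.StatusBadGateway // 502
}

func main() {
	err := &http.MaxBytesError{Limit: 1024}
	fmt.Println(statusForBodyError(err)) // 413
}
```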

@francislavoie reopened this Feb 4, 2022
@francislavoie added the feature ⚙️ New feature or request and upstream ⬆️ Relates to some dependency of this project labels and removed the needs info 📭 Requires more information label Feb 4, 2022
@kellytk
Author

kellytk commented Feb 4, 2022

Thank you once again for the excellent information.

Have you tried my suggestion of handling it with handle_errors?

Frankly no. Respectfully, IMHO the Caddy docs aren't very good, so I don't have confidence I'd be able to figure out how to use it in the amount of time I would have budgeted for the task, and I'd rather not clog this issue up with only tangentially related questions.

If you believe it would be helpful to others until the Go enhancement is in place, and you wouldn't mind implementation questions here or in another issue, say the word and I'll give it a... go.

No, that's not a reliable way to do it. Clients (bad actors) can lie and send an invalid Content-Length.

Exactly!

@francislavoie
Member

Digging into this a bit deeper again. I tried out my suggested handle_errors approach. It won't really work, because MaxBytesReader stops reading from the client, attempts to write a response, and then closes the connection. Since most clients won't stop writing their request until they're done, they'll never see the response before the connection is closed.

So even if we could change the response code, the client would never see it. A quick Google search shows that's intended, and it's the only way it can actually stop receiving data from the client. See https://stackoverflow.com/questions/35173819/request-body-too-large-causing-connection-reset-in-go for example.

This is how I tested it:

Caddyfile:

{
	admin off
	debug
}

:8881 {
	request_body {
		max_size 1kb
	}

	log

	reverse_proxy localhost:8882

	handle_errors {
		respond "{http.error.message}" 413
	}
}

:8882 {
	log
	respond {http.request.body}
}

Send a curl request with a big file, see the result of a closed connection:

$ curl -v --data-binary "@bigfile.txt" http://localhost:8881
* Rebuilt URL to: http://localhost:8881/
*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 8881 (#0)
> POST / HTTP/1.1
> Host: localhost:8881
> User-Agent: curl/7.55.1
> Accept: */*
> Content-Length: 47120384
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* Send failure: Connection was reset
* Closing connection 0
curl: (55) Send failure: Connection was reset

In Caddy's logs:

2022/02/04 14:12:42.403 INFO    http.log.access handled request {"request": {"remote_ip": "::1", "remote_port": "50207", "proto": "HTTP/1.1", "method": "POST", "host": "localhost:8881", "uri": "/", "headers": {"User-Agent": ["curl/7.55.1"], "Content-Length": ["47120384"], "Accept": ["*/*"], "Expect": ["100-continue"], "X-Forwarded-For": ["::1"], "Accept-Encoding": ["gzip"], "Content-Type": ["application/x-www-form-urlencoded"], "X-Forwarded-Proto": ["http"]}}, "user_id": "", "duration": 0.011588, "size": 0, "status": 200, "resp_headers": {"Server": ["Caddy"], "Content-Type": []}}
2022/02/04 14:12:42.403 DEBUG   http.handlers.reverse_proxy     upstream roundtrip      {"upstream": "localhost:8882", "duration": 0.0129982, "request": {"remote_ip": "::1", "remote_port": "50206", "proto": "HTTP/1.1", "method": "POST", "host": "localhost:8881", "uri": "/", "headers": {"X-Forwarded-Proto": ["http"], "X-Forwarded-For": ["::1"], "User-Agent": ["curl/7.55.1"], "Accept": ["*/*"], "Content-Length": ["47120384"], "Content-Type": ["application/x-www-form-urlencoded"], "Expect": ["100-continue"]}}, "error": "readfrom tcp [::1]:50207->[::1]:8882: http: request body too large"}
2022/02/04 14:12:42.404 ERROR   http.log.error  readfrom tcp [::1]:50207->[::1]:8882: http: request body too large      {"request": {"remote_ip": "::1", "remote_port": "50206", "proto": "HTTP/1.1", "method": "POST", "host": "localhost:8881", "uri": "/", "headers": {"User-Agent": ["curl/7.55.1"], "Accept": ["*/*"], "Content-Length": ["47120384"], "Content-Type": ["application/x-www-form-urlencoded"], "Expect": ["100-continue"]}}, "duration": 0.0129982, "status": 502, "err_id": "c7i5rsed9", "err_trace": "reverseproxy.statusError (reverseproxy.go:893)"}
2022/02/04 14:12:42.404 ERROR   http.log.access handled request {"request": {"remote_ip": "::1", "remote_port": "50206", "proto": "HTTP/1.1", "method": "POST", "host": "localhost:8881", "uri": "/", "headers": {"User-Agent": ["curl/7.55.1"], "Accept": ["*/*"], "Content-Length": ["47120384"], "Content-Type": ["application/x-www-form-urlencoded"], "Expect": ["100-continue"]}}, "user_id": "", "duration": 0.0129982, "size": 20, "status": 413, "resp_headers": {"Server": ["Caddy"], "Content-Type": []}}

• The first line is the :8882 server receiving the request (access logs happen after the request is completed, and the upstream finishes before the proxy does).
• The second line is the :8881 proxy performing the roundtrip (and logging how long it took).
• The third line is the error being logged by the :8881 HTTP server before passing it on to the handle_errors routes.
• The last line is the access log from :8881 showing the final request and "response".

Notice that the last log line does show Caddy attempting to write status 413 as per handle_errors, but the client never sees it.


I think all we can really do is flesh out the docs at https://caddyserver.com/docs/caddyfile/directives/request_body to better explain how it behaves, clarifying that the client connection will be closed if a handler tries to read more bytes from the request than the configured limit.

@francislavoie added the documentation 📚 Improvements or additions to documentation label and removed the feature ⚙️ New feature or request and upstream ⬆️ Relates to some dependency of this project labels Feb 4, 2022
@kellytk
Author

kellytk commented Feb 5, 2022

@francislavoie Does the block

handle_errors {
    respond "{http.error.message}" 413
}

configure Caddy to return 413 for all HTTP request errors, potentially including some unrelated to exceeding request max size?

@mholt
Member

mholt commented Mar 11, 2022

Caddy v1 had a special "limits" handler that used a maxBytesReader to reject large body sizes. Do we need to do something like this in v2? https://github.com/caddyserver/caddy/blob/v1/caddyhttp/limits/handler.go

@francislavoie
Member

@mholt we already do this: https://github.com/caddyserver/caddy/blob/master/modules/caddyhttp/requestbody/requestbody.go

The code in the v1 branch is a copy of the stdlib, but I don't think it actually works correctly, because res.requestTooLarge() can't be called outside of the stdlib, since it's not exported. Pretty sure, anyways.

@mholt
Member

mholt commented Mar 11, 2022

Ah, sorry, I misunderstood this issue. I do think this could also be handled better in the code, though. Maybe we can wrap the error returned from MaxBytesReader with one of our structured errors that adds a status code.
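
A minimal sketch of that wrapping idea, assuming Caddy's caddyhttp.Error helper (which attaches an HTTP status to an error); illustrative only, see commit 3d616e8 for what actually landed:

```
// Illustrative sketch: wrap the stdlib's "request body too large" error
// in a structured Caddy error carrying a 413 status, so downstream
// handlers (like reverse_proxy) don't fall back to a generic 502.
// See commit 3d616e8 for the change that actually landed.
package requestbody

import (
	"io"
	"net/http"
	"strings"

	"github.com/caddyserver/caddy/v2/modules/caddyhttp"
)

type errorWrappingReader struct {
	io.ReadCloser
}

func (r errorWrappingReader) Read(p []byte) (int, error) {
	n, err := r.ReadCloser.Read(p)
	if err != nil && strings.Contains(err.Error(), "request body too large") {
		// caddyhttp.Error attaches a status code that Caddy's error
		// handling can surface instead of 502.
		err = caddyhttp.Error(http.StatusRequestEntityTooLarge, err)
	}
	return n, err
}
```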

@mholt closed this as completed in 3d616e8 Mar 11, 2022
@mholt
Member

mholt commented Mar 11, 2022

I verified that 3d616e8 will now return HTTP 413 from Read() if too many bytes are read. I also improved error wrapping generally.
