Panic within a handler should return an HTTP 500 #1501
Comments
I would argue that terminating a connection is a suitable response to a handler panic. You should endeavor to employ proper error handling at all levels. Things that return a Result do so because their errors are recoverable and should be either handled or bubbled. Using On catch_unwind, trying to employ a system that relies on catch_unwind has all sorts of gotchas including considerations of things that are or are not unwind-safe; it is not equivalent to try-catch in other languages that are built to handle clean up. |
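A minimal sketch of the "handle or bubble" approach described above, using only the standard library. The names `AppError`, `lookup_username`, `greet` and `respond` are hypothetical stand-ins for illustration and are not part of actix-web; the `db_up` flag merely simulates whether a database is reachable.

```rust
use std::fmt;

// Hypothetical application error type (an assumption, not an actix-web type).
#[derive(Debug, PartialEq)]
enum AppError {
    NotFound,
    DatabaseUnavailable,
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::NotFound => write!(f, "not found"),
            AppError::DatabaseUnavailable => write!(f, "database unavailable"),
        }
    }
}

impl AppError {
    // Map each error to the HTTP status a handler would return.
    fn status_code(&self) -> u16 {
        match self {
            AppError::NotFound => 404,
            AppError::DatabaseUnavailable => 500,
        }
    }
}

// Stand-in for a database lookup; `db_up` simulates database availability.
fn lookup_username(db_up: bool, id: u32) -> Result<String, AppError> {
    if !db_up {
        return Err(AppError::DatabaseUnavailable);
    }
    match id {
        1 => Ok("alice".to_string()),
        _ => Err(AppError::NotFound),
    }
}

// The handler bubbles errors with `?` instead of calling unwrap().
fn greet(db_up: bool, id: u32) -> Result<String, AppError> {
    let name = lookup_username(db_up, id)?;
    Ok(format!("Hello, {}!", name))
}

// Convert the handler result into (status, body) in one place.
fn respond(result: Result<String, AppError>) -> (u16, String) {
    match result {
        Ok(body) => (200, body),
        Err(e) => (e.status_code(), e.to_string()),
    }
}
```

With this shape, even the "database is gone" case reaches the client as a 500 without any panic, at the cost of one `?` per call site.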
I appreciate that However, I'm also not suggesting that this is something to do lightly. You're absolutely right that returning a Specifically I am talking about things that really are catastrophic failures. Things like talking to a database that just isn't there. Not reading a database record that is not found, but the entire database being inaccessible. And yes, this can be done using
Right now, this works great. And the only way this can at all fail is in a catastrophic sense. If the database is missing, or the schema is corrupt. Anything else will cause a graceful result to be returned and all is good. (This design was based on reading around in blog posts, the Rust Book and The Rustonomicon.) Now, I could return a As such, it seemed reasonable to have this level of handling be something that was more common and shared so that everyone can benefit automatically. :) Since these are the level of errors that means that everything is broken anyway, I'm not totally invested in whether it is decided to go ahead with this or not. If you still deem it better to leave it alone then you are significantly more experienced than I am and I bow to that wisdom :) Cheers |
I am writing my code without using any panics, but some external libraries panic in some cases, so returning a 500 when that happens would be a better idea than just closing the connection. I will research this possibility (https://docs.rs/futures/0.3.5/futures/future/trait.FutureExt.html#method.catch_unwind) this weekend and maybe create my first PR. |
Anyone from @actix/contributors got experiences or opinions on handler panicking? |
I'm sorry but I don't support this proposal. Rust offers first-class error handling to manage this situation. Impress upon the authors of 3rd party libraries to manage errors rather than panic. |
@sazzer if you want the feeling of throwing from functions this crate gets you pretty close to the syntax: https://docs.rs/fehler/1.0.0/fehler/attr.throws.htm |
catch unwind is not possible with actix, period. |
@robjtede It's not that I want exceptions or throws. I actually like the What I wasn't keen on was the need to do that for error conditions that are truly catastrophic, and that wouldn't need to have a For example, my case above with "lookup_username". Assuming that the database is present, the network is working and the schema is valid, this can not fail. Putting in multiple levels of However, based on the comment from @fafhrd91 that turned up as I was typing this, it's all a non-starter anyway so I'm going to shut up now :) Cheers |
There is another very simple reason to support this: it is the expected behavior, especially in development, where the browser IS the main interface. In every other web framework I have used (in .NET, Python, Delphi, ...) this is how things are done. To the point that when this happened to me, I wondered: why? Is it something I haven't done? There are many quality-of-life features that come for "free" in things like https://flask.palletsprojects.com/ that are "missing" in actix and other Rust frameworks. I'm totally, 100% in agreement that this must be done with correctness in mind. This is why I have bet on Rust to rewrite a very large .NET web backend, and the payoff is very good. But I also think that providing these small touches (maybe as additions, or as part of an "extras" crate or something like that) will make Rust even more attractive to the crowd coming from Python/.NET/Ruby... |
This is not a matter of a new feature. This is just not possible. |
@fafhrd91 Could you explain to us why catch_unwind doesn't work for actix, so we can all learn something? Just curious. |
Did you check the definition? |
I was about to write a thoughtful response to this, but #1501 (comment) pretty much sums up my point of view. Signaling failure with Result is a win-win and is well supported. 90% of my error handling code comes in the form of an Using things like |
From
From
From the Rust book ch 9.3:
From the edition guide:
Fundamentally, panicking is Rust's solution to prevent memory errors/corruption and generally prevent invalid states in situations where recovery is not possible. Such cases are exceedingly rare in HTTP request handling, therefore Result should be used where possible. The "isolation boundary" should be well defined and not let states become invalid. In the case of actix-web, since workers are naturally isolated by their arbiter/thread, it is sensible for that boundary to be the thread and nothing more granular. |
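To illustrate how a caught panic can let state become invalid, here is a small std-only sketch. The `Ledger` type and its invariant are hypothetical, invented for this example; the point is only that `catch_unwind` stops the unwinding but does not roll back a half-finished update, which is why the closure has to be explicitly wrapped in `AssertUnwindSafe`.

```rust
use std::panic::{self, AssertUnwindSafe};

// Hypothetical shared state with an invariant: `total` equals the sum of `items`.
struct Ledger {
    items: Vec<u32>,
    total: u32,
}

impl Ledger {
    fn invariant_holds(&self) -> bool {
        self.items.iter().sum::<u32>() == self.total
    }

    // Pushes the item first, then panics on an oversized amount
    // before `total` has been updated.
    fn add(&mut self, amount: u32) {
        self.items.push(amount);
        if amount > 100 {
            panic!("amount too large");
        }
        self.total += amount;
    }
}

// Runs one update and catches any panic. AssertUnwindSafe is needed because
// the closure captures `&mut Ledger`, and `&mut T` is not UnwindSafe.
// Returns true if the update completed without panicking.
fn add_catching_panics(ledger: &mut Ledger, amount: u32) -> bool {
    panic::catch_unwind(AssertUnwindSafe(|| ledger.add(amount))).is_ok()
}
```

After a caught panic the `Ledger` is still reachable but its invariant no longer holds, which is the kind of situation a coarser isolation boundary avoids.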
It is simpler, Rc and Arc are not catch unwind |
Rare? I wish. Lost database connections, database schema changes, upstream servers/APIs breaking (my world, all the time), etc. are not rare at ALL. It happens in production and is handled correctly by ANY decent web server or router or whatever in the stack. As I said, this missing feature is VERY surprising to me and, I think, to anyone who comes from any other web framework under the sun. Where else is it like this? I get it if this is harder in Rust. So what else can be done? Is it possible to wrap the code, fork it, then handle the crash in the fork or something like that? MUST Actix be put behind another web server like nginx for production? If that is the case, putting the idea in the docs is important (don't assume everyone puts nginx or similar in front of their APIs). If Rust is meant to be better at handling problems, then we can't just say "let it crash" and not respond with the proper error 500 just because of this. As I said, if a workaround must be implemented, that is OK. But a "blind" crash is not good. |
Then you need to fix the DB connector or API access code. A panic is not an exception; you can have a lot of bad consequences after a panic, for example leaked memory. It is better to use panic = "abort" in your Cargo.toml. |
Yeah, exactly. For all intents and purposes, panics just shouldn't happen in production systems, least of all due to an If you're in a situation where some code you're calling panics without giving you the opportunity to handle the error (as a Result), I'd look to report it as a bug in whatever that library is. |
Sure, but when the problem happens, the user gets a "connection reset", so tech support blames it on the internet or something (totally believable in my case, my users roam the country) and does not report it to me, instead of an "error 500: Internal server error", where I would be sure the problem is in the server. Again, this behavior is not what is expected. So it looks like I need a workaround of putting things behind nginx (which looks like a non-issue for regular websites but is another complexity when deploying intranet things). |
If you put Rust code with panics into production then you have to use nginx, no options. |
That is fair, and it is good to know. Thanks for the help. |
These are the exceedingly rare cases referred to in the previous sentence. Recoverable errors are obviously commonplace.
These are all recoverable. |
This would not happen if your handler returned a |
I still think it's better UX for something like this to return an Error 500 to the user if the invariant is broken:
thumb_path.push(format!("{:x}.png", md5::compute(
    Url::from_file_path(path).expect("path should be absolute by this point").as_str())));
(It's MD5 because it's from an in-development implementation of the XDG Thumbnail Managing Standard for a miniserve-like tool for quickly sharing image sets with friends... a tool where the whole point is that you don't need to set up something like nginx.) |
actix seems barely affected by panics:
So far so good, but the issue with this is that if actix is behind nginx, the current behaviour of actix causes nginx to log these errors:
which (by default) causes nginx to mark the upstream down for ten seconds, the same as it would if the upstream was actually down. This causes nginx to not serve requests to the server anymore even though it is still up. This causes a DoS because one panicking request handler effectively takes the server down for 10s. Sadly this behaviour isn't even cleanly configurable in nginx. So I'm not sure about the exact implications, but it would be great if actix would not instantly close the connection if one of my dependencies does something dumb. I don't want a configurable error message, just a 500 response. |
Reluctantly, I created a If you're interested in reporting panics and not catching them then the alternative As for the built-in behavior of Actix Web, our view about making this behavior default is extremely unlikely to change unless the project maintainership changes significantly. |
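For the "report panics without catching them" route, one option in plain std is a panic hook. The sketch below is not the crate referred to in the comment above (its name is missing from this copy of the thread); it only shows the general idea with `std::panic::set_hook`. The names `PANIC_COUNT`, `install_panic_reporter` and `reported_panics` are hypothetical.

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical global counter of observed panics.
static PANIC_COUNT: AtomicUsize = AtomicUsize::new(0);

// Installs a process-wide hook that runs whenever any thread panics.
// The hook only observes the panic; it does not stop the thread from
// unwinding (or the process from aborting under panic = "abort").
fn install_panic_reporter() {
    panic::set_hook(Box::new(|info| {
        PANIC_COUNT.fetch_add(1, Ordering::SeqCst);
        // A real service would forward this to its logging or alerting system.
        eprintln!("handler panicked: {}", info);
    }));
}

// Number of panics seen by the hook so far.
fn reported_panics() -> usize {
    PANIC_COUNT.load(Ordering::SeqCst)
}
```

Since the hook replaces the default one, the usual panic message on stderr is only printed if the hook prints it itself, as the `eprintln!` here does.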
(Apologies if this is already covered anywhere. I did look and couldn't find anything, but I may very well have missed it!)
Expected Behavior
If anything goes catastrophically wrong within a handler, it would be good to return an HTTP 500 response to the client.
Even better would be the option to customize what the error response is.
Current Behavior
Currently, the server just hangs up the network connection.
It does not seem to break the threading - I can make many more panicking requests to the server than it is running threads and they all connect just fine. I just don't get a decent error.
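The observation that other requests keep working is consistent with how panics behave on ordinary Rust threads: a panic unwinds only the thread it happens on. The following std-only sketch demonstrates that with `std::thread`; it is an analogy, not a description of actix-web's worker internals, and `run_isolated` is a hypothetical helper.

```rust
use std::thread;

// Runs `job` on a fresh thread. If the job panics, the panic is confined to
// that thread and surfaces here as an Err carrying the panic message.
fn run_isolated<F>(job: F) -> Result<u32, String>
where
    F: FnOnce() -> u32 + Send + 'static,
{
    thread::spawn(job).join().map_err(|payload| {
        // The payload is a Box<dyn Any + Send>; panic messages are usually
        // either a &'static str or a String.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}
```

The calling thread stays usable after a job panics, so later jobs run normally; what is lost is whatever the panicking job was in the middle of doing.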
Steps to Reproduce (for bugs)
unwrap() on an Err value from within the handler
Context
There are certain classes of errors that are catastrophic in nature. For example, the database not being present. Adding deliberate error propagation for these scenarios only serves to bloat the codebase - all of a sudden I've got to propagate the errors all the way from the bottom layer up to the handlers in order to return an error.
Alternatively, I can just not care about these. Just call pool.get().await.unwrap() to get a connection from the connection pool, for example. If I get a connection then that's great. If I don't then I have a catastrophic failure on my hands anyway, so panicking isn't unreasonable.
I could wrap every single handler in panic::catch_unwind() and handle it myself, but this is then a lot of duplication of effort, and has the need to return appropriate catastrophic errors from every handler. (I'm also not sure how to use catch_unwind in an async context, but that's my problem :) )
Alternatively, if Actix-web can do this then it only needs to be done once, and can have guaranteed consistent responses no matter which handler had the problem.
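As a rough sketch of the "wrap it once" idea for a synchronous handler, using only `std::panic::catch_unwind`: the `Response` type and `catch_panic_as_500` are hypothetical and are not actix-web APIs. It assumes the default `panic = "unwind"` setting (under `panic = "abort"` nothing can be caught), and it does not cover async handlers.

```rust
use std::panic::{self, UnwindSafe};

// Hypothetical minimal response type (not actix-web's HttpResponse).
#[derive(Debug, PartialEq)]
struct Response {
    status: u16,
    body: String,
}

// Runs a synchronous handler and turns a panic into a generic 500 response.
// The UnwindSafe bound makes the caller responsible for not sharing state
// that a panic could leave half-updated.
fn catch_panic_as_500<F>(handler: F) -> Response
where
    F: FnOnce() -> Response + UnwindSafe,
{
    match panic::catch_unwind(handler) {
        Ok(response) => response,
        Err(_) => Response {
            status: 500,
            body: "Internal Server Error".to_string(),
        },
    }
}
```

A handler that calls unwrap() on an Err would come back from this wrapper as the 500 response rather than tearing down the caller. Whether that is safe depends entirely on what state the handler touched before panicking.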
Obviously, this should only apply to actual catastrophic failures. Business errors, validation errors, etc are the domain of the application and not of Actix-web and should remain as such.
Your Environment
rustc -V: rustc 1.45.0-nightly (a08c47310 2020-05-07)