RFC: Error conventions, take 3 #236

aturon · 2014-09-11T23:11:45Z

This is a conventions RFC for formalizing the basic conventions around error handling in Rust libraries.

The high-level overview is:

For catastrophic errors, abort the process or fail the task depending on whether any recovery is possible.
For contract violations, fail the task. (Recover from programmer errors at a coarse grain.)
For obstructions to the operation, use Result (or, less often, Option). (Recover from obstructions at a fine grain.)
Prefer liberal function contracts, especially if reporting errors in input values may be useful to a function's caller.

This RFC follows up on two earlier attempts by giving more leeway in when to fail the task.

lilyball · 2014-09-11T23:41:33Z

I am generally in favor of this. I am a little concerned about the naming, though. A _catch suffix is serviceable, but it gives the impression of exception handling, and more generally of "trigger the error, and then catch it", when what it's really doing is not raising the error in the first place.

I am more fond of the try_ prefix, as seen in RefCell::try_borrow(). This also matches the usage of "try" in std::task::TaskBuilder::try() and std::rc::try_unwrap() (although this latter one does not have a failing variant). The downside is the prefix try_ is also used in a couple of places to mean "don't block", such as in sync::mutex::StaticMutex::try_lock() (do we not vend this functionality from Mutex or RWLock? Strange) and std::comm::Receiver::try_recv(). FWIW, I always get confused with try_recv(), and have to check the docs to figure out if that's the "don't block" API or the "don't fail" API.

Assuming a better naming convention can be found for "don't block" APIs (_nonblock suffix?), I think try_ works best as the "don't fail" prefix.

arcto · 2014-09-12T14:48:37Z

It seems to me that it's reasonable to expect that, due to its dynamic nature, a RefCell borrow may not succeed. Therefore, I think that it should return a Result by default, and not the current inverse. The same with the other example recv. Operations that are unreliable, by their nature, should reflect this by returning a Result.

I would like the convention to take this point of view. Am I wrong?

Concretely, I'd like to reverse the recommendation of _catch so as to make these functions return a Result and have the "assertive" variants be the alternatives (_or_fail).

Why? Because the type system is not able to guarantee that the operation will succeed the users of the API should be nudged/encouraged to use the safe variant that returns a Result.

arcto · 2014-09-12T15:00:16Z

Obstructed operations must still leave the relevant data structures in a coherent state.

It's not my intention to nitpick. But would it be useful to elaborate on "coherent state"? I suppose that it means that invariants are preserved, but not requiring a complete roll-back. Thus, Obstructed operations may have side effects. To be on the safe side these invariants should be documented in the API.

arcto · 2014-09-12T15:18:28Z

Prefer expressing contracts through static types whenever possible.

I like this very much! It should be encouraged to design APIs that minimise the contract as much as possible by using specific types for input and output.

In my opinion one can often do more. That is, extend this principle to dynamic aspects. Some examples: use_file(file: File) is better than use_file(filename: str) if a File can be guaranteed to exist and be locked during its use. create_texture(size: TextureSize)can be better than create_texture(w: int, h: int) because it moves the responsibility to the caller where perhaps the compiler can guarantee and omit some of the run-time checks.

sfackler · 2014-09-12T16:54:33Z

@arcto In what context would you be able to do anything useful with an Err return value from RefCell::borrow at runtime?

nikomatsakis · 2014-09-12T18:00:32Z

@sfackler I've been in the situation where a RefCell violation would indicate a cyclic in a data structure that was supposed to be a DAG. But the data structure came from user input and hence was out of my control. So you could leverage RefCell to do the checking for you, if you wanted.

pkondzior · 2014-09-14T20:42:35Z

active/0000-error-conventions.md

+  such, calling `recv()` is an _assertion_ that the other end of the channel is
+  still alive, which will propagate failures from the other end of the
+  channel. On the other hand, since there is no separate way to atomically test
+  whether the other end has hung up, channels provide a `recv_opt` variant that


Wouldn't it be more convenient to name function recv_result instead of recv_opt if it's returning Result and not Option.

aturon · 2014-09-15T04:01:33Z

@kballard Thanks for the feedback. I've added some brief text on a few alternative names for the Result-producing variants. After thinking about it a bit more, I actually think that the exception-handling connotations of _catch may be helpful, especially for the case of channels where unwinding has already occurred (in some other task) but is being prevented from continuiting in this task. Even for RefCell, it's effectively preventing unwinding relative to the unmarked variant.

@pkondzior I added your _result suggestion as well. Certainly that's better than _opt for recv, but I'm looking for a strong general convention.

aturon · 2014-09-15T04:04:44Z

Concretely, I'd like to reverse the recommendation of _catch so as to make these functions return a Result and have the "assertive" variants be the alternatives (_or_fail).

Why? Because the type system is not able to guarantee that the operation will succeed the users of the API should be nudged/encouraged to use the safe variant that returns a Result.

If the API designer is willing to do that, they should just leave off the failing variant and let the client use .unwrap() -- which is what these conventions strongly recommend anyway. The only case where both variants are allowed is when there is a strong reason to make the failing variant the "default" (unmarked) version.

Admittedly, I'd prefer not to have this exception at all, and simply to say that if you provide a Result variant, you shouldn't provide a failing variant. But the worry is that e.g. working with channels would be become extremely verbose. (Some sugar for unwrap could certainly help.)

I am definitely curious to hear the community's feedback on this point, in particular: do people agree that it's very occasionally justified to have both variants?

sfackler · 2014-09-15T04:18:03Z

I feel like concurrency primitives like channels are really the only places where you'd want failing and non-failing variants. In the case of something like RefCell, we can have failing borrow and borrow_mut along with something like

enum BorrowState {
    Unborrowed,
    ImmutablyBorrowed(uint),
    MutablyBorrowed,
}

impl<T> RefCell<T> {
    fn state(&self) -> BorrowState { ... }
}

This avoids requiring tons of unwraps for the 99% use case where the programmer knows statically that there won't be any borrow-related failures, but still supports situations like @nikomatsakis's where you do want to handle the error case at runtime.

If we had unwrap sugar, then I'd be fine with Result-only APIs, but if that's not going to happen soon then...

aturon · 2014-09-15T04:22:33Z

@sfackler Yes! I actually like that perspective quite a bit. Concurrency primitives are special, since they are the sole way of communication between tasks and therefore the source of failure propagation. So we can give a clear guideline/justification for the variants in that case.

Unless @nikomatsakis objects, I think I will rewrite the RFC to argue for such a guideline.

sfackler · 2014-09-15T04:23:54Z

They're also special in that it isn't possible to make a state-style checking function that doesn't race.

aturon · 2014-09-15T15:16:49Z

They're also special in that it isn't possible to make a state-style checking function that doesn't race.

Yes; this argues for having a Result version, but doesn't by itself argue for also having a failing variant. I think it's cross-task failure propagation that justifies the latter (and justifies making it the default).

aturon · 2014-09-15T20:57:35Z

@nikomatsakis raised an interesting question on IRC today: is the material about allowing both failing and Result variants really just working around an ergonomic problem?

Another way of putting this: if we had something like the ! sugar, would we ever want methods with both failing and Result variants?

Or do we believe the argument that e.g. channels should "fail by default" as part of the propagation model?

sfackler · 2014-09-15T21:15:54Z

If we had ! sugar, I'd say having only Result variants would be totally reasonable.

arcto · 2014-09-16T07:56:50Z

I agree. The ! sugar would signify a possible failure-point while also keeping the code readable.

reem · 2014-09-16T08:34:26Z

However, having ! strongly encourages failing by accident and new users or inexperienced users are far more likely to just start sticking it everywhere. This RFC strongly encourages Result and Option, and ! undoes most of that work, IMO.

nikomatsakis · 2014-09-16T09:54:40Z

On Sun, Sep 14, 2014 at 09:22:34PM -0700, Aaron Turon wrote:

@sfackler Yes! I actually like that perspective quite a bit. Concurrency primitives are special, since they are the sole way of communication between tasks and therefore the source of failure propagation. So we can give a clear guideline/justification for the variants in that case.

So I went around and around on this question. I had a super long
e-mail explaining my view. Let me try to condense it.

All other things being equal, using Result return is generally
preferable to fail!:

What is recoverable and what is not is more often than not a
question of context.
- Examples: RefCell failure can be used to detect cycles.
  Receiving from a dead task is ok if you are the supervisor
  of that task. Bounds check violations may be expected.
If you do want to recover, it is generally more DRY and more
efficient to have callee do the check.

However, usability intrudes, and in reality sometimes it is better to
use fail! than Result. Ultimately this boils down to the fact that
if the vast majority of your callers would consider the error fatal,
it's annoying for them to call ensure(). Here are some conditions
that have arisen in the standard library:

- If there are a mix of errors, only some of which are frequently
  handled by the caller. In that case, the `!` and `try!` macros
  are ineffective, as @SiegeLord's example showed.
- If you *want* to bias users toward failure, as in failure
  propagation across synchronization functions.
- If the *vast, vast* majority of your users wish to fail,
  as in `RefCell`.

Now, ! sugar would change the balance, and possibly make those last
two examples not worth it. The first example remains. It seems like
people were making the case that even with ! sugar, we would want
port-receive to fail; I am unsure, but if so I guess the reasoning is
the second point.

Lastly, @sfackler proposed removing the "try" variants on RefCell
and replacing them with methods to query the current state. This is
clever, but I don't think it's a good general replacement over
Result-ful returns:

On the pro side, it helps the caller to make
choices: "ah, it's read-locked, I'll go for a read-only borrow rather
than a mut borrow". In the case of ref-cell, the utility is limited,
as there aren't that many things you can do, but in more complex cases
that could be helpful.
On the con side, it is not extensible, at least until we have some way
to declare an enum to which one may add more variants in the future.
It also has all the downsides that were listed above regarding why
returning Result is preferable. Any code that wishes to detect
failure before the case winds up duplicating the checks under which
borrow_mut may succeed -- this means that (a) the checks run
twice, to some (probably minor, but not always) performance
detriment and (b) there is the chance that the user encodes the
rules wrong, and hence winds up with a failure even though they were
trying to check. Both of these are the kinds of things that
motivated the idea of "inform the user about pre-condition
violations rather than failing outright".

nikomatsakis · 2014-09-16T12:07:45Z

On Tue, Sep 16, 2014 at 01:34:26AM -0700, Jonathan Reem wrote:

However, having ! strongly encourages failing by accident and new users or inexperienced users are far more likely to just start sticking it everywhere. This RFC strongly encourages Result and Option, and ! undoes most of that work, IMO.

I am not sure I understand. It seems to be the opposite: having !
allows us to use Result returns more often. (This has in fact been
the motivation for ! from the beginning.) Perhaps you are saying: if
we have so many common functions return Result, it cheapens its
value, as people used to "asserting" that it will not occur?

reem · 2014-09-16T12:11:45Z

I think that ! is a very dangerous, double-edged sword. On one hand it
allows us to use more Result, on the other hand it encourages and makes
easy the rather bad practice of simply ignoring said Result even in cases
where you shouldn't.

On Tuesday, September 16, 2014, Niko Matsakis notifications@github.com
wrote:

On Tue, Sep 16, 2014 at 01:34:26AM -0700, Jonathan Reem wrote:

However, having ! strongly encourages failing by accident and new
users or inexperienced users are far more likely to just start sticking it
everywhere. This RFC strongly encourages Result and Option, and ! undoes
most of that work, IMO.

I am not sure I understand. It seems to be the opposite: having !
allows us to use Result returns more often. (This has in fact been
the motivation for ! from the beginning.) Perhaps you are saying: if
we have so many common functions return Result, it cheapens its
value, as people used to "asserting" that it will not occur?

—
Reply to this email directly or view it on GitHub
#236 (comment).

Thanks,
Jonathan

arcto · 2014-09-16T12:16:59Z

I'd be very careful when placing a !, knowing that I'm not handling the error locally but allowing the task to bail out completely.

reem · 2014-09-16T12:21:12Z

You and I would be, my concern is would newbies and those coming from other
languages with looser failure semantics would do the same.

On Tuesday, September 16, 2014, arcto notifications@github.com wrote:

I'd be very careful when placing a !, knowing that I'm not handling the
error locally but allowing the task to bail out completely.

—
Reply to this email directly or view it on GitHub
#236 (comment).

Thanks,
Jonathan

vadimcn · 2014-09-27T03:49:43Z

active/0000-error-conventions.md

+
+* `try_` prefix. Also connotes exception handling, but has an unfortunately
+  overlap with the common use of `try_` for nonblocking variants (which is in
+  play for `recv` in particular).


The use of try_ in try_recv actually fits the pattern quite well: it signals a failure to receive any data. It even returns a Result.

IMHO, try_<something> makes much more sense in English that the other two forms. Also, there's the precedent of Try in C# being used for exactly the same purpose.

alexcrichton · 2014-10-30T22:33:53Z

Discussion

Tracking

tfindlow · 2014-12-08T17:37:45Z

I must admit I have a problem with "For contract violations fail the task". The erlang philosophy matches the language's target function to "support distributed, fault-tolerant, soft-real-time, non-stop applications." I understood rust's target audience to be larger than this.

My experience is that the majority of unexpected errors of a large/complex GUI application tend to be contract errors, discovered in the field simply because the complexity of the application architecture lead to dynamics that were not anticipated. Index out of bounds is not an untypical experience. To provide no other failure mechanism for Index out of bounds errors will lead to either, a) mandatory use of wrapper classes around slice, or b) making tasks very fine grained, and "catching" contract failure errors at the boundary. Both of which are doable - but fall into the category of "working around the deficiencies of the language".

Moreover; many contract errors occur in third party code for which I have no visibility, and over which I have no control. This removes (a) above as an option - leaving me only with (b). I worry about the additional cost that maintaining this fine grained task structure will impose.

Closes rust-lang#236

* Ember.String deprecation RFC * Update on RFC Update on RFC based on feedback * Fix addon name * Update 0000-deprecation-ember-string.md * Update 0000-deprecation-ember-string.md * Update 0000-deprecation-ember-string.md * Update 0000-deprecation-ember-string.md * Minor fixes * Update 0000-deprecation-ember-string.md * Update 0000-deprecation-ember-string.md * Update 0000-deprecation-ember-string.md typo * Update 0000-deprecation-ember-string.md (#2) * Update 0000-deprecation-ember-string.md * Update 0000-deprecation-ember-string.md * Update 0000-deprecation-ember-string.md * Update 0000-deprecation-ember-string.md * Update 0000-deprecation-ember-string.md * Fix minor wording * Update 0000-deprecation-ember-string.md * Remove last appearance of @ember/component * rename RFC file

RFC: Error conventions, take 3

14d008c

aturon mentioned this pull request Sep 11, 2014

RFC: Error conventions #220

Closed

Clarify FIXMEs

6abf70a

nrc assigned aturon Sep 12, 2014

pkondzior reviewed Sep 14, 2014
View reviewed changes

Revision: add some clarifications and naming alternatives

8e6d015

aturon force-pushed the master branch from 4c0bebf to b1d1bfd Compare September 16, 2014 22:46

aturon mentioned this pull request Sep 22, 2014

Add to_c_str_opt/with_c_str_opt functions and fix an ICE rust-lang/rust#17435

Closed

vadimcn reviewed Sep 27, 2014
View reviewed changes

Clarifications.

aee1de5

alexcrichton force-pushed the master branch from b9e2b8c to 5020131 Compare October 29, 2014 15:25

alexcrichton mentioned this pull request Oct 30, 2014

Ensure libraries enforce correct error conventions rust-lang/rust#18466

Closed

alexcrichton merged commit aee1de5 into rust-lang:master Oct 30, 2014

aturon mentioned this pull request Jan 2, 2015

Second pass stabilization: slice and vec rust-lang/rust#20061

Closed

This was referenced Jul 5, 2016

Modify special functions to return Result statrs-dev/statrs#1

Closed

Remove unchecked versions of distributions statrs-dev/statrs#2

Closed

withoutboats pushed a commit to withoutboats/rfcs that referenced this pull request Jan 15, 2017

Add AsyncSync::{is_not_ready, is_ready}

62eda1e

Closes rust-lang#236

chriskrycho mentioned this pull request Feb 18, 2017

Document all features in the reference rust-lang/rust#38643

Closed

17 tasks

chriskrycho mentioned this pull request Mar 11, 2017

Document all features rust-lang/reference#9

Closed

48 tasks

Centril added A-panic Panics related proposals & ideas A-error-handling Proposals relating to error handling. A-convention Proposals relating to documentation conventions. labels Nov 23, 2018

leudz mentioned this pull request Apr 29, 2020

Panic feature flag leudz/shipyard#80

Closed

chorman0773 mentioned this pull request Nov 15, 2020

add const-ub RFC #3016

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Error conventions, take 3 #236

RFC: Error conventions, take 3 #236

aturon commented Sep 11, 2014

lilyball commented Sep 11, 2014

arcto commented Sep 12, 2014

arcto commented Sep 12, 2014

arcto commented Sep 12, 2014

sfackler commented Sep 12, 2014

nikomatsakis commented Sep 12, 2014

pkondzior Sep 14, 2014

aturon commented Sep 15, 2014

aturon commented Sep 15, 2014

sfackler commented Sep 15, 2014

aturon commented Sep 15, 2014

sfackler commented Sep 15, 2014

aturon commented Sep 15, 2014

aturon commented Sep 15, 2014

sfackler commented Sep 15, 2014

arcto commented Sep 16, 2014

reem commented Sep 16, 2014

nikomatsakis commented Sep 16, 2014

nikomatsakis commented Sep 16, 2014

reem commented Sep 16, 2014

arcto commented Sep 16, 2014

reem commented Sep 16, 2014

vadimcn Sep 27, 2014

alexcrichton commented Oct 30, 2014

tfindlow commented Dec 8, 2014

RFC: Error conventions, take 3 #236

RFC: Error conventions, take 3 #236

Conversation

aturon commented Sep 11, 2014

lilyball commented Sep 11, 2014

arcto commented Sep 12, 2014

arcto commented Sep 12, 2014

arcto commented Sep 12, 2014

sfackler commented Sep 12, 2014

nikomatsakis commented Sep 12, 2014

pkondzior Sep 14, 2014

Choose a reason for hiding this comment

aturon commented Sep 15, 2014

aturon commented Sep 15, 2014

sfackler commented Sep 15, 2014

aturon commented Sep 15, 2014

sfackler commented Sep 15, 2014

aturon commented Sep 15, 2014

aturon commented Sep 15, 2014

sfackler commented Sep 15, 2014

arcto commented Sep 16, 2014

reem commented Sep 16, 2014

nikomatsakis commented Sep 16, 2014

nikomatsakis commented Sep 16, 2014

reem commented Sep 16, 2014

arcto commented Sep 16, 2014

reem commented Sep 16, 2014

vadimcn Sep 27, 2014

Choose a reason for hiding this comment

alexcrichton commented Oct 30, 2014

tfindlow commented Dec 8, 2014