From 14d008c7702a633267867002b1c2586b177a18c7 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 11 Sep 2014 16:03:10 -0700 Subject: [PATCH 1/4] RFC: Error conventions, take 3 --- active/0000-error-conventions.md | 203 +++++++++++++++++++++++++++++++ 1 file changed, 203 insertions(+) create mode 100644 active/0000-error-conventions.md diff --git a/active/0000-error-conventions.md b/active/0000-error-conventions.md new file mode 100644 index 00000000000..5b53bf53016 --- /dev/null +++ b/active/0000-error-conventions.md @@ -0,0 +1,203 @@ +- Start Date: (fill me in with today's date, 2014-09-11) +- RFC PR #: (leave this empty) +- Rust Issue #: (leave this empty) + +# Summary + +This is a *conventions* RFC for formalizing the basic conventions around error +handling in Rust libraries. + +The high-level overview is: + +* For *catastrophic errors*, abort the process or fail the task depending on + whether any recovery is possible. + +* For *contract violations*, fail the task. (Recover from programmer errors at a coarse grain.) + +* For *obstructions to the operation*, use `Result` (or, less often, + `Option`). (Recover from obstructions at a fine grain.) + +* Prefer liberal function contracts, especially if reporting errors in input + values may be useful to a function's caller. + +This RFC follows up on [two](https://github.com/rust-lang/rfcs/pull/204) +[earlier](https://github.com/rust-lang/rfcs/pull/220) attempts by giving more +leeway in when to fail the task. + +# Motivation + +Rust provides two basic strategies for dealing with errors: + +* *Task failure*, which unwinds to at least the task boundary, and by default + propagates to other tasks through poisoned channels and mutexes. Task failure + works well for coarse-grained error handling. + +* *The Result type*, which allows functions to signal error conditions through + the value that they return. Together with a lint and the `try!` macro, + `Result` works well for fine-grained error handling. + +However, while there have been some general trends in the usage of the two +handling mechanisms, we need to have formal guidelines in order to ensure +consistency as we stabilize library APIs. That is the purpose of this RFC. + +For the most part, the RFC proposes guidelines that are already followed today, +but it tries to motivate and clarify them. + +# Detailed design + +Errors fall into one of three categories: + +* Catastrophic errors, e.g. out of memory. +* Contract violations, e.g. wrong input encoding, index out of bounds. +* Obstructions, e.g. file not found, parse error. + +The basic principle of the conventions is that: + +* Catastrophic errors and programming errors (bugs) can and should only be +recovered at a *coarse grain*, i.e. a task boundary. +* Obstructions preventing an operation should be reported at a maximally *fine +grain* -- to the immediate invoker of the operation. + +## Catastrophic errors + +An error is _catastrophic_ if there is no meaningful way for the current task to +continue after the error occurs. + +Catastrophic errors are _extremely_ rare, especially outside of `libstd`. + +**Canonical examples**: out of memory, stack overflow. + +### For catastrophic errors, abort the process or fail the task. + +For errors like out-of-memory, even correctly unwinding is impossible (since it +may require allocation); the process is aborted. + +For errors like stack overflow, Rust currently aborts the process, but could in +principle fail the task, which would allow reporting and recovery from a +supervisory task. + +## Contract violations + +An API may define a contract that goes beyond the type checking enforced by the +compiler. For example, slices support an indexing operation, with the contract +that the supplied index must be in bounds. + +Contracts can be complex. For example, the `RefCell` type requires that +`borrow_mut` not be called until all existing borrows have been relinquished. + +### For contract violations, fail the task. + +A contract violation is always a bug, and for bugs we follow the Erlang +philosophy of "let it crash": we assume that software *will* have bugs, and we +design coarse-grained task boundaries to report, and perhaps recover, from these +bugs. + +### Contract design + +One subtle aspect of these guidelines is that the contract for a function is +chosen by an API designer -- and so the designer also determines what counts as +a violation. + +This RFC does not attempt to give hard-and-fast rules for designing +contracts. However, here are some rough guidelines: + +* Prefer expressing contracts through static types whenever possible. + +* It *must* be possible to write code that uses the API without violating the + contract. + +* Contracts are most justified when violations are *inarguably* bugs -- but this + is surprisingly rare. + +* Consider whether the API client could benefit from the contract-checking + logic. The checks may be expensive. Or there may be useful programming + patterns where the client does not want to check inputs before hand, but would + rather attempt the operation and then find out whether the inputs were invalid. + +* When in doubt, use loose contracts and instead return a `Result` or `Option`. + +## Obstructions + +An operation is *obstructed* if it cannot be completed for some reason, even +though the operation's contract has been satisfied. Obstructed operations must +still leave the relevant data structures in a coherent state. + +Obstructions may involve external conditions (e.g., I/O), or they may involve +aspects of the input that are not covered by the contract. + +**Canonical examples**: file not found, parse error. + +### For obstructions, use `Result` (or `Option`). + +The +[`Result` type](http://static.rust-lang.org/doc/master/std/result/index.html) +represents either a success (yielding `T`) or failure (yielding `E`). By +returning a `Result`, a function allows its clients to discover and react to +obstructions in a fine-grained way. + +If there are multiple ways an operation might be obstructed, or there is useful +information about the obstruction (such as where in the input a parse error +occurred), prefer to use `Result`. For operations with a single obvious +obstruction (like popping from an empty stack), use `Option`. + +(Currently, `Option` does not interact well with other error-handling +infrastructure like `try!`, but this will likely be improved in the future.) + +## Do not provide both `Result` and `fail!` variants. + +An API should not provide both `Result`-producing and `fail`ing versions of an +operation. It should provide just the `Result` version, allowing clients to use +`try!` or `unwrap` instead as needed. This is part of the general pattern of +cutting down on redundant variants by instead using method chaining. + +There is one exception to this rule, however. Some APIs are strongly oriented +around failure, in the sense that their functions/methods are explicitly +intended as assertions. If there is no other way to check in advance for the +validity of invoking an operation `foo`, however, the API may provide a +`foo_catch` variant that returns a `Result`. + +The main examples in `libstd` providing both variants are: + +* Channels, which are the primary point of failure propagation between tasks. As + such, calling `recv()` is an _assertion_ that the other end of the channel is + still alive, which will propagate failures from the other end of the + channel. On the other hand, since there is no separate way to atomically test + whether the other end has hung up, channels provide a `recv_opt` variant that + produces a `Result`. + + > **[FIXME]** The `_opt` suffix needs to be replaced by a `_catch` suffix. + +* `RefCell`, which provides a dynamic version of the borrowing rules. Calling + the `borrow()` method is intended as an assertion that the cell is in a + borrowable state, and will `fail!` otherwise. On the other hand, there is no + separate way to check the state of the `RefCell`, so the module provides a + `try_borrow` variant that produces a `Result`. + + > **[FIXME]** The `try_` prefix needs to be replaced by a `_catch` catch. + +# Drawbacks + +The main drawbacks of this proposal are: + +* Task failure remains somewhat of a landmine: one must be sure to document, and + be aware of, all relevant function contracts in order to avoid task failure. + +* The choice of what to make part of a function's contract remains somewhat + subjective, so these guidelines cannot be used to decisively resolve + disagreements about an API's design. + +The alternatives mentioned below do not suffer from these problems, but have +drawbacks of their own. + +# Alternatives + +[Two](https://github.com/rust-lang/rfcs/pull/204) +[alternative](https://github.com/rust-lang/rfcs/pull/220) designs have been +given in earlier RFCs, both of which take a much harder line on using `fail!` +(or, put differently, do not allow most functions to have contracts). + +As was +[pointed out by @SiegeLord](https://github.com/rust-lang/rfcs/pull/220#issuecomment-54715268), +however, mixing what might be seen as contract violations with obstructions can +make it much more difficult to write obstruction-robust code; see the linked +comment for more detail. From 6abf70abcebdcb38c5c7bed1aab1262645f3d93a Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Thu, 11 Sep 2014 16:20:55 -0700 Subject: [PATCH 2/4] Clarify FIXMEs --- active/0000-error-conventions.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/active/0000-error-conventions.md b/active/0000-error-conventions.md index 5b53bf53016..db983fe9c29 100644 --- a/active/0000-error-conventions.md +++ b/active/0000-error-conventions.md @@ -165,7 +165,8 @@ The main examples in `libstd` providing both variants are: whether the other end has hung up, channels provide a `recv_opt` variant that produces a `Result`. - > **[FIXME]** The `_opt` suffix needs to be replaced by a `_catch` suffix. + > Note: the `_opt` suffix would be replaced by a `_catch` suffix if this RFC + > is accepted. * `RefCell`, which provides a dynamic version of the borrowing rules. Calling the `borrow()` method is intended as an assertion that the cell is in a @@ -173,7 +174,8 @@ The main examples in `libstd` providing both variants are: separate way to check the state of the `RefCell`, so the module provides a `try_borrow` variant that produces a `Result`. - > **[FIXME]** The `try_` prefix needs to be replaced by a `_catch` catch. + > Note: the `try_` prefix would be replaced by a `_catch` catch if this RFC is + > accepted. # Drawbacks From 8e6d015e53ec71beba023f06fd66162a44b58a5d Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Sun, 14 Sep 2014 20:56:53 -0700 Subject: [PATCH 3/4] Revision: add some clarifications and naming alternatives --- active/0000-error-conventions.md | 42 +++++++++++++++++++++++++++----- 1 file changed, 36 insertions(+), 6 deletions(-) diff --git a/active/0000-error-conventions.md b/active/0000-error-conventions.md index db983fe9c29..0cf0379c3b5 100644 --- a/active/0000-error-conventions.md +++ b/active/0000-error-conventions.md @@ -82,8 +82,9 @@ An API may define a contract that goes beyond the type checking enforced by the compiler. For example, slices support an indexing operation, with the contract that the supplied index must be in bounds. -Contracts can be complex. For example, the `RefCell` type requires that -`borrow_mut` not be called until all existing borrows have been relinquished. +Contracts can be complex and involve more than a single function invocation. For +example, the `RefCell` type requires that `borrow_mut` not be called until all +existing borrows have been relinquished. ### For contract violations, fail the task. @@ -114,13 +115,22 @@ contracts. However, here are some rough guidelines: patterns where the client does not want to check inputs before hand, but would rather attempt the operation and then find out whether the inputs were invalid. +* When a contract violation is the *only* kind of error a function may encounter + -- i.e., there are no obstructions to its success other than "bad" inputs -- + using `Result` or `Option` instead is especially warranted. Clients can then use + `unwrap` to assert that they have passed valid input, or re-use the error + checking done by the API for their own purposes. + * When in doubt, use loose contracts and instead return a `Result` or `Option`. ## Obstructions An operation is *obstructed* if it cannot be completed for some reason, even -though the operation's contract has been satisfied. Obstructed operations must -still leave the relevant data structures in a coherent state. +though the operation's contract has been satisfied. Obstructed operations may +have (documented!) side effects -- they are not required to roll back after +encountering an obstruction. However, they should leave the data structures in +a "coherent" state (satisfying their invariants, continuing to guarantee safety, +etc.). Obstructions may involve external conditions (e.g., I/O), or they may involve aspects of the input that are not covered by the contract. @@ -138,7 +148,8 @@ obstructions in a fine-grained way. If there are multiple ways an operation might be obstructed, or there is useful information about the obstruction (such as where in the input a parse error occurred), prefer to use `Result`. For operations with a single obvious -obstruction (like popping from an empty stack), use `Option`. +obstruction that can provide no additional information (e.g. popping from an +empty stack), use `Option`. (Currently, `Option` does not interact well with other error-handling infrastructure like `try!`, but this will likely be improved in the future.) @@ -156,7 +167,7 @@ intended as assertions. If there is no other way to check in advance for the validity of invoking an operation `foo`, however, the API may provide a `foo_catch` variant that returns a `Result`. -The main examples in `libstd` providing both variants are: +The main examples in `libstd` that *currently* provide both variants are: * Channels, which are the primary point of failure propagation between tasks. As such, calling `recv()` is an _assertion_ that the other end of the channel is @@ -177,6 +188,8 @@ The main examples in `libstd` providing both variants are: > Note: the `try_` prefix would be replaced by a `_catch` catch if this RFC is > accepted. +(Note: it is unclear whether these APIs will continue to provide both variants.) + # Drawbacks The main drawbacks of this proposal are: @@ -203,3 +216,20 @@ As was however, mixing what might be seen as contract violations with obstructions can make it much more difficult to write obstruction-robust code; see the linked comment for more detail. + +## Naming + +There are numerous possible suffixes for a `Result`-producing variant: + +* `_catch`, as proposed above. As + [@kballard points out](https://github.com/rust-lang/rfcs/pull/236#issuecomment-55344336), + this name connotes exception handling, which could be considered + misleading. However, since it effectively prevents further unwinding, catching + an exception may indeed be the right analogy. + +* `_result`, which is straightforward but not as informative/suggestive as some + of the other proposed variants. + +* `try_` prefix. Also connotes exception handling, but has an unfortunately + overlap with the common use of `try_` for nonblocking variants (which is in + play for `recv` in particular). From aee1de5371cfaea445796e7faa0596b509ab7bb6 Mon Sep 17 00:00:00 2001 From: Aaron Turon Date: Wed, 15 Oct 2014 08:31:57 -0700 Subject: [PATCH 4/4] Clarifications. --- {active => text}/0000-error-conventions.md | 38 +++++++++++++--------- 1 file changed, 22 insertions(+), 16 deletions(-) rename {active => text}/0000-error-conventions.md (89%) diff --git a/active/0000-error-conventions.md b/text/0000-error-conventions.md similarity index 89% rename from active/0000-error-conventions.md rename to text/0000-error-conventions.md index 0cf0379c3b5..d71f67b1d4b 100644 --- a/active/0000-error-conventions.md +++ b/text/0000-error-conventions.md @@ -47,7 +47,7 @@ but it tries to motivate and clarify them. Errors fall into one of three categories: -* Catastrophic errors, e.g. out of memory. +* Catastrophic errors, e.g. out-of-memory. * Contract violations, e.g. wrong input encoding, index out of bounds. * Obstructions, e.g. file not found, parse error. @@ -67,14 +67,11 @@ Catastrophic errors are _extremely_ rare, especially outside of `libstd`. **Canonical examples**: out of memory, stack overflow. -### For catastrophic errors, abort the process or fail the task. +### For catastrophic errors, fail the task. -For errors like out-of-memory, even correctly unwinding is impossible (since it -may require allocation); the process is aborted. - -For errors like stack overflow, Rust currently aborts the process, but could in -principle fail the task, which would allow reporting and recovery from a -supervisory task. +For errors like stack overflow, Rust currently aborts the process, but +could in principle fail the task, which (in the best case) would allow +reporting and recovery from a supervisory task. ## Contract violations @@ -137,7 +134,7 @@ aspects of the input that are not covered by the contract. **Canonical examples**: file not found, parse error. -### For obstructions, use `Result` (or `Option`). +### For obstructions, use `Result` The [`Result` type](http://static.rust-lang.org/doc/master/std/result/index.html) @@ -145,14 +142,23 @@ represents either a success (yielding `T`) or failure (yielding `E`). By returning a `Result`, a function allows its clients to discover and react to obstructions in a fine-grained way. -If there are multiple ways an operation might be obstructed, or there is useful -information about the obstruction (such as where in the input a parse error -occurred), prefer to use `Result`. For operations with a single obvious -obstruction that can provide no additional information (e.g. popping from an -empty stack), use `Option`. +#### What about `Option`? + +The `Option` type should not be used for "obstructed" operations; it +should only be used when a `None` return value could be considered a +"successful" execution of the operation. + +This is of course a somewhat subjective question, but a good litmus +test is: would a reasonable client ever ignore the result? The +`Result` type provides a lint that ensures the result is actually +inspected, while `Option` does not, and this difference of behavior +can help when deciding between the two types. -(Currently, `Option` does not interact well with other error-handling -infrastructure like `try!`, but this will likely be improved in the future.) +Another litmus test: can the operation be understood as asking a +question (possibly with sideeffects)? Operations like `pop` on a +vector can be viewed as asking for the contents of the first element, +with the side effect of removing it if it exists -- with an `Option` +return value. ## Do not provide both `Result` and `fail!` variants.