Error wrapping and conversion between modules #68

ancazamfir · 2020-05-20T13:50:04Z

Summary

Current relayer uses the anomaly crate for defining error "kinds" and handling errors. It is not clear if and how conversion between different errors or "kinds" can be done.

Problem Definition

There are interactions between IBC modules where error handling is currently awkward. For example the channel message creation (ICS4) verifies that input is properly formatted by invoking validation part of the "host" (ICS24):

   (...
          port_id: String,
    )
....
            port_id: port_id
                .parse()
                .map_err(|e| crate::ics04_channel::error::Kind::IdentifierError.context(e))?,

This would be nice:

            port_id: port_id.parse()?,

For Admin Use

Not duplicate issue
Appropriate labels applied
Appropriate contributors tagged
Contributor assigned/self-assigned

The text was updated successfully, but these errors were encountered:

adizere · 2020-11-12T15:58:27Z

Seems like tm-rs is getting some activity on this front soon: informalsystems/tendermint-rs#669.

Since the anomaly crate will become deprecated, we'll also need to move away to something else.

tarcieri · 2020-11-12T16:44:40Z

Here's an example of moving from anomaly to eyre: iqlusioninc/crates#555

eyre has a unified eyre::Report type for errors and an eyre::Result type that eliminates the need for the generic error parameter.

Together those should eliminate the need for any error wrapping/conversions entirely. Here's an example of what an error.rs looks like after switching to eyre:

https://github.com/iqlusioninc/crates/pull/555/files#diff-e1562983b9a9dca8f286969d7c6506314d6f3c215da068f3172356961d0cc048

Note that we also plan on moving Abscissa to use eyre, although I don't have an exact timeframe on that.

adizere · 2021-01-02T09:33:15Z

After playing with eyre (#493), and given our experience with anomaly so far, here are some observations:

We cannot get rid of all 'awkwardness' around error handling

With this issue we originally wanted to get rid of patterns such as

   (...
          port_id: String,
    )
....
            port_id: port_id
                .parse()
                .map_err(|e| crate::ics04_channel::error::Kind::IdentifierError.context(e))?,

and replace them with:

            port_id: port_id.parse()?,

This replacement is possible with eyre, but not always desirable. Sometimes we may want to chain errors and wrap them into one another to capture both the high-level error cause as well as the lower-level cause(s). In the example above, error chaining is useful because it will report on a high level error (namely, channel msg. creation) and also point to an underlying error (identifier error in parsing the port id). Below is an example where error chaining is not useful, and we can get rid of it without impacting debugging/usability:

With error wrapping

For the keys list command, we can wrap any lower-level keybase error into a ErrorMsg::Keys error:
https://github.com/informalsystems/ibc-rs/blob/e4629358526564e26850c071b4cb030485d52c8c/relayer-cli/src/commands/keys/list.rs#L50-L55

The output in the error case would look like this:

keys list failed: keys error: Keybase error

Without error wrapping

The code becomes:

let res = list_keys(opts);
// ...

Output becomes:

keys list failed: Keybase error

As a rule of thumb, I think that the deeper we are in the stack, the more we may wish to chain errors (with wrap/map) instead of simply propagating them.

Some advantages of using eyre

We can simplify by removing the `.context`

For example this

match value.type_url.as_str() {
            TENDERMINT_CONSENSUS_STATE_TYPE_URL => Ok(AnyConsensusState::Tendermint(
                TendermintConsensusState::decode_vec(&value.value)
                    .map_err(|e| Kind::InvalidRawConsensusState.context(e))?,

becomes this (last line is relevant)

match value.type_url.as_str() {
            TENDERMINT_CONSENSUS_STATE_TYPE_URL => Ok(AnyConsensusState::Tendermint(
                TendermintConsensusState::decode_vec(&value.value)
                    .wrap_err(Kind::InvalidRawConsensusState)?,

or another example, where this:

let raw_client_state = raw
            .client_state
            .ok_or_else(|| Kind::InvalidRawClientState.context("missing client state"))?;

becomes this:

let raw_client_state = raw
            .client_state
            .ok_or_else(|| eyre!("missing client_state"))?;

Instead of eyre!("missing client_state") we could alternatively introduce a typed error such as Kind::MissingClientState. But the former solution was simpler to adopt for this early stage.

Some error conversion becomes simpler

In the process method of handler/conn_open_ack.rs, for example:

        None => {
            // No connection end exists for this conn. identifier. Impossible to continue handshake.
-            Err(Into::<Error>::into(Kind::UninitializedConnection(
-                msg.connection_id().clone(),
-           )))
+            Err(Kind::UninitializedConnection(msg.connection_id().clone()))

Some naming conventions become simpler

We used to have two types for managing errors (see also informalsystems/tendermint-rs#150 (comment)):

pub type Error = anomaly::Error<Kind>; which is redefined in every module and assigned the same anomaly type
pub enum Kind is the enumeration of module-specific errors

Using eyre we:

get rid of redefining the Error type everywhere; instead, we use eyre::Report. For example:

 impl TryFrom<Any> for AnyHeader {
-    type Error = crate::ics02_client::error::Error;
+    type Error = eyre::Report;

keep the enumeration Kind and rename that into Error (this is how eyre seems to be intended for use, example)

Disadvantages of using eyre

the eyre::Report type contains the error and that makes it difficult to handle errors by match or reporting (see this downcast example or this) on them
In general, it seems like we are moving away from core Rust. Below are two examples of code (one where eyre::Result replaces core::Result, and one where we replace map_err with wrap_err) that looks cleaner with eyre, but has the downside of making us more dependent on the eyre library.

-    pub fn sub(&self, delta: u64) -> Result<Height, Error> {
+    pub fn sub(&self, delta: u64) -> eyre::Result<Height> {

-                    .map_err(|e| Kind::InvalidRawClientState.context(e))?,
+                    .wrap_err(Error::InvalidRawClientState)?,

tarcieri · 2021-01-02T16:00:48Z

In general, it seems like we are moving away from core Rust. Below are two examples of code (one where eyre::Result replaces core::Result, and one where we replace map_err with wrap_err) that looks cleaner with eyre, but has the downside of making us more dependent on the eyre library.

I'd say it's pretty common to define aliases of Result regardless of what error handling approach is being used.

I think it's especially nice for consumers of a downstream crate which need to impl a trait that uses the downstream crate's Result type, e.g. -> somecrate::Result<T>.

io::Result is an example of this in std.

adizere · 2021-01-04T10:01:13Z

Thank you @tarcieri, I wasn't familiar that this is a common approach. The example with std::io::Result drives the point home very well. I guess the second disadvantage I mentioned ("moving away from core") is not so much of a problem.

The other disadvantage I mentioned above has to do with the difficulty of matching or handling a eyre::Report. As stated in their docs:

Use eyre if you don't think you'll do anything with an error other than report it.

In the ibc-rs modules, most errors are unrecoverable, and I can't think of a case where we'd need to handle an error. In the relayer, however, there is a subset of all possible errors we'd like to handle, for example:

failed to submit a tx, and therefore we'd want to retry that,
failed a light client operation, and we may want to retry that,
failed to get a query responses from a full node, which we may want to retry as well.

All of these are network calls. That being said, it will not be impossible to match the error and retry (just not very clean).

romac · 2021-01-04T15:22:50Z

We cannot get rid of all 'awkwardness' around error handling

...

Sometimes we may want to chain errors and wrap them into one another to capture both the high-level error cause as well as the lower-level cause(s). In the example above, error chaining is useful because it will report on a high level error (namely, channel msg. creation) and also point to an underlying error (identifier error in parsing the port id).

Agreed. A potential solution for tackling the verbosity of wrapping the parse error in a high-level error would be to introduce a parse_port_id function somewhere in the ics04_channel module:

fn parse_port_id(port_id: &str) -> Result<PortId, crate::ics04_channel::error::Error> {
  port_id.parse().map_err(crate::ics04_channel::error::Error::InvalidIdentifier)
}

which would be used as:

...
  port_id: parse_port_id(&port_id)?,
...

with the crate::ics04_channel::error::Error being defined as:

#[derive(Error)]
enum Error {
  InvalidIdentifier(ValidationError),
}

romac · 2021-01-04T15:31:44Z

As a rule of thumb, I think that the deeper we are in the stack, the more we may wish to chain errors (with wrap/map) instead of simply propagating them.

Agreed. In my opinion, library APIs should return specific, nested errors via a Result<T, my_module::Error> type. The eyre::Report type should only be used in "top-level" application code in cases where one does not need to statically know about the specific kind of error being returned (keeping in mind such code can always fall back on downcasts to recover the specific underlying error if needed).

tarcieri · 2021-01-04T15:32:20Z

...the difficulty of matching or handling a eyre::Report. As stated in their docs:

Use eyre if you don't think you'll do anything with an error other than report it.

In the ibc-rs modules, most errors are unrecoverable, and I can't think of a case where we'd need to handle an error. In the relayer, however, there is a subset of all possible errors we'd like to handle

@adizere in general it seems people use a combination of error handling strategies, selecting which one most appropriately fits their needs for each given scenario.

This generally looks like using an enum (with a library like thiserror if you so desire) for lower-level libraries, and a library like eyre (or alternatively anyhow, but eyre seems to have largely surpassed it) for higher-level error handling, particularly at the application level.

romac · 2021-01-04T15:35:29Z

or another example, where this:
let raw_client_state = raw
            .client_state
            .ok_or_else(|| Kind::InvalidRawClientState.context("missing client state"))?;
becomes this:
let raw_client_state = raw
            .client_state
            .ok_or_else(|| eyre!("missing client_state"))?;
Instead of eyre!("missing client_state") we could alternatively introduce a typed error such as Kind::MissingClientState. But the former solution was simpler to adopt for this early stage.

I'd argue that introducing a specific error type is both cleaner and more useful as it allows the enclosing function to return a Result<T, my_module::Error> rather than have to use eyre::Report (cf. #68 (comment) and #68 (comment)).

romac · 2021-01-04T15:55:06Z

This generally looks like using an enum (with a library like thiserror if you so desire) for lower-level libraries, and a library like eyre (or alternatively anyhow, but eyre seems to have largely surpassed it) for higher-level error handling, particularly at the application level.

Agreed.

romac · 2021-01-04T15:56:13Z

Regarding .context, I would add that while anomaly's context method is very useful to add "dynamic" context to an error, we should instead be more principled and define appropriate hierarchies of nested error types and use those at the library level.

romac · 2021-01-04T16:08:44Z

@adizere Thanks for the great write-up!

In the ibc-rs modules, most errors are unrecoverable, and I can't think of a case where we'd need to handle an error. In the relayer, however, there is a subset of all possible errors we'd like to handle, for example:

failed to submit a tx, and therefore we'd want to retry that,

failed a light client operation, and we may want to retry that,

failed to get a query responses from a full node, which we may want to retry as well.

All of these are network calls. That being said, it will not be impossible to match the error and retry (just not very clean).

In general, I agree with your conclusions, and in light of the comment of yours quoted above, I would suggest we do the following:

return specific errors (ie. Result<T, MyError>) in the modules crate (those being defined as plain, nested enums with help from the this error crate for deriving the std::error::Error trait)
return dynamic errors (eyre::Report) via eyre::Result<T> aliased as relayer::Result<T> in the relayer crate
same in the relayer-cli crate

This would let us:

a) precisely define all the various errors and their relationship in the low-level modules crate
b) easily catch/handle the errors yielded by the modules crate in the relayer, eg. for retrying some operations
c) easily bubble up any errors we deem unrecoverable from the relayer and relayer-cli crate

What do you think?

adizere · 2021-01-05T09:34:17Z

What do you think?

In short, I love it!

Both the thread of discussion as well as the last suggestions are great, so I have nothing to add at this moment. I'll align #493 with these comments and then we'll see where we go from there. Many thanks @tarcieri & @romac.

adizere · 2021-08-16T06:57:14Z

This issue is no longer relevant with the introduction of flex-error #988

ancazamfir added the I: logic Internal: related to the relaying logic label May 20, 2020

ancazamfir added this to the 0.6-6mo milestone May 20, 2020

ancazamfir changed the title ~~Error wrapping and conversion betweeen module~~ Error wrapping and conversion between modules May 26, 2020

ancazamfir modified the milestones: 0.6-6mo, 0.7-7mo Jun 16, 2020

ancazamfir mentioned this issue Jun 29, 2020

High Level Goal - IBC Relayer #114

Closed

ancazamfir modified the milestones: v.0.0.3, v0.0.7 Aug 14, 2020

This was referenced Nov 6, 2020

IBC Modules ADR #3

Closed

Relayer v0 MVP #385

Closed

adizere added the I: dependencies Internal: related to dependencies label Nov 12, 2020

ancazamfir modified the milestones: v0.0.7, v0.0.8 Dec 23, 2020

adizere mentioned this issue Jan 2, 2021

WIP Investigation into replacing anomaly with eyre for error handling #493

Closed

5 tasks

adizere modified the milestones: v0.1.1, v0.1.2 Feb 9, 2021

adizere mentioned this issue Sep 28, 2022

MBT low-friction maintenance cosmos/ibc-rs#29

Closed

5 tasks

adizere mentioned this issue May 17, 2021

Fine grained error handling #950

Closed

5 tasks

adizere closed this as completed Aug 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error wrapping and conversion between modules #68

Error wrapping and conversion between modules #68

ancazamfir commented May 20, 2020

adizere commented Nov 12, 2020 •

edited

Loading

tarcieri commented Nov 12, 2020

adizere commented Jan 2, 2021

tarcieri commented Jan 2, 2021 •

edited

Loading

adizere commented Jan 4, 2021

romac commented Jan 4, 2021 •

edited

Loading

We cannot get rid of all 'awkwardness' around error handling

romac commented Jan 4, 2021 •

edited

Loading

tarcieri commented Jan 4, 2021

romac commented Jan 4, 2021 •

edited

Loading

romac commented Jan 4, 2021

romac commented Jan 4, 2021

romac commented Jan 4, 2021 •

edited

Loading

adizere commented Jan 5, 2021

adizere commented Aug 16, 2021

Error wrapping and conversion between modules #68

Error wrapping and conversion between modules #68

Comments

ancazamfir commented May 20, 2020

Summary

Problem Definition

For Admin Use

adizere commented Nov 12, 2020 • edited Loading

tarcieri commented Nov 12, 2020

adizere commented Jan 2, 2021

We cannot get rid of all 'awkwardness' around error handling

With error wrapping

Without error wrapping

Some advantages of using eyre

We can simplify by removing the .context

Some error conversion becomes simpler

Some naming conventions become simpler

Disadvantages of using eyre

tarcieri commented Jan 2, 2021 • edited Loading

adizere commented Jan 4, 2021

romac commented Jan 4, 2021 • edited Loading

We cannot get rid of all 'awkwardness' around error handling

romac commented Jan 4, 2021 • edited Loading

tarcieri commented Jan 4, 2021

romac commented Jan 4, 2021 • edited Loading

romac commented Jan 4, 2021

romac commented Jan 4, 2021

romac commented Jan 4, 2021 • edited Loading

adizere commented Jan 5, 2021

adizere commented Aug 16, 2021

adizere commented Nov 12, 2020 •

edited

Loading

We can simplify by removing the `.context`

tarcieri commented Jan 2, 2021 •

edited

Loading

romac commented Jan 4, 2021 •

edited

Loading

romac commented Jan 4, 2021 •

edited

Loading

romac commented Jan 4, 2021 •

edited

Loading

romac commented Jan 4, 2021 •

edited

Loading