Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The big error refactor #2566

Merged
merged 29 commits into from
Apr 13, 2022
Merged

The big error refactor #2566

merged 29 commits into from
Apr 13, 2022

Conversation

spacekookie
Copy link
Contributor

@spacekookie spacekookie commented Mar 18, 2022

(original message) This PR is a current rebase and fix of #2492, while also applying its changes to how ockam_node uses error data.

Current state

What a massive refactor this is turning out to be but I think we're almost there.

  • Replace all instances of ockam_core::old_error with ockam_core::Error2
  • Roughly map existing error codes and structures onto (Origin, Kind) tuple
  • Rename Error2 to Error
  • Simplify Error2 API to remove requirement for ErrorCode to be public
  • Adjust Kind types for crate-local errors (currently I've mostly chosen Misuse and Invalid. This is not accurate for all wrapped error types and we should have a think about how we classify each category of error to find the most appropriate mapping. More importantly: do we want to encourage users to handle different kinds of errors differently? Or should we do so internally? i.e. panic on some but not others?)
  • Fix no_std usage

@spacekookie spacekookie force-pushed the spacekookie/ga/error-refactor branch from 2c5a9b5 to 5b6b286 Compare March 18, 2022 13:55
@spacekookie
Copy link
Contributor Author

There's one issue on no_std that I can't really figure out why it doesn't work and how it's supposed to work @thomcc maybe you can have a look at it?

error[E0277]: the trait bound `Box<(dyn compat::error::Error + Sync + core::marker::Send + 'static)>: From<old_error::Error>` is not satisfied
  --> implementations/rust/ockam/ockam_core/src/error/mod.rs:88:25
   |
88 |         Error2::new(ec, src).context("domain", orig_domain)
   |         -----------     ^^^ the trait `From<old_error::Error>` is not implemented for `Box<(dyn compat::error::Error + Sync + core::marker::Send + 'static)>`
   |         |
   |         required by a bound introduced by this call
   |
   = help: the following implementations were found:
             <Box<(dyn core2::error::Error + 'a)> as From<E>>
             <Box<(dyn core2::error::Error + 'static)> as From<&str>>
             <Box<(dyn core2::error::Error + 'static)> as From<Cow<'a, str>>>
             <Box<(dyn core2::error::Error + 'static)> as From<alloc::string::String>>
           and 13 others
   = note: required because of the requirements on the impl of `Into<Box<(dyn compat::error::Error + Sync + core::marker::Send + 'static)>>` for `old_error::Error`
note: required by a bound in `Error2::new`
  --> implementations/rust/ockam/ockam_core/src/error/mod.rs:36:12
   |
34 |     pub fn new<E>(code: code::ErrorCode, cause: E) -> Self
   |            --- required by a bound in this
35 |     where
36 |         E: Into<Box<dyn crate::compat::error::Error + Send + Sync>>,
   |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ required by this bound in `Error2::new`

@thomcc
Copy link
Contributor

thomcc commented Mar 22, 2022

Didn't get to this today, but will take a look tomorrow. Feel free to message me if there's anything else you need from me tomorrow (stuff to land aside from pipe2, I mean).

@spacekookie spacekookie force-pushed the spacekookie/ga/error-refactor branch from 5b6b286 to 7572686 Compare March 24, 2022 16:36
@spacekookie spacekookie changed the title WIP: error refactor in ockam_node WIP: the big error refactor Mar 29, 2022
@spacekookie spacekookie force-pushed the spacekookie/ga/error-refactor branch 2 times, most recently from 0444b8f to acad2f4 Compare March 29, 2022 15:32
@spacekookie
Copy link
Contributor Author

spacekookie commented Mar 29, 2022

Progress update

I talked to @thomcc and I think we should push through with the error refactoring on this branch. It'll create too much breakage everywhere else (looking at CI...) if we do this gradually. So let's just pull off the bandaid.

Some thoughts on the new error system and how we should use it below

Crate-local vs module-local error types

Currently we have crate-local error types for node, ockam, transport, etc. For some of these systems a crate-wide error type makes sens. But for other things they don't. Because the underlying error types are more flexible we can have more granular Error types. Any underlying failure information (for example from a bad serialisation, etc) should then be included in the local error type.

Following is an incomplete list of error types we need to re-introduce into this system:

  • ockam_core::routing
  • ockam_core::messages (encoding/ schema errors down the line)
  • ockam_node::context (these should be high level errors aimed to communicate as much as possible to users about how they misused the API)
  • ockam_node::internal (currently all "internal io failure" errors)

... to be expanded

Updated crates

  • ockam_core
  • ockam_node
  • ockam_transport_core
  • ockam_transport_tcp
  • ockam_channel
  • ockam

@spacekookie spacekookie force-pushed the spacekookie/ga/error-refactor branch 4 times, most recently from d75c4ad to 55af930 Compare March 31, 2022 14:37
@spacekookie spacekookie changed the title WIP: the big error refactor The big error refactor Mar 31, 2022
@spacekookie spacekookie marked this pull request as ready for review March 31, 2022 14:38
Copy link
Contributor

@thomcc thomcc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This largely looks good to me. Someone else should review ockam_core::error::* though, since I wrote that code.

#[cfg(test)]
mod test {
use ockam_core::compat::collections::HashMap;
// #[cfg(test)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably just delete.

ockam_core::Error::new(BleError::DOMAIN_CODE + e.code(), BleError::DOMAIN_NAME)
impl From<BleError> for Error2 {
fn from(err: BleError) -> Error2 {
Error2::new(ErrorCode::new(Origin::Transport, Kind::Io), err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of these should have more specific categories. E.g. timeout, etc.

@spacekookie spacekookie force-pushed the spacekookie/ga/error-refactor branch 3 times, most recently from bc5a803 to 84484e9 Compare April 7, 2022 15:55
@thomcc
Copy link
Contributor

thomcc commented Apr 8, 2022

@spacekookie

  1. Ended up having to remove thiserror -- doesn't support no_std :(. I also think we'd probably like a macro of our
  2. Removed NoneError too. Two reasons:
    • The design already supported unknown cause (hence the cause being an Option).
    • You were using it to indicate missing/unknown error cause, but the display impl said "Option was None" (or something like this), which would be confusing in the future, and was probably a mistake easy for others to make

That said this all revealed some serious flaws in this design! In particular, the no_std error handling becomes extremely painful. Use of Into<Box<dyn Error>> the way we always do in libstd is... not viable here (while we have a std::error::Error-like trait in compat, external error types do not use it). In the short term I hacked around this, but long term I'll need to think more, and maybe ask for some advice.

Regarding other failures: I think the FFI error changes may need to be reverted entirely -- Elixir failing means the are probably depending on some error code being a specific value. That, or you can fix the elixir (I'm unsure how widespread it is, and suspect our tests catching it may be luck (since I'd guess our elixir code does not test coverage of error handling much). That said, FfiError can be its own thing for a bit, since ockam-ffi is not a dependency of anything rust stuff.

(EDIT: actually, it doesn't seem to be failing now maybe? IDK)

@thomcc
Copy link
Contributor

thomcc commented Apr 8, 2022

@spacekookie Alright, I think that's all the CI failures, but you can take anything that's left (and the rebase).

@spacekookie
Copy link
Contributor Author

spacekookie commented Apr 8, 2022

@thomcc:

1. Ended up having to remove `thiserror` -- doesn't support no_std :(. I also think we'd probably like a macro of our

Well that's disappointing. I thought it supported no_std 😔 Oh well, manual Display's it is.

2. Removed NoneError too. Two reasons:
   
   * The design already supported unknown cause (hence the cause being an Option).
   * You were using it to indicate missing/unknown error cause, but the display impl said "Option was None" (or something like this), which would be confusing in the future, and was probably a mistake easy for others to make

Yea true, I first added the NoneError and then added the _without_cause() functions which made it irrelevant again. I was kinda on the fence around keeping it in at this point so good riddance 😁

Anyway, thanks for pushing it across the finish line. I'll give it another round of review today and then I think we should merge this ASAP before something else needs to be re-based

@spacekookie
Copy link
Contributor Author

spacekookie commented Apr 8, 2022

Interestingly enough there's a thiserror PR that adds no_std support but hasn't been touched since 2020 dtolnay/thiserror#64

spacekookie and others added 23 commits April 13, 2022 17:49
This commit marks some tests as "ignore" because the constraints they
were checking for don't exist anymore.  We should evaluate each test
in accordance of its constraints and then re-write them for the new system
This commit adds a new class of crate error to ockam_node while
refactoring the previous `RouterError`.  The new error types map onto
classes of problems, which interact with the `Kind` system introduced
by ockam_core and attaching any relevant causing errors via the
`.context(..)` mechanism.
@spacekookie spacekookie force-pushed the spacekookie/ga/error-refactor branch from e34867e to c553fb9 Compare April 13, 2022 16:18
@spacekookie spacekookie force-pushed the spacekookie/ga/error-refactor branch from 57d8a53 to 61d539b Compare April 13, 2022 16:51
Copy link
Contributor

@thomcc thomcc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's more to do here but can mostly be done in a follow up. Thanks! I'll be merging this.

@thomcc thomcc merged commit 61d539b into develop Apr 13, 2022
@thomcc thomcc deleted the spacekookie/ga/error-refactor branch May 1, 2022 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants