Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: function! macro for error context #317

Merged
merged 3 commits into from
Dec 5, 2023
Merged

Conversation

ibeckermayer
Copy link
Contributor

@zmb3 suggested we add this for Teleport to avoid copy-paste errors wherever we're using these macros, however it occurred to me that this might best fit as a more general solution throughout IronRDP.

I tested this by applying the following patch

diff --git a/crates/ironrdp-cliprdr/src/pdu/capabilities.rs b/crates/ironrdp-cliprdr/src/pdu/capabilities.rs
index 508b4e5..34381af 100644
--- a/crates/ironrdp-cliprdr/src/pdu/capabilities.rs
+++ b/crates/ironrdp-cliprdr/src/pdu/capabilities.rs
@@ -166,10 +166,10 @@ impl<'de> PduDecode<'de> for CapabilitySet {
         let _length = src.read_u16();
 
         match caps_type {
-            Self::CAPSTYPE_GENERAL => {
-                let general = GeneralCapabilitySet::decode(src)?;
-                Ok(Self::General(general))
-            }
+            // Self::CAPSTYPE_GENERAL => {
+            //     let general = GeneralCapabilitySet::decode(src)?;
+            //     Ok(Self::General(general))
+            // }
             _ => Err(invalid_message_err!(
                 "capabilitySetType",
                 "invalid clipboard capability set type"

and running a session (via teleport), which results in a log message like

SessionError(Error { context: "invalid payload", kind: Pdu(Error { context: "<ironrdp_cliprdr::pdu::capabilities::CapabilitySet as ironrdp_pdu::PduDecode>::decode", kind: InvalidMessage { field: "capabilitySetType", reason: "invalid clipboard capability set type" }, source: None }), source: None })

the key piece being context: "<ironrdp_cliprdr::pdu::capabilities::CapabilitySet as ironrdp_pdu::PduDecode>::decode"

Deletes any now-superfluous `const NAME: &'static str = "name";` definitions.

This gives a path to the function that is being called, which is useful
for debugging errors. It replaces the necessity of implementing
`const NAME: &'static str = "name";` in every function that wants to
use these macros (or coming up with a custom context).
@CBenoit
Copy link
Member

CBenoit commented Dec 4, 2023

I explored this approach in the past and decided to not use it at the time. More specifically, I used a similar trick to find the name of the type instead of the name of the function. A minor inconvenience is that the actual representation is more verbose, not guaranteed to be stable, and some tests may break from time to time because of that. IMO, this is not a big deal because it’s easy to regenerate these (we are using expect_test), and we typically don’t match on the context field of the error for control flow (that would not be a great idea anyway).

An alternative I was thinking about is:

/// A type with a static name.
pub trait StaticName {
    /// Name associated to this type.
    const NAME: &'static str;
}

/// A type from which a name can be retrieved.
pub trait GetName {
    /// Returns the name associated to this value.
    fn get_name(&self) -> &'static str;
}

impl<T: StaticName> GetName for T {
    fn get_name(&self) -> &'static str {
        T::NAME
    }
}

assert_obj_safe!(GetName);

/// Gets the name of this value.
pub fn get_name<T: GetName>(value: &T) -> &'static str {
    value.get_name()
}

macro_rules! impl_static_name {
    ($ty:ident) => {{
        impl StaticName for $ty {
            const NAME: &'static str = stringify!($ty);
        }
    }};
}

struct Example;

impl_static_name!(Example);

And this could be part of another macro that we should typically apply on most decodable / encodable types, so we would get it for "free" as well.

This also avoid copy-paste errors when using impl_static_name! because we stringify the type name along the trait implementation.

The benefit is that we get a shorter name, and we have the GetName trait which is object-safe and could be used for more dynamic diagnostics (e.g.: print the name of the PDUs returned by a static channel).

On the other hand, having the name of the function is not bad either.

I’m not yet settled on either one as of today. Thoughts?

@ibeckermayer
Copy link
Contributor Author

@CBenoit it may be worth clarifying precisely what we want to get from this context field (and these errors more generally). From the example that I gave, we appear to arguably have competing semantics currently:

SessionError(Error { context: "invalid payload", kind: Pdu(Error { context: "<ironrdp_cliprdr::pdu::capabilities::CapabilitySet as ironrdp_pdu::PduDecode>::decode", kind: InvalidMessage { field: "capabilitySetType", reason: "invalid clipboard capability set type" }, source: None }), source: None })

In the SessionError, the context is something like a reason/general error message, whereas in the Pdu error it's more like the name of the pdu where the error occurred (before these changes), and now it's roughly name-of-pdu + function. Granted one could argue that for different error types context can reasonably have different meanings, but I wonder if doing so is making things unnecessarily complicated.

Personally when I hit any error, I'm looking for

  1. A way to efficiently, accurately identify where in the code the error came from
  2. Information about the cause of the error
  3. A message to display to the user in the UI (in cases where that's relevant) which doesn't expose irrelevant or security-sensitive information.

From those axioms I would imagine a "base class" that contains that information, and any "subclasses" (like PduError/PduErrorKind) being ways to efficiently communicate 2, and/or as a means for programmatically handling errors further up the call stack based on different semantics. (This is just a first pass at a "theory of error types", having read no other thinking on the topic in depth, so take it with that grain of salt. I'm also not suggesting we redo the entire error system at this point, just trying to clarify some of the thought process for myself).

With the use of this function!() macro I'm essentially attempting to define these context fields as a means to achieve 1. I've been presuming that was the original intent, though feel free to correct me if I'm wrong.

A key advantage I see to function!() macro as compared to your alternative is that it gets the job of 1 done without needing to know much about the structure of the underlying error system. This can be useful as a caller of the library, because I can just start using these error types and macros without needing to implement any other macros for my classes. For example in Teleport we're using PduResult and returning a lot of custom_err! (e.g. here). Arguably that's an abuse of PduResult semantics, but point is that it makes it easier to get 1 consistently wherever we are in the codebase.

That's a long rambling take, I could say more but I want to hear your take on the proper meaning of context before rambling any further.

@CBenoit
Copy link
Member

CBenoit commented Dec 4, 2023

it may be worth clarifying precisely what we want to get from this context field (and these errors more generally).

Sure, I think clarification is needed too!

From the example that I gave, we appear to arguably have competing semantics currently:

[…]

In the SessionError, the context is something like a reason/general error message, whereas in the Pdu error it's more like the name of the pdu where the error occurred (before these changes), and now it's roughly name-of-pdu + function. Granted one could argue that for different error types context can reasonably have different meanings, but I wonder if doing so is making things unnecessarily complicated.

I would argue that the "invalid payload" in SessionError is not a good context string, because…

Personally when I hit any error, I'm looking for

  1. A way to efficiently, accurately identify where in the code the error came from
  2. Information about the cause of the error
  3. A message to display to the user in the UI (in cases where that's relevant) which doesn't expose irrelevant or security-sensitive information.

This is in line with my vision.

From those axioms I would imagine a "base class" that contains that information, and any "subclasses" (like PduError/PduErrorKind) being ways to efficiently communicate 2, and/or as a means for programmatically handling errors further up the call stack based on different semantics. (This is just a first pass at a "theory of error types", having read no other thinking on the topic in depth, so take it with that grain of salt. I'm also not suggesting we redo the entire error system at this point, just trying to clarify some of the thought process for myself).

Elaborating on what you said, here is how I imagined the error handling story in IronRDP: 1. is addressed by ironrdp_error::Error by holding the context field. 2. is addressed by the kind and/or source fields which contain the detailed information about the error. And 3. is addressed by matching on the kind and translating it into a user-facing error. We actually do that in ironrdp-web.

In addition to all that, ironrdp_error::Error is able to produce an error report, ironrdp_error::ErrorReport, which is walking the source errors and reporting all the details (including the various contexts attached to each error). This output is not exactly "user-facing", but should still be shown to the user so he can forward it to us and we are able to troubleshoot the problem, i.e., use it a bit like a stacktrace (addressing 1. and 2.). That’s how I’ve been troubleshooting a few bugs for IronRDP.

With the use of this function!() macro I'm essentially attempting to define these context fields as a means to achieve 1. I've been presuming that was the original intent, though feel free to correct me if I'm wrong.

This is the original intent. For PDUs, the context field is supposed to hold the name of the PDU being decoded or encoded, so it’s easy to find which part of the code is raising the error.

A key advantage I see to function!() macro as compared to your alternative is that it gets the job of 1 done without needing to know much about the structure of the underlying error system. This can be useful as a caller of the library, because I can just start using these error types and macros without needing to implement any other macros for my classes. For example in Teleport we're using PduResult and returning a lot of custom_err! (e.g. here). Arguably that's an abuse of PduResult semantics, but point is that it makes it easier to get 1 consistently wherever we are in the codebase.

However, PduErrorKind was not intended to be used pervasively this way.
(As you said, we are abusing it a bit.)
The intention was to define an *ErrorKind enum for the various parts of the application, which can be wrapped into a ironrdp_error::Error<*ErrorKind>. For channels, it could have been ChannelErrorKind, with ChannelError defined as ironrdp_error::Error<ChannelErrorKind>.
The side effect of this error kind being used pervasively is that we are now using the error macros at places where they’re not intended to be used to begin with, and we end up having no Self::NAME available for the context.

I’ve been wanting to revisit all that, but didn’t had the chance yet. I have a WIP branch with some experiments though. (That’s the branch where I originally experimented with type_name, although I didn’t keep that code as you can see.)

All that being said, I think I would like custom_err! macro to work anywhere in the codebase too, and the function! macro is likely one of the easiest way to achieve that. I think you convinced me to move forward with this approach. I may revisit things, but having it now wouldn’t create any hindrance. 🤔

Copy link
Member

@CBenoit CBenoit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving and enabling auto-merge

@CBenoit CBenoit enabled auto-merge (squash) December 4, 2023 22:25
@CBenoit CBenoit disabled auto-merge December 4, 2023 22:26
@CBenoit CBenoit enabled auto-merge (squash) December 4, 2023 22:26
@CBenoit CBenoit changed the title Adds a function! macro refactor: function! macro for error context Dec 4, 2023
Copy link

github-actions bot commented Dec 4, 2023

Coverage Report 🤖 ⚙️

Past:
Total lines: 26891
Covered lines: 16376 (60.90%)

New:
Total lines: 26898
Covered lines: 16379 (60.89%)

Diff: -0.00%

[this comment will be updated automatically]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants