Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error type handling #543

Closed
bruceg opened this issue Jun 26, 2019 · 24 comments
Closed

Improve error type handling #543

bruceg opened this issue Jun 26, 2019 · 24 comments
Labels
type: tech debt A code change that does not add user value.

Comments

@bruceg
Copy link
Member

bruceg commented Jun 26, 2019

Many of Vector's methods use a Result<_, String> return type. This is adequate for textual errors, but it makes it impossible for users to pattern match on the error type as well as dropping any cause information encapsulated by some error types. We should switch to proper error types (similar to std::io::Error) probably aided by the failure crate.

References:

@binarylogic binarylogic added the type: tech debt A code change that does not add user value. label Jun 26, 2019
@LucioFranco
Copy link
Contributor

We most likely want to use Box<dyn std::error::Error + Send + Sync + 'static> as the error type throughout the whole crate. This would allow us to downcast_ref. Leaf level errors would provide an enum and implement std::error::Error.

@bruceg
Copy link
Member Author

bruceg commented Jun 28, 2019

Would using a Box for every error return be too expensive?

I suggested the failure crate as it helps wrap many of the conversion issues and provides custom derives for error (Fail) types, as well as catch the back trace. This may be overkill, but it's a light library and we're already pulling it through one of the dependencies.

FTR I'm not stuck on using failure, it would just be good to avoid rolling our own solution given how many other quality solutions exist to either derive complete error types or handle error return conversions and encapsulation.

@LucioFranco
Copy link
Contributor

It shouldn't be the box should only happen at the leaf level and get passed up the tree. Generally, you're able to use bounds like SinkError: Into<Box<std::Error...> where when this is already a box the into is the identity transform and thus does nothing. I believe failure is actually outdated at this point. Pretty much all of rusts error handling will move to this box style in the future. With leaf level errors being an enum that impls std::error::Error.

@lukesteensen
Copy link
Member

This is definitely something we've punted on for a little too long. That said, the error handling situation is a little in flux right now (see here and here) and I'm not sure there's an obviously correct direction for us to take.

My understanding is that failure has been "downgraded" to something of an experiment, so I'm not sure we'd want to go that route. If we do bring in a crate, it seems like snafu is probably the best maintained. I think the main benefits of an external crate seem to be convenience of defining custom error types and the ability to easily attach backtraces and context.

As a first step we should probably pick a component or two currently using String errors and work through converting them to real error types as simply as possible. We use errors differently in different components (e.g. collected and presented to the user during topology building, quickly caught and logged with in each task for source/transforms/sinks, etc), so getting a representative sample would be good. Then we can try a crate or two to see what feels best for our needs. I imagine we won't need much more than a convenient way to define our types and implement std::error::Error.

@bruceg
Copy link
Member Author

bruceg commented Aug 16, 2019

So now, one error return has been converted to an actual error type. Where should this issue go from here?

@lukesteensen
Copy link
Member

I would say we should take a quick inventory of components still using String errors and either add a checklist to this issue or break them out into individual issues so we can track overall progress (probably just a checklist here).

@bruceg
Copy link
Member Author

bruceg commented Aug 20, 2019

A quick grep shows at least the following modules:

  • src/buffers/disk.rs
  • src/region.rs
  • src/sinks/aws_cloudwatch_logs/mod.rs
  • src/sinks/aws_cloudwatch_metrics.rs
  • src/sinks/aws_kinesis_streams.rs
  • src/sinks/aws_s3.rs
  • src/sinks/blackhole.rs
  • src/sinks/clickhouse.rs
  • src/sinks/console.rs
  • src/sinks/elasticsearch.rs
  • src/sinks/http.rs
  • src/sinks/kafka.rs
  • src/sinks/mod.rs
  • src/sinks/prometheus.rs
  • src/sinks/splunk_hec.rs
  • src/sinks/tcp.rs
  • src/sinks/util/http.rs
  • src/sinks/vector.rs
  • src/sources/file.rs
  • src/sources/statsd/mod.rs
  • src/sources/statsd/parser.rs
  • src/sources/stdin.rs
  • src/sources/syslog.rs
  • src/sources/tcp.rs
  • src/sources/udp.rs
  • src/sources/util/tcp.rs
  • src/sources/vector.rs
  • src/topology/config/mod.rs
  • src/transforms/add_fields.rs
  • src/transforms/coercer.rs
  • src/transforms/field_filter.rs
  • src/transforms/grok_parser.rs
  • src/transforms/json_parser.rs
  • src/transforms/log_to_metric.rs
  • src/transforms/lua.rs
  • src/transforms/regex_parser.rs
  • src/transforms/remove_fields.rs
  • src/transforms/sampler.rs
  • src/transforms/tokenizer.rs
  • tests/crash.rs

@bruceg
Copy link
Member Author

bruceg commented Aug 28, 2019

Do we want this as separate PR for each refactoring (ex all the transforms have to be done at once) or one big PR when complete?

@LucioFranco
Copy link
Contributor

@bruceg one commit per sink/source/transform works and then bunch them into their own PR.

@bruceg
Copy link
Member Author

bruceg commented Aug 28, 2019

I've done all the (for example) transforms as one commit. Are you saying you would rather break them into separate commits per file (which won't actually compile until the last commit) or am I misunderstanding?

@LucioFranco
Copy link
Contributor

@bruceg honestly, whatever is easier for you, I'm happy to review it either way.

@lukesteensen
Copy link
Member

Yeah, please feel free to break it up however is easiest for you.

@bruceg
Copy link
Member Author

bruceg commented Sep 17, 2019

AFAICT there is one last module that uses String error types, and that is the configuration parsing framework.

@lukesteensen
Copy link
Member

Do you mean just here? That one is so small and isolated I don't think it's worth the effort to do anything fancier.

@bruceg
Copy link
Member Author

bruceg commented Sep 19, 2019

No, I mean the whole stack starting (roughly) here. There are a lot of routines that return Result<..., Vec<String>> that I didn't catch before.

@lukesteensen
Copy link
Member

I see, those evaded my grep as well. I wouldn't say they're high priority, and they might be a little different since we're collecting multiple errors. I'd say we can take a quick stab at them, but be ready to punt if it becomes much of a burden. There's some value to be had there, but not a ton.

@LucioFranco
Copy link
Contributor

We can also change the signature to Result<..., Vec<crate::Error>>.

@binarylogic
Copy link
Contributor

@bruceg just checking in on this. Do you mind letting us know what's left? So we can document it here. If nothing is left, feel free to close.

@bruceg
Copy link
Member Author

bruceg commented Oct 7, 2019

The whole configuration parsing stack still uses String error returns, mostly wrapped into Vec<String>. I have started working on this but pushed it aside to work on higher priority issues (TLS mostly). It was turning out to be less of a straightforward conversion that the sinks and sources were.

@JeanMertz
Copy link
Contributor

@lukesteensen @bruceg would it make sense to introduce a custom ConfigError(Vec<Box<dyn Error>>) struct which impl's Error and Display and allows pushing more errors onto its list?

That way, we can still have all config-related functions return the default boxed error trait object, but we can either a) downcast at the leaf to capture all individual errors, or b) have the Display impl print the individual errors however we want them to be represented.

@bruceg
Copy link
Member Author

bruceg commented Nov 4, 2020

I think the preferred path would be to introduce a enum ConfigError or equivalent and turn all those returns into Result<…, Vec<ConfigError>>. At some layer, they may have to be boxed, but at the point they are emitted they can be enumerated and structured. Wrapping the error vector in a newtype as you describe would be a useful additional abstraction, but since they are all config errors, they can stay enumerated there.

@jszwedko
Copy link
Member

@lukesteensen @bruceg is this still relevant? I think we do use Box<dyn std::error::Error + Send + Sync + 'static> in most places now?

@lukesteensen
Copy link
Member

I'm not sure that it's very high priority, but there is definitely still work that we could do to improve our error handling story. This initial pass at this used snafu, but there have been quite a few uses of anyhow/thiserror that have crept into the codebase over time, as well as a few areas where I think we're still using String as our error type. There are also some newer crates like color-eyre that have some nice integrations with tracing spans that could be useful for us. Overall, I think it would definitely be of some value to review what we're doing and choose one consistent pattern.

@jszwedko
Copy link
Member

jszwedko commented May 4, 2023

Closing this since we wouldn't pick it up for its own sake but only if we needed to better handle errors for some other reason.

@jszwedko jszwedko closed this as completed May 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: tech debt A code change that does not add user value.
Projects
None yet
Development

No branches or pull requests

6 participants