Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Throw parse exceptions on schema get errors for SchemaRegistryBasedAvroBytesDecoder #12080

Merged
merged 2 commits into from
Jan 13, 2022

Conversation

jon-wei
Copy link
Contributor

@jon-wei jon-wei commented Dec 18, 2021

This PR adjusts SchemaRegistryBasedAvroBytesDecoder to throw ParseException instead of RE when failure to retrieve a schema occurs.

Throwing an RE can be problematic when this decoder is used in streaming ingestion: if a corrupt record enters the stream with an invalid schema ID, the user cannot easily recover from this as the streaming ingestion tasks will continue to fail and attempt to read the corrupt record.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@jon-wei jon-wei changed the title Add option to throw parse exceptions on schema get errors for SchemaRegistryBasedAvroBytesDecoder Throw parse exceptions on schema get errors for SchemaRegistryBasedAvroBytesDecoder Dec 21, 2021
Copy link
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this change makes sense 👍

my intentions were good when i made this change originally - a bad configuration non-intuitively seemed to run tasks that just couldn't process the data and threw everything out, so if the configuration would never work it made sense to fail fast.

However, I didn't consider the case of a bad record, where throwing a RE means there is no way to process such streams without skipping the offset past such a record, while a parse exception handles this case nicely.

@jon-wei jon-wei merged commit 74c876e into apache:master Jan 13, 2022
@abhishekagarwal87 abhishekagarwal87 added this to the 0.23.0 milestone May 11, 2022
sachinsagare pushed a commit to sachinsagare/druid that referenced this pull request Nov 3, 2022
…roBytesDecoder (apache#12080)

* Add option to throw parse exceptions on schema get errors for SchemaRegistryBasedAvroBytesDecoder

* Remove option

(cherry picked from commit 74c876e)
gianm added a commit to gianm/druid that referenced this pull request Nov 21, 2022
…stry.

The change in apache#12080 lost the original exception context. This patch
adds it back.
gianm added a commit to gianm/druid that referenced this pull request Nov 21, 2022
…stry.

The change in apache#12080 lost the original exception context. This patch
adds it back.
gianm added a commit that referenced this pull request Nov 22, 2022
…stry. (#13403)

* Attach IO error to parse error when we can't contact Avro schema registry.

The change in #12080 lost the original exception context. This patch
adds it back.

* Add hamcrest-core.

* Fix format string.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants