Conversation

@nathaniel-d-ef
Contributor

Which issue does this PR close?

Rationale for this change

The arrow-avro crate currently uses ArrowError throughout, which lacks the precision of the dedicated error types in other crates in the project, such as Parquet.

What changes are included in this PR?

  • A new AvroError enum
  • Application of AvroError on all internal methods where ArrowError was previously used. Errors on pub methods at the API boundary remain as ArrowError. A Result type alias on AvroError allows Result<T, AvroError> to be written as Result<T>.
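
The split described above (crate-local error internally, ArrowError at the public boundary) could look roughly like the following. This is a minimal sketch, not the PR's code: `ArrowError` here is a one-variant stand-in for `arrow_schema::ArrowError` so the snippet compiles on its own, and the `General` variant plus the `read_byte` and `decode_first` helpers are hypothetical names chosen for illustration.

```rust
use std::fmt;

// Stand-in for arrow_schema::ArrowError so the sketch is self-contained.
#[derive(Debug)]
pub enum ArrowError {
    AvroError(String),
}

// Hypothetical crate-local error; the variant set is an assumption.
#[derive(Debug)]
pub enum AvroError {
    General(String),
    EOF(String),
}

impl fmt::Display for AvroError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AvroError::General(msg) => write!(f, "Avro error: {msg}"),
            AvroError::EOF(msg) => write!(f, "EOF: {msg}"),
        }
    }
}

impl std::error::Error for AvroError {}

// Lets internal code write `Result<T>` instead of `Result<T, AvroError>`.
pub type Result<T, E = AvroError> = std::result::Result<T, E>;

// Conversion used by `?` when crossing the public API boundary.
impl From<AvroError> for ArrowError {
    fn from(e: AvroError) -> Self {
        ArrowError::AvroError(e.to_string())
    }
}

// Internal helper: returns the crate-local error type.
fn read_byte(buf: &[u8]) -> Result<u8> {
    buf.first()
        .copied()
        .ok_or_else(|| AvroError::EOF("expected one byte".to_string()))
}

// Public entry point: the signature still exposes ArrowError.
pub fn decode_first(buf: &[u8]) -> Result<u8, ArrowError> {
    Ok(read_byte(buf)?)
}
```

With this shape, callers of `decode_first` keep seeing `ArrowError`, while internal helpers such as `read_byte` can match on specific `AvroError` variants.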

Are these changes tested?

No new functionality has been introduced; all existing tests are passing.

Are there any user-facing changes?

There shouldn't be - the API signatures remain the same.

@github-actions github-actions bot added arrow Changes to the arrow crate arrow-avro arrow-avro crate labels Oct 31, 2025
Contributor

@alamb alamb left a comment

Thanks @nathaniel-d-ef -- I think this makes sense to me, but it is a breaking API change and thus we will have to wait until the next major release in a few months:

FYI @jecsand838 and @mbrobbel

EOF(String),
/// Arrow error.
/// Returned when reading into arrow or writing from arrow.
ArrowError(String),
Contributor

Could this instead keep the actual Arrow error rather than converting directly into a string?

Suggested change
ArrowError(String),
ArrowError(Box<ArrowError>),

I realize ParquetError::ArrowError does the same thing -- but I think we might want to change that Parquet error as well
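
To illustrate what the boxed form buys, here is a small sketch comparing the two shapes side by side. It assumes a stand-in `ArrowError` with a single `ComputeError` variant; the `ArrowErrorString` variant and the `original_arrow_error` helper are hypothetical and exist only to make the comparison testable.

```rust
use std::error::Error;
use std::fmt;

// Stand-in for arrow_schema::ArrowError, reduced to one variant.
#[derive(Debug)]
pub enum ArrowError {
    ComputeError(String),
}

impl fmt::Display for ArrowError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ArrowError::ComputeError(msg) => write!(f, "Compute error: {msg}"),
        }
    }
}

impl Error for ArrowError {}

#[derive(Debug)]
pub enum AvroError {
    // String form: only the rendered message survives.
    ArrowErrorString(String),
    // Boxed form: the original value stays reachable.
    ArrowError(Box<ArrowError>),
}

impl fmt::Display for AvroError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AvroError::ArrowErrorString(msg) => write!(f, "Arrow: {msg}"),
            AvroError::ArrowError(e) => write!(f, "Arrow: {e}"),
        }
    }
}

impl Error for AvroError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            AvroError::ArrowError(e) => Some(e.as_ref()),
            AvroError::ArrowErrorString(_) => None,
        }
    }
}

// Hypothetical helper: recovers the wrapped ArrowError if it is still attached.
pub fn original_arrow_error(e: &AvroError) -> Option<&ArrowError> {
    e.source()?.downcast_ref::<ArrowError>()
}
```

Both variants print the same message, but only the boxed one lets a caller get back to the original `ArrowError` and match on its variant.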

}
}

impl From<cell::BorrowMutError> for AvroError {
Contributor

This is a fairly specific conversion -- maybe it would be simpler to annotate the locations where this happens with map_err(|e| AvroError::External(Box::new(e))) 🤔

Contributor Author

Whoops, my mistake. This is unnecessary; I removed it.

}
return out
.write_all(&src_be[extra..])
.map_err(|e| ArrowError::IoError(format!("write decimal fixed: {e}"), e));
Contributor

This change does appear to lose some error context. Would it be better to keep the information that this came from write decimal fixed?

Contributor Author

Yes, that's a good point. I adjusted this to use the General error and pass the contextual info.
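
For readers following along, keeping the context through a General-style variant might look something like the sketch below. The exact shape of the PR's `General` variant is not shown in this thread, so `AvroError::General(String)` is an assumption, and `write_decimal_fixed` and `FailingWriter` are hypothetical names used only to exercise the error path.

```rust
use std::fmt;
use std::io::{self, Write};

#[derive(Debug)]
pub enum AvroError {
    // Assumed catch-all variant that carries a message.
    General(String),
}

impl fmt::Display for AvroError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AvroError::General(msg) => write!(f, "Avro error: {msg}"),
        }
    }
}

impl std::error::Error for AvroError {}

// Writes the bytes and, on failure, records which step failed.
pub fn write_decimal_fixed<W: Write>(out: &mut W, bytes: &[u8]) -> Result<(), AvroError> {
    out.write_all(bytes)
        .map_err(|e| AvroError::General(format!("write decimal fixed: {e}")))
}

// A writer that always fails, used to trigger the error branch.
pub struct FailingWriter;

impl Write for FailingWriter {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::Other, "disk full"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

The "write decimal fixed" prefix survives in the message, which is the context the review comment asked to keep. The underlying `io::Error` is flattened into text in this version.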

@alamb alamb added api-change Changes to the arrow API next-major-release the PR has API changes and it waiting on the next major version labels Oct 31, 2025
@jecsand838
Contributor

jecsand838 commented Oct 31, 2025

@nathaniel-d-ef I'll do a deeper review tonight, but if we plan to go this direction, should we also remove the AvroError variant from https://github.com/apache/arrow-rs/blob/main/arrow-schema/src/error.rs?

I never got around to wiring that up before public release. But the original intent was to align arrow-avro with arrow-csv and arrow-json.

CC: @alamb

Comment on lines 128 to 132
impl From<ArrowError> for AvroError {
fn from(e: ArrowError) -> AvroError {
AvroError::External(Box::new(e))
}
}
Contributor

@nathaniel-d-ef I'd consider doing something like this:

Suggested change
impl From<ArrowError> for AvroError {
fn from(e: ArrowError) -> AvroError {
AvroError::External(Box::new(e))
}
}
pub enum AvroError {
// ...
ArrowError(Box<ArrowError>),
// ...
}
impl From<ArrowError> for AvroError {
fn from(e: ArrowError) -> Self {
AvroError::ArrowError(Box::new(e))
}
}
impl std::error::Error for AvroError {
fn source(&self) -> Option<&(dyn Error + 'static)> {
match self {
AvroError::External(e) => Some(e.as_ref()),
AvroError::ArrowError(e) => Some(e.as_ref()),
_ => None,
}
}
}

Comment on lines 149 to 153
impl From<AvroError> for ArrowError {
fn from(p: AvroError) -> Self {
Self::AvroError(format!("{p}"))
}
}
Contributor

Then down here, you can also do this:

Suggested change
impl From<AvroError> for ArrowError {
fn from(p: AvroError) -> Self {
Self::AvroError(format!("{p}"))
}
}
impl From<AvroError> for ArrowError {
fn from(e: AvroError) -> Self {
match e {
AvroError::External(inner) => ArrowError::from_external_error(inner),
AvroError::ArrowError(inner) => ArrowError::from_external_error(inner),
other => ArrowError::AvroError(other.to_string()),
}
}
}
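
A self-contained sketch of this boundary conversion is below, for anyone who wants to try it outside the crate. `ArrowError` here is a two-variant stand-in for `arrow_schema::ArrowError`, with a `from_external_error` constructor mirroring the one named in the suggestion. The `General` variant and the `BoxError` alias are assumptions for illustration.

```rust
use std::error::Error;
use std::fmt;

type BoxError = Box<dyn Error + Send + Sync>;

// Stand-in for arrow_schema::ArrowError with the two variants used here.
#[derive(Debug)]
pub enum ArrowError {
    AvroError(String),
    ExternalError(BoxError),
}

impl ArrowError {
    // Mirrors the constructor named in the suggestion.
    pub fn from_external_error(e: BoxError) -> Self {
        ArrowError::ExternalError(e)
    }
}

impl fmt::Display for ArrowError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ArrowError::AvroError(msg) => write!(f, "Avro error: {msg}"),
            ArrowError::ExternalError(e) => write!(f, "External error: {e}"),
        }
    }
}

impl Error for ArrowError {}

#[derive(Debug)]
pub enum AvroError {
    General(String),
    ArrowError(Box<ArrowError>),
    External(BoxError),
}

impl fmt::Display for AvroError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AvroError::General(msg) => write!(f, "Avro error: {msg}"),
            AvroError::ArrowError(e) => write!(f, "Arrow: {e}"),
            AvroError::External(e) => write!(f, "External: {e}"),
        }
    }
}

impl Error for AvroError {}

impl From<AvroError> for ArrowError {
    fn from(e: AvroError) -> Self {
        match e {
            // Errors that carry a source keep it instead of being stringified.
            AvroError::External(inner) => ArrowError::from_external_error(inner),
            AvroError::ArrowError(inner) => ArrowError::from_external_error(inner),
            // Everything else is flattened to its message.
            other => ArrowError::AvroError(other.to_string()),
        }
    }
}
```

One design point worth weighing: the `AvroError::ArrowError` arm wraps the original error as an external error, so callers would need to downcast to reach it. Returning `*inner` directly would hand back the original `ArrowError` unchanged, if that is preferred.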

@mbrobbel mbrobbel added this to the 58.0.0 milestone Nov 3, 2025
@nathaniel-d-ef
Contributor Author

> @nathaniel-d-ef I'll do a deeper review tonight, but if we plan to go this direction, should we also remove the AvroError variant from https://github.com/apache/arrow-rs/blob/main/arrow-schema/src/error.rs?
>
> I never got around to wiring that up before public release. But the original intent was to align arrow-avro with arrow-csv and arrow-json.
>
> CC: @alamb

I see what you mean. I don't feel strongly either way. On the whole we're probably unlikely to use all of the variants here, so if it's preferable to roll this back a bit and implement the ArrowError::AvroError instead, I'm happy to go that direction. I mainly didn't want it stuck somewhere in the middle. Your comment on the issue is a good one; perhaps Parquet is more of the odd one out.

Labels

api-change Changes to the arrow API arrow Changes to the arrow crate arrow-avro arrow-avro crate next-major-release the PR has API changes and it waiting on the next major version

Development

Successfully merging this pull request may close these issues.

AvroError enum for arrow-avro crate

4 participants