If a dataset contains a nested field with associated metadata, writing the dataset is successful, but reading it can lead to errors like the one shown below:
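This error occurs when decoding a struct, as Arrow's StructArray::try_new method is invoked. In the current implementation of arrow-rs, StructArray::try_new internally compares the data type of each child array with the data type declared by its field, and for nested types that comparison includes the metadata of the nested fields. If the data types are otherwise the same but the metadata differs, an error is triggered: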
if f.data_type() != a.data_type() {
    return Err(ArrowError::InvalidArgumentError(format!(
        "Incorrect datatype for StructArray field {:?}, expected {} got {}",
        f.name(),
        f.data_type(),
        a.data_type()
    )));
}
I think this error may originate in arrow-rs, which could potentially use the equals_datatype API instead of != to compare data types, though I’m not entirely certain. Since this is how arrow-rs currently compares data types, it's unclear whether Lance could instead change how it handles field metadata to avoid the issue.
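To make the difference between the two comparisons concrete, here is a minimal, self-contained sketch. It does not use arrow-rs: `DataType`, `Field`, `equals_ignoring_metadata`, `struct_with_child_metadata`, and `check_field` are all simplified stand-ins invented for illustration, and the real `equals_datatype` may differ in details (for example in how it treats field names and nullability). The point is only that a derived `!=` sees nested field metadata, while a metadata-insensitive comparison does not.

```rust
use std::collections::HashMap;

// Simplified stand-ins for illustration only; these are NOT the arrow-rs types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    metadata: HashMap<String, String>,
}

impl DataType {
    // Hypothetical comparison that recurses into struct children and
    // deliberately skips their metadata (assumption: names must still match).
    fn equals_ignoring_metadata(&self, other: &DataType) -> bool {
        match (self, other) {
            (DataType::Struct(a), DataType::Struct(b)) => {
                a.len() == b.len()
                    && a.iter().zip(b).all(|(x, y)| {
                        x.name == y.name
                            && x.data_type.equals_ignoring_metadata(&y.data_type)
                    })
            }
            // Non-nested types carry no field metadata, so plain equality is enough.
            _ => self == other,
        }
    }
}

// Builds Struct<{ x: Int32 }> whose child field carries the given metadata.
fn struct_with_child_metadata(pairs: &[(&str, &str)]) -> DataType {
    let metadata: HashMap<String, String> = pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();
    DataType::Struct(vec![Field {
        name: "x".to_string(),
        data_type: DataType::Int32,
        metadata,
    }])
}

// Same shape as the check quoted above; `strict` selects the derived `==`
// (metadata-sensitive) instead of the metadata-insensitive comparison.
fn check_field(expected: &DataType, actual: &DataType, strict: bool) -> Result<(), String> {
    let matches = if strict {
        expected == actual
    } else {
        expected.equals_ignoring_metadata(actual)
    };
    if matches {
        Ok(())
    } else {
        Err(format!(
            "Incorrect datatype for StructArray field, expected {:?} got {:?}",
            expected, actual
        ))
    }
}
```

Calling `check_field` with `strict = true` on two struct types that differ only in the child field's metadata returns an `Err`, which is the situation described in this issue; with `strict = false` the same pair is accepted, while a genuinely different type is still rejected.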
I submitted a PR (#2949) that includes a test case demonstrating this issue and a quick fix. However, I believe the current solution is not ideal, as it allows the dataset to be read but loses the nested field metadata afterward. It would be great if we could find a better solution for this case.
I've added a PR that should keep the nested field metadata. I moved your test to Python, as it's a bit easier to maintain there, but the Rust test was passing before I did.