Support Union types in StructBuilder #6349

Open
V0ldek opened this issue Sep 2, 2024 · 0 comments
Labels
enhancement Any new improvement worthy of a entry in the changelog

Comments

Contributor

V0ldek commented Sep 2, 2024

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I'm trying to use StructBuilder to build a struct in which one field has a Union type. This is currently unsupported; here is a small repro (crate version v52.2.0):

use arrow::{
    array::StructBuilder,
    datatypes::{DataType, Field, Fields, UnionFields, UnionMode},
};

fn main() {
    // Panics: from_fields cannot create a child builder for the Union field.
    let _builder = StructBuilder::from_fields(
        get_struct_fields(),
        3,
    );
}

// Struct schema: a string "name" and a union-typed "value".
fn get_struct_fields() -> Fields {
    Fields::from(vec![
        Field::new("name", DataType::Utf8, false),
        Field::new("value", get_union_type(), false),
    ])
}

// Sparse union of Int32 (type id 0) and Float32 (type id 1).
fn get_union_type() -> DataType {
    let fields = vec![
        Field::new("Integer", DataType::Int32, false),
        Field::new("Float", DataType::Float32, false),
    ];
    DataType::Union(UnionFields::new(vec![0, 1], fields), UnionMode::Sparse)
}

This results in a panic from builder/struct_builder.rs:278:14:

Data type Union([(0, Field { name: "Integer", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), (1, Field { name: "Float", data_type: Float32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} })], Sparse) is not currently supported

The particular layout of the union doesn't matter, nor does whether it's sparse or dense.

Describe the solution you'd like
I'd like to use StructBuilder with a Union field the same way it works with other field types. In particular, I should be able to retrieve a UnionBuilder for a field:

    let mut builder = StructBuilder::from_fields(
        get_struct_fields(),
        3,
    );
    let union_field = builder.field_builder::<UnionBuilder>(1);

Describe alternatives you've considered
I don't actually know how to work around this. It'd be appreciated if someone could show me how to build a struct array with a schema like this without using the builder, so I can unblock myself.

Additional context
As far as I understand, the main issue is that UnionBuilder does not implement the ArrayBuilder trait, which is required by the dynamic API and internals of StructBuilder. ArrayBuilder requires finish(&mut self) -> ArrayRef, whereas UnionBuilder only offers build(self) -> Result<UnionArray, ArrowError>, which consumes the builder AND can fail.

Changing self to &mut self shouldn't be too hard, but reconciling the fallible Result return with the infallible finish signature is harder. I'm not sure what the implementation path would be here.
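
To make the mismatch concrete, here is a minimal std-only sketch of one possible path. Everything in it is a hypothetical stand-in and not arrow-rs code: ToyArray, ToyArrayBuilder and ToyUnionBuilder only mimic the shapes of ArrayRef, ArrayBuilder::finish and UnionBuilder::build, and the "negative value" failure is invented so that build has something to reject. The sketch assumes the builder can be reset to an empty default state and that panicking on a build error is acceptable, which may well not be what the maintainers want.

```rust
use std::mem;

// Hypothetical stand-in for ArrayRef; not an arrow-rs type.
type ToyArray = Vec<i32>;

/// Mimics the shape of `ArrayBuilder::finish`: borrows mutably and cannot fail.
trait ToyArrayBuilder {
    fn finish(&mut self) -> ToyArray;
}

/// Mimics the shape of `UnionBuilder`: `build` consumes `self` and can fail.
#[derive(Default)]
struct ToyUnionBuilder {
    values: Vec<i32>,
}

impl ToyUnionBuilder {
    fn append(&mut self, v: i32) {
        self.values.push(v);
    }

    /// Consuming, fallible build. The failure condition is invented for the sketch.
    fn build(self) -> Result<ToyArray, String> {
        if self.values.iter().any(|v| *v < 0) {
            return Err("negative value".to_string());
        }
        Ok(self.values)
    }
}

impl ToyArrayBuilder for ToyUnionBuilder {
    fn finish(&mut self) -> ToyArray {
        // `mem::take` leaves a fresh default builder behind, so the old
        // state can be moved out and consumed even though we only hold `&mut self`.
        let taken = mem::take(self);
        // The infallible signature leaves no way to return the error,
        // so this sketch panics on failure.
        taken.build().expect("toy union builder failed to build")
    }
}

fn main() {
    let mut builder = ToyUnionBuilder::default();
    builder.append(1);
    builder.append(2);

    let array = builder.finish();
    println!("{:?}", array); // prints [1, 2]

    // The builder was reset by `finish` and can be reused.
    assert!(builder.finish().is_empty());
}
```

The mem::take step covers the self versus &mut self half of the problem. The expect call is where the Result question shows up: the alternative would be to validate at append time so that building can no longer fail, but I don't know whether that is feasible for the real UnionBuilder.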

@V0ldek V0ldek added the enhancement Any new improvement worthy of a entry in the changelog label Sep 2, 2024