Describe the bug
Unable to correctly write nested structs when the struct is non-nullable.
I've noticed this behaviour before, but couldn't reproduce it easily.
To Reproduce
Given the test case below (in parquet/src/arrow/arrow_writer.rs):
#[test]
fn arrow_writer_complex_mixed() {
    // define schema
    let offset_field = Field::new("offset", DataType::Int32, true);
    let partition_field = Field::new("partition", DataType::Int64, true);
    let topic_field = Field::new("topic", DataType::Utf8, true);
    let schema = Schema::new(vec![Field::new(
        "some_nested_object",
        DataType::Struct(vec![
            offset_field.clone(),
            partition_field.clone(),
            topic_field.clone(),
        ]),
        false, // NOTE: this being false results in the array not being written correctly
    )]);

    // create some data
    let offset = Int32Array::from(vec![1, 2, 3, 4, 5]);
    let partition = Int64Array::from(vec![Some(1), None, None, Some(4), Some(5)]);
    let topic = StringArray::from(vec![Some("A"), None, Some("A"), Some(""), None]);

    let some_nested_object = StructArray::from(vec![
        (offset_field, Arc::new(offset) as ArrayRef),
        (partition_field, Arc::new(partition) as ArrayRef),
        (topic_field, Arc::new(topic) as ArrayRef),
    ]);

    // build a record batch
    let batch = RecordBatch::try_new(
        Arc::new(schema),
        vec![Arc::new(some_nested_object)],
    )
    .unwrap();

    roundtrip("test_arrow_writer_complex_mixed.parquet", batch);
}
We get a failure:
thread 'arrow::arrow_writer::tests::arrow_writer_complex_mixed' panicked at 'assertion failed: `(left == right)`
left: `1`,
right: `0`', parquet/src/util/bit_util.rs:332:9
test arrow::arrow_writer::tests::arrow_writer_complex_mixed ... FAILED
When the struct is nullable, the file is written correctly.
Expected behavior
The batch should be written without errors.
Additional context
From inspecting the levels that are generated for the passing and failing scenarios, they look identical (https://www.diffchecker.com/89qWByeI). It looks like the bug is with how levels of non-null structs are generated.
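For comparison, here is a minimal sketch of what the definition levels for the `partition` column would be expected to look like under the standard Parquet rule that each nullable (optional) field on the path adds one to the maximum definition level. The function `definition_levels` is a hypothetical helper written for this illustration; it is not part of the parquet crate and does not reflect how the writer computes levels internally. It assumes the child field is nullable and that a non-nullable struct never has null slots.

```rust
/// Hypothetical helper (not a parquet crate API): computes definition levels
/// for a nullable child field nested directly inside one struct.
fn definition_levels(
    struct_nullable: bool,
    struct_valid: &[bool],
    child_valid: &[bool],
) -> Vec<i16> {
    // A nullable struct contributes one definition level; a non-nullable one contributes none.
    let struct_level: i16 = if struct_nullable { 1 } else { 0 };
    struct_valid
        .iter()
        .zip(child_valid.iter())
        .map(|(&s, &c)| {
            if struct_nullable && !s {
                0 // the struct slot itself is null
            } else if c {
                struct_level + 1 // the child value is present
            } else {
                struct_level // the struct is present, the child is null
            }
        })
        .collect()
}

fn main() {
    // Validity of the `partition` values from the test case above.
    let struct_valid = [true; 5];
    let partition_valid = [true, false, false, true, true];

    let nullable = definition_levels(true, &struct_valid, &partition_valid);
    let non_nullable = definition_levels(false, &struct_valid, &partition_valid);

    println!("nullable struct:     {:?}", nullable);
    println!("non-nullable struct: {:?}", non_nullable);
}
```

Under these assumptions the two cases should not produce the same levels: the nullable struct has a maximum definition level of 2 and the non-nullable struct has a maximum of 1. That is consistent with the suspicion that level generation for non-nullable structs is where the problem lies.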