[Ruby] Unexpected behavior when building rows with an empty list #44742

Closed
fpacanowski opened this issue Nov 15, 2024 · 1 comment

@fpacanowski

Describe the bug, including details regarding any error messages, version, and platform.

The issue occurs when the schema defines a list of structs. Here's a minimal example:

require 'arrow'
require 'parquet'

schema = Arrow::Schema.new(
  [
   Arrow::Field.new("structs", Arrow::ListDataType.new(
     Arrow::StructDataType.new([
       Arrow::Field.new("foo", :int64),
       Arrow::Field.new("bar", :int64)
     ])
   ))
 ]
)

# This works.
table = Arrow::RecordBatchBuilder.build(schema, [
  { structs: [{foo: 1, bar: 2}, {foo: 3, bar: 4}] },
  { structs: [{foo: 5, bar: 6}] }
]).to_table
table.save('file.parquet')

# This errors out.
table = Arrow::RecordBatchBuilder.build(schema, [
  { structs: [] },
  { structs: [] },
]).to_table
table.save('file.parquet')

I expected the second invocation to produce a table with two rows, each holding an empty list in the structs column. Instead, I got the following error:

/home/filip/.rvm/gems/ruby-3.3.0/gems/gobject-introspection-4.2.4/lib/gobject-introspection/loader.rb:715:in `invoke': [parquet][arrow][file-writer][write-table]: Invalid: Column 0: In chunk 0: Invalid: List child array invalid: Invalid: Struct child array #0 has length smaller than expected for struct array (0 < 2) (Arrow::Error::Invalid)
	from /home/filip/.rvm/gems/ruby-3.3.0/gems/gobject-introspection-4.2.4/lib/gobject-introspection/loader.rb:715:in `invoke'
	from /home/filip/.rvm/gems/ruby-3.3.0/gems/gobject-introspection-4.2.4/lib/gobject-introspection/loader.rb:583:in `write_table'
	from /home/filip/.rvm/gems/ruby-3.3.0/gems/red-parquet-17.0.0/lib/parquet/arrow-table-savable.rb:41:in `block (2 levels) in save_as_parquet'
	from /home/filip/.rvm/gems/ruby-3.3.0/gems/red-arrow-17.0.0/lib/arrow/block-closable.rb:25:in `open'
	from /home/filip/.rvm/gems/ruby-3.3.0/gems/red-parquet-17.0.0/lib/parquet/arrow-table-savable.rb:38:in `block in save_as_parquet'
	from /home/filip/.rvm/gems/ruby-3.3.0/gems/red-arrow-17.0.0/lib/arrow/block-closable.rb:25:in `open'
	from /home/filip/.rvm/gems/ruby-3.3.0/gems/red-arrow-17.0.0/lib/arrow/table-saver.rb:115:in `open_raw_output_stream'
	from /home/filip/.rvm/gems/ruby-3.3.0/gems/red-parquet-17.0.0/lib/parquet/arrow-table-savable.rb:37:in `save_as_parquet'
	from /home/filip/.rvm/gems/ruby-3.3.0/gems/red-arrow-17.0.0/lib/arrow/table-saver.rb:77:in `save_to_file'
	from /home/filip/.rvm/gems/ruby-3.3.0/gems/red-arrow-17.0.0/lib/arrow/table-saver.rb:53:in `save'
	from /home/filip/.rvm/gems/ruby-3.3.0/gems/red-arrow-17.0.0/lib/arrow/table.rb:447:in `save'
	from repro.rb:27:in `<main>'

This is running on version 17.0.0 of red-arrow and red-parquet.
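
To make the error message easier to read, here is a rough pure-Ruby sketch of the layout invariant it seems to describe: every child of a struct array must be at least as long as the struct array itself. `StructArrayModel`, `ListArrayModel`, and `layout_errors` are hypothetical names made up for illustration and are not part of red-arrow; the offsets in the `reported` case are an assumption, since the error only reveals the struct length (2) and the child length (0).

```ruby
# Simplified, hypothetical model of a list<struct<foo, bar>> column.
# These classes are illustrative only; they are not part of red-arrow.
StructArrayModel = Struct.new(:row_count, :children)
ListArrayModel = Struct.new(:offsets, :items)

# Returns an array of messages describing violated layout invariants.
def layout_errors(list)
  errors = []
  struct = list.items
  # The last offset says how many struct rows the lists reference.
  if list.offsets.last > struct.row_count
    errors << "List offsets reference #{list.offsets.last} values " \
              "but struct array has #{struct.row_count}"
  end
  # Every child column must cover every row of the struct array.
  struct.children.each_with_index do |(_name, child), index|
    next if child.length >= struct.row_count
    errors << "Struct child array ##{index} has length smaller than " \
              "expected for struct array (#{child.length} < #{struct.row_count})"
  end
  errors
end

# Two empty lists: offsets stay at 0 and the struct array has no rows.
expected = ListArrayModel.new(
  [0, 0, 0],
  StructArrayModel.new(0, { foo: [], bar: [] })
)

# What the error message implies was built instead: the struct array
# claims two rows, but its children never received any values.
# (The offsets here are assumed, not taken from the error.)
reported = ListArrayModel.new(
  [0, 0, 0],
  StructArrayModel.new(2, { foo: [], bar: [] })
)

puts layout_errors(expected).inspect
puts layout_errors(reported)
```

In this model the `expected` layout passes, while the `reported` layout trips the child-length check with the same `(0 < 2)` comparison seen in the trace above.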

Component(s)

Ruby

kou changed the title from "Unexpected behavior when building rows with an empty list" to "[Ruby] Unexpected behavior when building rows with an empty list" on Nov 16, 2024
kou added a commit to kou/arrow that referenced this issue Nov 18, 2024
kou added a commit that referenced this issue Nov 19, 2024
#44763)

### Rationale for this change

This code adds a list value, but no struct value is added:

```ruby

require "arrow"

schema = Arrow::Schema.new(
  [
   Arrow::Field.new("structs", Arrow::ListDataType.new(
     Arrow::StructDataType.new([
       Arrow::Field.new("foo", :int64),
       Arrow::Field.new("bar", :int64)
     ])
   ))
 ]
)

Arrow::RecordBatchBuilder.build(schema, [{structs: []}])
```

### What changes are included in this PR?

Don't add a list value.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.
* GitHub Issue: #44742

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
kou added this to the 19.0.0 milestone Nov 19, 2024
@kou
Member

kou commented Nov 19, 2024

Issue resolved by pull request 44763
#44763

kou closed this as completed Nov 19, 2024