Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-4539: [Java] Fix child vector count for lists. #3625

Merged
merged 2 commits into from
Feb 13, 2019

Conversation

praveenbingo
Copy link
Contributor

  • Child vector count was not set correctly for lists. Fixed to use the right count.

- Child vector count was not set correctly for lists. Fixed to use the right count.
@praveenbingo
Copy link
Contributor Author

@siddharthteotia - could you please review..

@@ -0,0 +1,65 @@
package org.apache.arrow.vector;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This files needs an Apache license header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xhochy yup missed it..added them now.

@codecov-io
Copy link

Codecov Report

Merging #3625 into master will increase coverage by 1.23%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #3625      +/-   ##
=========================================
+ Coverage   87.76%     89%   +1.23%     
=========================================
  Files         673     447     -226     
  Lines       82839   53792   -29047     
  Branches     1069       0    -1069     
=========================================
- Hits        72704   47877   -24827     
+ Misses      10020    5915    -4105     
+ Partials      115       0     -115
Impacted Files Coverage Δ
cpp/src/arrow/csv/reader.h 0% <0%> (-100%) ⬇️
cpp/src/arrow/array/builder_union.h 0% <0%> (-100%) ⬇️
cpp/src/arrow/array/builder_union.cc 4.16% <0%> (-95.84%) ⬇️
cpp/src/arrow/csv/reader.cc 0.53% <0%> (-90.38%) ⬇️
cpp/src/arrow/adapters/orc/adapter.cc 0.26% <0%> (-73.04%) ⬇️
cpp/src/arrow/csv/options.h 66.66% <0%> (-33.34%) ⬇️
cpp/src/parquet/column_scanner.cc 59.09% <0%> (-22.73%) ⬇️
cpp/src/arrow/vendored/xxhash/xxhash.c 51.26% <0%> (-22.34%) ⬇️
cpp/src/arrow/memory_pool.cc 68.8% <0%> (-16.52%) ⬇️
cpp/src/arrow/io/interfaces.cc 40% <0%> (-15%) ⬇️
... and 277 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0c279eb...09bf45c. Read the comment docs.

@@ -744,7 +744,7 @@ public void setValueCount(int valueCount) {
}
/* valueCount for the data vector is the current end offset */
final int childValueCount = (valueCount == 0) ? 0 :
offsetBuffer.getInt(valueCount * OFFSET_WIDTH);
offsetBuffer.getInt(lastSet * OFFSET_WIDTH);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So were we always setting the value count for child vector incorrectly? I am not clear what the problem is. Is there a special case we have run into?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! I ran into this over on #3294, when I tried to convert a string array from a JDBC field into a VarChar ListVector. If one of the values in the JDBC string array was null, I couldn't read the value after it from the VarChar ListVector (the following value came back as an empty string).

@praveenbingo was kind enough to put together this PR to fix the issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, @siddharthteotia - we were not correctly handling lists that could have null values (setting child vector count using valuecount will not handle cases where valuecount < lastSet i.e. presence of nulls).

It does not affect List of FixedWidthVectors and only VariableWidthVectors.

Please let me know if you would need more information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@siddharthteotia siddharthteotia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@siddharthteotia siddharthteotia merged commit 51b5143 into apache:master Feb 13, 2019
tanyaschlusser pushed a commit to tanyaschlusser/arrow that referenced this pull request Feb 21, 2019
* ARROW-4539: [Java] Fix child vector count for lists.

- Child vector count was not set correctly for lists. Fixed to use the right count.

* ARROW-4539: [Java] Add license header.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants