[Java] SplitAndTransfer throws for (0,0) if vector empty #30866

Closed
asfimport opened this issue Jan 19, 2022 · 3 comments

I've hit a bug where splitAndTransfer throws if the vector is completely empty, so that its offset buffer is empty as well.

An easy repro is:


        BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE);
        ListVector listVector = ListVector.empty("listVector", allocator);
        listVector.getTransferPair(listVector.getAllocator()).splitAndTransfer(0, 0);

This results in the following stacktrace:


java.lang.IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
	at io.netty.buffer.ArrowBuf.checkIndexD(ArrowBuf.java:335)
	at io.netty.buffer.ArrowBuf.chk(ArrowBuf.java:322)
	at io.netty.buffer.ArrowBuf.getInt(ArrowBuf.java:441)
	at org.apache.arrow.vector.complex.ListVector$TransferImpl.splitAndTransfer(ListVector.java:484)
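
The message "index: 0, length: 4 (expected: range(0, 0))" says that a 4-byte offset was read at index 0 from a buffer of capacity 0. Below is a minimal sketch of that failure mode, using java.nio.ByteBuffer as a stand-in for ArrowBuf. The class and method names are hypothetical and this is not Arrow's code; it only illustrates why an unconditional read of the first offset fails on an empty buffer.

```java
import java.nio.ByteBuffer;

/** Toy model of reading an offset from an offset buffer; not Arrow code. */
final class OffsetReadSketch {

    /** Bytes per 32-bit offset entry. */
    static final int OFFSET_WIDTH = 4;

    /** Unguarded read: always fetches the offset stored at startIndex. */
    static int unguardedStart(ByteBuffer offsets, int startIndex) {
        return offsets.getInt(startIndex * OFFSET_WIDTH);
    }

    /** Returns true if the unguarded read throws on the given buffer. */
    static boolean throwsOnRead(ByteBuffer offsets, int startIndex) {
        try {
            unguardedStart(offsets, startIndex);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A never-allocated offset buffer, modelled as capacity 0.
        ByteBuffer empty = ByteBuffer.allocate(0);

        // Offsets [0, 3] for a single 3-byte value.
        ByteBuffer oneValue = ByteBuffer.allocate(2 * OFFSET_WIDTH);
        oneValue.putInt(0, 0).putInt(OFFSET_WIDTH, 3);

        System.out.println("empty buffer throws: " + throwsOnRead(empty, 0));
        System.out.println("populated buffer throws: " + throwsOnRead(oneValue, 0));
    }
}
```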

In production we hit this when calling VectorSchemaRoot.slice. The schema root contains a ListVector whose value vector is a VarCharVector. The list vector isn't empty, but all the strings in the var char vector are. splitAndTransfer on the list vector itself works, but when the underlying var char vector is then split we get the same exception:


java.lang.IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
	at io.netty.buffer.ArrowBuf.checkIndexD(ArrowBuf.java:335)
	at io.netty.buffer.ArrowBuf.chk(ArrowBuf.java:322)
	at io.netty.buffer.ArrowBuf.getInt(ArrowBuf.java:441)
	at org.apache.arrow.vector.BaseVariableWidthVector.splitAndTransferOffsetBuffer(BaseVariableWidthVector.java:728)
	at org.apache.arrow.vector.BaseVariableWidthVector.splitAndTransferTo(BaseVariableWidthVector.java:712)
	at org.apache.arrow.vector.VarCharVector$TransferImpl.splitAndTransfer(VarCharVector.java:321)
	at org.apache.arrow.vector.complex.ListVector$TransferImpl.splitAndTransfer(ListVector.java:496)
	at org.apache.arrow.vector.VectorSchemaRoot.lambda$slice$1(VectorSchemaRoot.java:308)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
	at org.apache.arrow.vector.VectorSchemaRoot.slice(VectorSchemaRoot.java:310)
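
One way to avoid this class of failure is to treat a zero-length split as a no-op before touching the offset buffer. The sketch below shows that idea on a toy offset buffer (java.nio.ByteBuffer again standing in for ArrowBuf). The names are hypothetical, and this is an assumption about the general shape of such a guard, not the code of any actual Arrow fix.

```java
import java.nio.ByteBuffer;

/** Toy model of a guarded split over an offset buffer; not Arrow code. */
final class GuardedSplitSketch {

    /** Bytes per 32-bit offset entry. */
    static final int OFFSET_WIDTH = 4;

    /**
     * Returns the offsets for values [startIndex, startIndex + length),
     * rebased so the first offset is 0. A zero-length split returns an
     * empty array and never reads from the buffer.
     */
    static int[] splitOffsets(ByteBuffer offsets, int valueCount, int startIndex, int length) {
        if (startIndex < 0 || length < 0 || startIndex + length > valueCount) {
            throw new IllegalArgumentException(
                "Invalid split: startIndex=" + startIndex + ", length=" + length
                    + ", valueCount=" + valueCount);
        }
        if (length == 0) {
            // Nothing to transfer; the buffer may have capacity 0.
            return new int[0];
        }
        int first = offsets.getInt(startIndex * OFFSET_WIDTH);
        int[] out = new int[length + 1];
        for (int i = 0; i <= length; i++) {
            out[i] = offsets.getInt((startIndex + i) * OFFSET_WIDTH) - first;
        }
        return out;
    }

    public static void main(String[] args) {
        // Empty vector: split (0, 0) succeeds without reading the buffer.
        int[] none = splitOffsets(ByteBuffer.allocate(0), 0, 0, 0);
        System.out.println("empty split entries: " + none.length);

        // Three values with byte lengths 2, 3 and 4: offsets [0, 2, 5, 9].
        ByteBuffer offsets = ByteBuffer.allocate(4 * OFFSET_WIDTH);
        int[] raw = {0, 2, 5, 9};
        for (int i = 0; i < raw.length; i++) {
            offsets.putInt(i * OFFSET_WIDTH, raw[i]);
        }
        int[] tail = splitOffsets(offsets, 3, 1, 2);
        System.out.println("tail split: " + java.util.Arrays.toString(tail));
    }
}
```

The argument check runs before the zero-length shortcut, so an out-of-range request such as (5, 0) on an empty vector is still rejected rather than silently accepted.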

Reporter: David Vogelbacher

Note: This issue was originally created as ARROW-15382. Please see the migration documentation for further details.

Frank Wong / @wzx140:
The problem seems to affect BaseLargeVariableWidthVector, BaseVariableWidthVector, ListVector and MapVector.

Todd Farmer / @toddfarmer:
This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per project policy. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

FiV0 added a commit to FiV0/xtdb that referenced this issue Apr 2, 2024
lidavidm pushed a commit that referenced this issue Apr 9, 2024
#41066)

This addresses https://issues.apache.org/jira/browse/ARROW-15382 and is a reopening of #12250 (which I asked to be reopened).

I tried to address all the comments from the previous discussion, added some more tests and fixed an issue in the old commit.
* GitHub Issue: #30866

Authored-by: Finn Völkel <finn.volkel@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
@lidavidm lidavidm changed the title SplitAndTransfer throws for (0,0) if vector empty [Java] SplitAndTransfer throws for (0,0) if vector empty Apr 9, 2024
@lidavidm lidavidm added this to the 17.0.0 milestone Apr 9, 2024
lidavidm commented Apr 9, 2024

Issue resolved by pull request 41066
#41066

@lidavidm lidavidm closed this as completed Apr 9, 2024
lriggs added a commit to dremio/arrow that referenced this issue Sep 4, 2024
…ixes. (#81)

* apacheGH-30866: [Java] fix SplitAndTransfer throws for (0,0) if vector empty (apache#41066)

This addresses https://issues.apache.org/jira/browse/ARROW-15382 and is a reopening of apache#12250 (which I asked to be reopened).

I tried to address all the comments from the previous discussion, added some more tests and fixed an issue in the old commit.
* GitHub Issue: apache#30866

Authored-by: Finn Völkel <finn.volkel@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>

* apacheGH-43463: [C++][Gandiva] Always use gdv_function_stubs.h in context_helper.cc (apache#43464)

### Rationale for this change

`gdv_function_stubs.h` has declarations of functions in `context_helper.cc`.

If we don't include `gdv_function_stubs.h`, it causes attribution mismatch error with unity build.

### What changes are included in this PR?

Always include `gdv_function_stubs.h` in `context_helper.cc`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#43463

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>

* apacheGH-43119: [CI][Packaging] Update manylinux 2014 CentOS repos that have been deprecated (apache#43121)

### Rationale for this change

Jobs are failing to find mirrorlist.centos.org

### What changes are included in this PR?

Updating repos based on solution from: apache#43119 (comment)

### Are these changes tested?

Via archery

### Are there any user-facing changes?
No
* GitHub Issue: apache#43119

Lead-authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>

* Update macos deployment target to 12 to match build machine.

* apacheGH-43400: [C++] Ensure using bundled GoogleTest when we use bundled GoogleTest (apache#43465)

### Rationale for this change

If we use bundled GoogleTest together with other dependencies from the system, such as Boost, our include path options may be:

* `-isystem /opt/homebrew/include` (for Boost)
* `-isystem build_dir/_deps/googletest-src/googletest` (for bundled GoogleTest)
* `-isystem build_dir/_deps/googletest-src/googlemock` (for bundled GoogleTest)

With this order, GoogleTest headers in `/opt/homebrew/include/` are used with bundled GoogleTest. It may cause link errors.

### What changes are included in this PR?

This change introduces a new CMake target `arrow::GTest::gtest_headers` that has include paths for bundled GoogleTest. It is always used as the first link library of every test program. With this change, our include path options are:

* `-isystem build_dir/_deps/googletest-src/googletest` (for bundled GoogleTest)
* `-isystem build_dir/_deps/googletest-src/googlemock` (for bundled GoogleTest)
* `-isystem /opt/homebrew/include` (for Boost)

With this order, we can always use our bundled GoogleTest.

`arrow::GTest::gtest_headers` is defined only when we use bundled GoogleTest. So this doesn't change the system GoogleTest case.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.
* GitHub Issue: apache#43400

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>

---------

Signed-off-by: David Li <li.davidm96@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>
Co-authored-by: Finn Völkel <FiV0@users.noreply.github.com>
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Raúl Cumplido <raulcumplido@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>