GH-41198: [C#] Fix concatenation of union arrays #41226

Merged: 2 commits merged into apache:main from the concat_unions branch on Apr 16, 2024

Conversation

@adamreeve adamreeve commented Apr 15, 2024

Rationale for this change

Fixes concatenation of union arrays.

What changes are included in this PR?

  • Re-enables union array concatenation tests that were disabled in #41197 (GH-40517: [C#] Fix writing sliced arrays to IPC format) after making union array comparisons more thorough in the ArrowReaderVerifier.
  • Updates the union array concatenation logic to account for array lengths when concatenating the type and offset buffers, and fixes how the base offset is calculated.
  • Fixes creating the type buffers for the array concatenation tests.
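
As a rough illustration of the second point, the sketch below concatenates the type and offset buffers of dense union arrays using plain C# arrays instead of the Apache.Arrow types. `DenseUnionPart` and `DenseUnionConcatSketch` are hypothetical names introduced here, and the sketch assumes that type ids equal child indexes and that the child arrays are not themselves sliced. It is not the code from this PR, only a minimal model of the idea: copy just the logical window of each input, and shift each value offset by the total length of that child in all earlier inputs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one dense union array; not an Apache.Arrow type.
public sealed record DenseUnionPart(
    byte[] TypeIds,      // type id per element of the underlying buffer
    int[] ValueOffsets,  // index into the child array selected by the type id
    int[] ChildLengths,  // length of each child array, indexed by type id
    int Offset,          // first logical element of this (possibly sliced) array
    int Length);         // number of logical elements

public static class DenseUnionConcatSketch
{
    public static (byte[] TypeIds, int[] ValueOffsets) Concatenate(
        IReadOnlyList<DenseUnionPart> parts, int childCount)
    {
        var typeIds = new List<byte>();
        var valueOffsets = new List<int>();

        // One running base offset per child, because each child is
        // concatenated separately across the inputs.
        var baseOffsets = new int[childCount];

        foreach (DenseUnionPart part in parts)
        {
            // Only read the logical window [Offset, Offset + Length).
            for (int i = part.Offset; i < part.Offset + part.Length; i++)
            {
                byte typeId = part.TypeIds[i];
                typeIds.Add(typeId);
                valueOffsets.Add(checked(baseOffsets[typeId] + part.ValueOffsets[i]));
            }

            // Later inputs see this input's children in front of their own.
            for (int c = 0; c < childCount; c++)
            {
                baseOffsets[c] = checked(baseOffsets[c] + part.ChildLengths[c]);
            }
        }

        return (typeIds.ToArray(), valueOffsets.ToArray());
    }
}
```

For example, concatenating a part with type ids 0, 1, 0 and offsets 0, 0, 1 (child lengths 2 and 1) with a part with type ids 1, 0 and offsets 0, 0 gives type ids 0, 1, 0, 1, 0 and offsets 0, 0, 1, 1, 2.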

Are these changes tested?

Yes, this is covered by the existing ArrowArrayConcatenatorTests.

Are there any user-facing changes?

Yes, this is a user-facing bug fix.

⚠️ GitHub issue #41198 has been automatically assigned in GitHub to PR creator.

@@ -604,10 +598,11 @@ public void Visit(UnionType type)

     for (int j = 0; j < dataList.Count; j++)
     {
-        byte index = (byte)Math.Max(j % 3, 1);
+        byte index = (byte)Math.Min(j % 3, 1);
Contributor Author

I'm pretty sure this was the intended behaviour, otherwise the index == 0 branch below is never used.

Contributor

With the original code, the indexes would be 1, 1, 2, 1, 1, 2, etc. With this change, the indexes would be 0, 1, 1, 0, 1, 1, etc.
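
The two sequences can be reproduced with a small standalone helper. `TypeIndexDemo` and `Indexes` are hypothetical names used only for this illustration; the expression `(byte)clamp(j % 3, 1)` mirrors the line in the diff, with `Math.Max` or `Math.Min` passed in as `clamp`.

```csharp
using System;

public static class TypeIndexDemo
{
    // Computes (byte)clamp(j % 3, 1) for j = 0 .. count - 1.
    public static byte[] Indexes(Func<int, int, int> clamp, int count)
    {
        var result = new byte[count];
        for (int j = 0; j < count; j++)
        {
            result[j] = (byte)clamp(j % 3, 1);
        }
        return result;
    }
}

// Usage:
//   string.Join(", ", TypeIndexDemo.Indexes(Math.Max, 6))  // "1, 1, 2, 1, 1, 2"
//   string.Join(", ", TypeIndexDemo.Indexes(Math.Min, 6))  // "0, 1, 1, 0, 1, 1"
```

With `Math.Max` the value 0 never appears, which matches the observation that the index == 0 branch was unreachable before the change.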

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Apr 15, 2024
@@ -367,7 +367,7 @@ private ArrowBuffer ConcatenateUnionTypeBuffer()

     foreach (ArrayData arrayData in _arrayDataList)
     {
-        builder.Append(arrayData.Buffers[0]);
+        builder.Append(arrayData.Buffers[0].Span.Slice(arrayData.Offset, arrayData.Length));
Contributor Author

I'm using the offset here, but I haven't tested that this works fully with sliced arrays. I'll follow up on this in #41164.
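
To show why the slice matters, here is a minimal sketch using plain byte arrays rather than `ArrowBuffer`. `TypeBufferSliceSketch` and `ConcatenateTypeIds` are hypothetical names, and the tuple of buffer, offset and length stands in for the corresponding `ArrayData` fields. It assumes an underlying buffer can be longer than the logical array, for example because of padding or because the array is a slice, so appending the whole buffer would copy bytes that are not part of the array.

```csharp
using System;
using System.Collections.Generic;

public static class TypeBufferSliceSketch
{
    // Appends only the logical window [Offset, Offset + Length) of each buffer.
    public static byte[] ConcatenateTypeIds(
        IEnumerable<(byte[] Buffer, int Offset, int Length)> parts)
    {
        var result = new List<byte>();
        foreach (var (buffer, offset, length) in parts)
        {
            // AsSpan(offset, length) throws if the window exceeds the buffer.
            ReadOnlySpan<byte> window = buffer.AsSpan(offset, length);
            result.AddRange(window.ToArray());
        }
        return result.ToArray();
    }
}
```

Given a buffer 0, 1, 1, 0, 9, 9, 9, 9 with offset 0 and length 4, followed by a buffer 9, 1, 0, 9 with offset 1 and length 2, the result is 0, 1, 1, 0, 1, 0; the 9 values, which stand for bytes outside the logical arrays, are skipped.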

@CurtHagenlocher CurtHagenlocher merged commit e0f31aa into apache:main Apr 16, 2024
10 checks passed
@CurtHagenlocher CurtHagenlocher removed the awaiting committer review Awaiting committer review label Apr 16, 2024
@github-actions github-actions bot added the awaiting committer review Awaiting committer review label Apr 16, 2024
@adamreeve adamreeve deleted the concat_unions branch April 16, 2024 03:02

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit e0f31aa.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

tolleybot pushed a commit to tmct/arrow that referenced this pull request May 2, 2024
### Rationale for this change

Fixes concatenation of union arrays.

### What changes are included in this PR?

* Re-enables union array concatenation tests that were disabled in apache#41197 after making union array comparisons more thorough in the `ArrowReaderVerifier`
* Updates the union array concatenation logic to account for array lengths when concatenating the type and offset buffers, and fixes how the base offset is calculated.
* Fixes creating the type buffers for the array concatenation tests.

### Are these changes tested?

Yes, this uses the existing `ArrowArrayConcatenatorTests` tests.

### Are there any user-facing changes?

Yes, this is a user-facing bug fix.
* GitHub Issue: apache#41198

Authored-by: Adam Reeve <adreeve@gmail.com>
Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
tolleybot pushed a commit to tmct/arrow that referenced this pull request May 4, 2024
rok pushed a commit to tmct/arrow that referenced this pull request May 8, 2024
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024