Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C#] NullCount should be recomputed on-demand for sliced arrays #41136

Closed
adamreeve opened this issue Apr 10, 2024 · 3 comments
Closed

[C#] NullCount should be recomputed on-demand for sliced arrays #41136

adamreeve opened this issue Apr 10, 2024 · 3 comments

Comments

@adamreeve
Copy link
Contributor

Describe the enhancement requested

This is a prerequisite for writing sliced arrays in the IPC format (#40517), as the IPC format doesn't seem to allow for the null_count field to be unset or negative.

Currently, when slicing an array, the null count is set to -1 (RecalculateNullCount) to indicate that the null count is no longer valid.

Component(s)

C#

@adamreeve
Copy link
Contributor Author

take

CurtHagenlocher pushed a commit that referenced this issue Apr 12, 2024
### Rationale for this change

See #41136. This is required for writing sliced arrays to IPC files and streams. It should also be generally useful too by allowing users to be able to rely on valid null count values for sliced arrays.

### What changes are included in this PR?

Adds a new `ArrayData.GetNullCount()` method which computes the actual null count if it is unknown, and changes most uses of `ArrayData.NullCount` to `GetNullCount()`.

### Are these changes tested?

Yes, I've extended the existing slicing tests to also verify the null count, and added new tests specifically for union arrays, which needed special handling.

### Are there any user-facing changes?

Yes, this is a user-facing behaviour change. I don't expect that it should break any existing code as users are unlikely to rely on the NullCount being -1.

* GitHub Issue: #41136

Authored-by: Adam Reeve <adreeve@gmail.com>
Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
@CurtHagenlocher CurtHagenlocher added this to the 17.0.0 milestone Apr 12, 2024
@CurtHagenlocher
Copy link
Contributor

Issue resolved by pull request 41144
#41144

vibhatha pushed a commit to vibhatha/arrow that referenced this issue Apr 15, 2024
…apache#41144)

### Rationale for this change

See apache#41136. This is required for writing sliced arrays to IPC files and streams. It should also be generally useful too by allowing users to be able to rely on valid null count values for sliced arrays.

### What changes are included in this PR?

Adds a new `ArrayData.GetNullCount()` method which computes the actual null count if it is unknown, and changes most uses of `ArrayData.NullCount` to `GetNullCount()`.

### Are these changes tested?

Yes, I've extended the existing slicing tests to also verify the null count, and added new tests specifically for union arrays, which needed special handling.

### Are there any user-facing changes?

Yes, this is a user-facing behaviour change. I don't expect that it should break any existing code as users are unlikely to rely on the NullCount being -1.

* GitHub Issue: apache#41136

Authored-by: Adam Reeve <adreeve@gmail.com>
Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
@raulcd raulcd modified the milestones: 17.0.0, 16.1.0 Apr 29, 2024
@raulcd
Copy link
Member

raulcd commented Apr 29, 2024

I am moving this to 16.1.0 to be able to cleanly cherry-pick this: a6cdcd0

raulcd pushed a commit that referenced this issue Apr 29, 2024
### Rationale for this change

See #41136. This is required for writing sliced arrays to IPC files and streams. It should also be generally useful too by allowing users to be able to rely on valid null count values for sliced arrays.

### What changes are included in this PR?

Adds a new `ArrayData.GetNullCount()` method which computes the actual null count if it is unknown, and changes most uses of `ArrayData.NullCount` to `GetNullCount()`.

### Are these changes tested?

Yes, I've extended the existing slicing tests to also verify the null count, and added new tests specifically for union arrays, which needed special handling.

### Are there any user-facing changes?

Yes, this is a user-facing behaviour change. I don't expect that it should break any existing code as users are unlikely to rely on the NullCount being -1.

* GitHub Issue: #41136

Authored-by: Adam Reeve <adreeve@gmail.com>
Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
tolleybot pushed a commit to tmct/arrow that referenced this issue May 2, 2024
…apache#41144)

### Rationale for this change

See apache#41136. This is required for writing sliced arrays to IPC files and streams. It should also be generally useful too by allowing users to be able to rely on valid null count values for sliced arrays.

### What changes are included in this PR?

Adds a new `ArrayData.GetNullCount()` method which computes the actual null count if it is unknown, and changes most uses of `ArrayData.NullCount` to `GetNullCount()`.

### Are these changes tested?

Yes, I've extended the existing slicing tests to also verify the null count, and added new tests specifically for union arrays, which needed special handling.

### Are there any user-facing changes?

Yes, this is a user-facing behaviour change. I don't expect that it should break any existing code as users are unlikely to rely on the NullCount being -1.

* GitHub Issue: apache#41136

Authored-by: Adam Reeve <adreeve@gmail.com>
Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
tolleybot pushed a commit to tmct/arrow that referenced this issue May 4, 2024
…apache#41144)

### Rationale for this change

See apache#41136. This is required for writing sliced arrays to IPC files and streams. It should also be generally useful too by allowing users to be able to rely on valid null count values for sliced arrays.

### What changes are included in this PR?

Adds a new `ArrayData.GetNullCount()` method which computes the actual null count if it is unknown, and changes most uses of `ArrayData.NullCount` to `GetNullCount()`.

### Are these changes tested?

Yes, I've extended the existing slicing tests to also verify the null count, and added new tests specifically for union arrays, which needed special handling.

### Are there any user-facing changes?

Yes, this is a user-facing behaviour change. I don't expect that it should break any existing code as users are unlikely to rely on the NullCount being -1.

* GitHub Issue: apache#41136

Authored-by: Adam Reeve <adreeve@gmail.com>
Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
rok pushed a commit to tmct/arrow that referenced this issue May 8, 2024
…apache#41144)

### Rationale for this change

See apache#41136. This is required for writing sliced arrays to IPC files and streams. It should also be generally useful too by allowing users to be able to rely on valid null count values for sliced arrays.

### What changes are included in this PR?

Adds a new `ArrayData.GetNullCount()` method which computes the actual null count if it is unknown, and changes most uses of `ArrayData.NullCount` to `GetNullCount()`.

### Are these changes tested?

Yes, I've extended the existing slicing tests to also verify the null count, and added new tests specifically for union arrays, which needed special handling.

### Are there any user-facing changes?

Yes, this is a user-facing behaviour change. I don't expect that it should break any existing code as users are unlikely to rely on the NullCount being -1.

* GitHub Issue: apache#41136

Authored-by: Adam Reeve <adreeve@gmail.com>
Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
rok pushed a commit to tmct/arrow that referenced this issue May 8, 2024
…apache#41144)

### Rationale for this change

See apache#41136. This is required for writing sliced arrays to IPC files and streams. It should also be generally useful too by allowing users to be able to rely on valid null count values for sliced arrays.

### What changes are included in this PR?

Adds a new `ArrayData.GetNullCount()` method which computes the actual null count if it is unknown, and changes most uses of `ArrayData.NullCount` to `GetNullCount()`.

### Are these changes tested?

Yes, I've extended the existing slicing tests to also verify the null count, and added new tests specifically for union arrays, which needed special handling.

### Are there any user-facing changes?

Yes, this is a user-facing behaviour change. I don't expect that it should break any existing code as users are unlikely to rely on the NullCount being -1.

* GitHub Issue: apache#41136

Authored-by: Adam Reeve <adreeve@gmail.com>
Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…apache#41144)

### Rationale for this change

See apache#41136. This is required for writing sliced arrays to IPC files and streams. It should also be generally useful too by allowing users to be able to rely on valid null count values for sliced arrays.

### What changes are included in this PR?

Adds a new `ArrayData.GetNullCount()` method which computes the actual null count if it is unknown, and changes most uses of `ArrayData.NullCount` to `GetNullCount()`.

### Are these changes tested?

Yes, I've extended the existing slicing tests to also verify the null count, and added new tests specifically for union arrays, which needed special handling.

### Are there any user-facing changes?

Yes, this is a user-facing behaviour change. I don't expect that it should break any existing code as users are unlikely to rely on the NullCount being -1.

* GitHub Issue: apache#41136

Authored-by: Adam Reeve <adreeve@gmail.com>
Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants