Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-37728: [Java] Add methods to get an Iterable for a ValueVector #41895

Merged
merged 7 commits into from
Jun 18, 2024

Conversation

normanj-bitquill
Copy link
Contributor

@normanj-bitquill normanj-bitquill commented May 30, 2024

Rationale for this change

Simplify validating the values in a ValueVector in unit tests.

What changes are included in this PR?

Methods for creating an Iterable and Iterator for a ValueVector. Also updated some unit tests to use the new methods.

Are these changes tested?

Some unit tests were updated.

Are there any user-facing changes?

The new methods are publicly available in the ValueVectorUtility class.

Copy link

⚠️ GitHub issue #37728 has been automatically assigned in GitHub to PR creator.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose eventually we can add typed/generic versions? (I suppose we'd have to write out overloads per class so that IntVector returns Iterator<Integer> etc. Or else accept Class<T> and assert at runtime that the vector is of the right type.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidavidm We can. I view this as functionality that is meant to only be used by the unit tests. It could be enhanced to support the Iterator type matching the ValueVector type. For that I would favour adding the type information to the ValueVector classes and using that to determine the type of the Iterator.

The Iterator and Iterable are publicly available. Is there anything more that can be done to discourage their use in Arrow applications?

Are you looking for properly typed Iterators in this PR or should that wait for a future PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean for this to only be available in unit tests then it should be in a distinct module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidavidm I have updated this. There is a new interface that a ValueVector can implement. The new interface provides methods for getting an Iterator or Iterable. The new interface also allows the value type to be specified so that the Iterator will be properly typed.

<dependency>
<groupId>org.hamcrest</groupId>
<artifactId>hamcrest</artifactId>
<version>2.2</version>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we set the version in dependencyManagement at the root level? I believe a few other modules use hamcrest too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting review Awaiting review labels May 31, 2024
@vibhatha
Copy link
Collaborator

@lidavidm should we upgrade to Junit5?

@lidavidm
Copy link
Member

We can do that separately

@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting merge Awaiting merge awaiting changes Awaiting changes labels Jun 3, 2024
*
* @return number of values in the vector
*/
int getValueCount();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this extend ValueVector to avoid the repeated declarations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidavidm Updated.

@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting change review Awaiting change review awaiting changes Awaiting changes labels Jun 9, 2024
@normanj-bitquill normanj-bitquill force-pushed the 37728-valuevector-iterable branch 2 times, most recently from 9da8f79 to e6c5efa Compare June 12, 2024 16:40
Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I think the last thing is just, can we add a test suite that exercises these new methods for each vector?

@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting change review Awaiting change review awaiting changes Awaiting changes labels Jun 13, 2024
@normanj-bitquill
Copy link
Contributor Author

@lidavidm I have added unit tests for each vector that implements ValueIterableVector.

Thanks. I think the last thing is just, can we add a test suite that exercises these new methods for each vector?

* The new interface indicates that a ValueVector is iterable
* Contains default methods for getting an Iterator and Iterable
@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels Jun 18, 2024
@lidavidm lidavidm merged commit 4413110 into apache:main Jun 18, 2024
16 checks passed
@lidavidm lidavidm removed the awaiting merge Awaiting merge label Jun 18, 2024
Copy link

After merging your PR, Conbench analyzed the 8 benchmarking runs that have been run so far on merge-commit 4413110.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 119 possible false positives for unstable benchmarks that are known to sometimes produce them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants