GH-44767: [C++] Fix Float16.To{Little,Big}Endian on big endian machines #44768

QuLogic · 2024-11-18T09:32:43Z

Rationale for this change

See issue.

What changes are included in this PR?

For ToLittleEndian/ToBigEndian, the result should always be in the specified endianness, not depend on host order.
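
As an illustration of the idea (not the code in this PR), a host-independent conversion can be sketched with shifts and masks instead of reinterpreting memory. The helper names `ToLittleEndianBytes`, `ToBigEndianBytes`, and `CheckOneAsHalf` are hypothetical, and the input is assumed to be the raw 16-bit pattern of the half-precision value:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not Arrow's actual implementation.
// Shifts operate on the numeric value, so the output byte order
// is the same on little-endian and big-endian hosts.
inline std::array<uint8_t, 2> ToLittleEndianBytes(uint16_t bits) {
  // Least significant byte first.
  return {static_cast<uint8_t>(bits & 0xFF), static_cast<uint8_t>(bits >> 8)};
}

inline std::array<uint8_t, 2> ToBigEndianBytes(uint16_t bits) {
  // Most significant byte first.
  return {static_cast<uint8_t>(bits >> 8), static_cast<uint8_t>(bits & 0xFF)};
}

// 0x3C00 is the IEEE 754 half-precision bit pattern for 1.0.
inline void CheckOneAsHalf() {
  const auto le = ToLittleEndianBytes(0x3C00);
  const auto be = ToBigEndianBytes(0x3C00);
  assert(le[0] == 0x00 && le[1] == 0x3C);
  assert(be[0] == 0x3C && be[1] == 0x00);
}
```

Because no `memcpy` or pointer cast is involved, the result does not change with the machine the code runs on.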

In the test, instead of casting the uint8_t data into a uint16_t (with unspecified endianness handling), compare the bytes directly in their expected orders.
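
To show why the test change matters, here is a rough sketch contrasting the two styles of check. All names (`EncodeLittleEndian`, `LoadHostOrder`, `BytesMatch`, `CheckLittleEndianOne`) are hypothetical stand-ins, not the functions in the Arrow test suite:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the function under test: low byte first.
inline std::array<uint8_t, 2> EncodeLittleEndian(uint16_t bits) {
  return {static_cast<uint8_t>(bits & 0xFF), static_cast<uint8_t>(bits >> 8)};
}

// Fragile pattern: reassembles the bytes in host order, so the value
// equals the original bits only when the host is little-endian.
inline uint16_t LoadHostOrder(const std::array<uint8_t, 2>& bytes) {
  uint16_t value;
  std::memcpy(&value, bytes.data(), sizeof(value));
  return value;
}

// Robust pattern: compare against the expected byte sequence directly.
inline bool BytesMatch(const std::array<uint8_t, 2>& actual,
                       uint8_t first, uint8_t second) {
  return actual[0] == first && actual[1] == second;
}

// 0x3C00 is the half-precision bit pattern for 1.0; in little-endian
// order its bytes are 0x00 followed by 0x3C on every host.
inline void CheckLittleEndianOne() {
  assert(BytesMatch(EncodeLittleEndian(0x3C00), 0x00, 0x3C));
}
```

A check written as `LoadHostOrder(EncodeLittleEndian(0x3C00)) == 0x3C00` would pass on x86 but fail on a big-endian host such as s390x even when the encoder is correct, whereas the byte-wise check gives the same answer everywhere.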

Are these changes tested?

Tested on little-endian, still building for big-endian.

Are there any user-facing changes?

Fixes #44767

GitHub Issue: [C++] Float16.{ToBytes,FromBytes} fail on big-endian machines #44767

For `ToLittleEndian`/`ToBigEndian`, the result should always be in the specified endianness, not depend on host order. In the test, instead of casting the `uint8_t` data into a `uint16_t` (with unspecified endianness handling), compare the bytes directly in their expected orders.

github-actions · 2024-11-18T09:33:10Z

⚠️ GitHub issue #44767 has been automatically assigned in GitHub to PR creator.

QuLogic · 2024-11-18T09:43:32Z

Confirmed to pass on a big-endian machine as well.

pitrou

Thanks a lot @QuLogic. For the record, did you try running the entire test suite on a big-endian machine?

conbench-apache-arrow · 2024-11-18T22:08:50Z

After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit 59decc3.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them.

QuLogic · 2024-11-19T08:38:50Z

> For the record, did you try running the entire test suite on a big-endian machine?

Yes, I've run them all, but there are still several failures to go:

90% tests passed, 9 tests failed out of 89

The following tests FAILED:
	 29 - arrow-compute-aggregate-test (Failed)
	 68 - arrow-dataset-file-parquet-test (Failed)
	 73 - arrow-flight-test (Failed)
	 76 - arrow-ipc-read-write-test (Failed)
	 81 - parquet-internals-test (Failed)
	 82 - parquet-reader-test (Failed)
	 84 - parquet-arrow-test (Failed)
	 85 - parquet-arrow-internals-test (Failed)
	 87 - parquet-encryption-key-management-test (Failed)

pitrou · 2024-11-19T08:48:51Z

@QuLogic Thanks. Parquet failures are expected unfortunately, as the work of making Parquet C++ big endian-compatible has not been done.

The IPC and compute failures are a bit more worrying.

For the record, are you working for a company that provides big endian systems?

QuLogic · 2024-11-19T09:07:21Z

Oops, I forgot to set the test environment variables; there's no arrow-flight-test failure:

91% tests passed, 8 tests failed out of 89

The following tests FAILED:
	 29 - arrow-compute-aggregate-test (Failed)
	 68 - arrow-dataset-file-parquet-test (Failed)
	 69 - arrow-dataset-file-parquet-encryption-test (Failed)
	 81 - parquet-internals-test (Failed)
	 82 - parquet-reader-test (Failed)
	 84 - parquet-arrow-test (Failed)
	 85 - parquet-arrow-internals-test (Failed)
	 87 - parquet-encryption-key-management-test (Failed)

> For the record, are you working for a company that provides big endian systems?

No, I'm trying to fix geopandas on Fedora s390x.

pitrou · 2024-11-19T09:09:35Z

Could you run arrow-compute-aggregate-test in verbose mode and open a new issue with the failures perhaps?

QuLogic · 2024-11-19T11:17:57Z

Ah, those are #12681 (comment)

pitrou · 2024-11-19T12:30:32Z

Ah, thanks. These are minor precision issues in the test, it seems.

github-actions bot added the Component: C++ and awaiting review labels Nov 18, 2024

QuLogic marked this pull request as ready for review November 18, 2024 09:43

mapleFU approved these changes Nov 18, 2024

github-actions bot added the awaiting committer review label and removed the awaiting review label Nov 18, 2024

pitrou approved these changes Nov 18, 2024

pitrou merged commit 59decc3 into apache:main Nov 18, 2024
40 of 41 checks passed

pitrou removed the awaiting committer review label Nov 18, 2024

pitrou mentioned this pull request Nov 18, 2024

[C++] Float16.{ToBytes,FromBytes} fail on big-endian machines #44767

Closed

QuLogic deleted the big-endian-float16 branch November 18, 2024 22:33
