Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Made enhancements to BitCastScalar and fixed F16 compare operators #1884

Merged
merged 3 commits into from
Dec 16, 2023

Conversation

johnplatts
Copy link
Contributor

Updated BitCastScalar to allow constexpr bit casts from hwy::float16_t and hwy::bfloat16_t if the __builtin_bit_cast intrinsic is available and To is either (a) a floating-point type (including hwy::float16_t and hwy::bfloat16_t), (b) an integer type, or (c) a class or struct type that does not contain any union, pointer, reference, or volatile-qualified subobjects.

Also updated BitCastScalar to allow constexpr bit casts to hwy::float16_t or hwy::bfloat16_t from integer types, floating-point types, or class/struct types that do not contain any union, pointer, reference, or volatile-qualified subobjects if the __builtin_bit_cast intrinsic is available.

Also added a hwy::float16_t::FromBits static member function that bit casts a uint16_t to hwy::float16_t, and also added a hwy::bfloat16_t::FromBits static member function that bit casts a uint16_t to hwy::bfloat16_t.

hwy::float16_t::FromBits and hwy::bfloat16_t::FromBits are both guaranteed to be constexpr if the __builtin_bit_cast intrinsic is available.

Also updated F16FromF32 to be constexpr if either (a) HWY_HAVE_SCALAR_F16_OPERATORS is 1 or (b) the __builtin_bit_cast intrinsic is available and C++14/C++17/C++20/C++23 mode is enabled.

Also replaced HWY_DASSERT with HWY_F16_FROM_F32_DASSERT, which only does a HWY_DASSERT if F16FromF32 is known to be called from a non-constant-evaluated context or the __builtin_bit_cast intrinsic is not available to avoid compilation errors if F16FromF32 is called from a constant-evaluated context.

Also updated F16FromF32, BF16FromF32, F16/BF16 LowestValue, F16/BF16 HighestValue, F16/BF16 Epsilon, and F16/BF16 MantissaEnd to use hwy::float16::FromBits or hwy::bfloat16_t::FromBits instead of BitCastScalar.

Also fixed compilation errors with the hwy::float16_t comparison operators.

jan-wassenberg
jan-wassenberg previously approved these changes Dec 1, 2023
Copy link
Member

@jan-wassenberg jan-wassenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you making for these fixes+improvements :)

hwy/base.h Outdated
private:
struct F16FromU16BitsTag {};
constexpr float16_t(F16FromU16BitsTag /*tag*/, uint16_t u16_bits)
: bits(u16_bits){};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I didn't see this. CI reports this extra ; as a build error. Would you please remove it?

@jan-wassenberg
Copy link
Member

Thanks for making the fix. FYI we have a logjam at the moment: tests started failing after a compiler update. I am investigating, it may be caused by UB in vqsort.

@jan-wassenberg
Copy link
Member

Thanks for updating. FYI the UB is fixed and we have worked around the remaining compiler error, which is being investigated. Builds are again passing.

@jan-wassenberg
Copy link
Member

Unfortunately this patch is failing to build on the Windows Clang config:

lld-link: error: undefined symbol: __truncsfbf2
>>> referenced by /[proc/self/cwd/./third_party/highway/hwy/ops/emu128-inl.h:1814]
>>>               blaze-out/win_x64-fastbuild/bin/third_party/highway/_objs/combine_test/combine_test.obj:(struct hwy::N_EMU128::Vec128<struct hwy::bfloat16_t, 4> __cdecl hwy::N_EMU128::DemoteTo<struct hwy::N_EMU128::Simd<struct hwy::bfloat16_t, 8, -1>, 0, 4>(struct hwy::N_EMU128::Simd<struct hwy::bfloat16_t, 8, -1>, struct hwy::N_EMU128::Vec128<float, 4>))

It might be that our toolchain is mixing up two versions of clang and compiler-rt (LLVM issue, our FAQ). Unfortunately it's difficult to change that in our build system.

Like we do for F16, would you mind disabling the BF16 type for HWY_COMPILER_CLANGCL?

@copybara-service copybara-service bot merged commit d51e34d into google:master Dec 16, 2023
4 of 24 checks passed
Mousius added a commit to Mousius/numpy that referenced this pull request Dec 21, 2023
Mousius added a commit to Mousius/numpy that referenced this pull request Dec 21, 2023
This should include google/highway#1884 which
may fix numpy#25445

[wheel build]
Mousius added a commit to Mousius/numpy that referenced this pull request Dec 21, 2023
rgommers pushed a commit to Mousius/numpy that referenced this pull request Dec 21, 2023
This should include google/highway#1884 which
may fix numpygh-25445.

[skip azp] [skip actions] [skip circle]
@johnplatts johnplatts deleted the jep_hwy_f16_fixes_113023 branch May 1, 2024 11:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants