x86_64 MXCSR denormals are zero bit: add constant #852

Open
b-jonas0 opened this issue May 2, 2020 · 4 comments

Comments

@b-jonas0

b-jonas0 commented May 2, 2020

On x86_64, the SSE floating point control register (MXCSR) has two bits concerned with denormal floating point values. The purpose of these bits is to enable a mode that avoids slowdowns from calculations with denormal numbers at the cost of getting incorrect results from underflows.

The more important one is the flush to zero bit (bit 15 in the register). When that bit is set, a floating-point arithmetic instruction that would output a denormal number (and would not raise an unmasked exception) outputs zero instead (and the denormal exception that it would normally flag is suppressed). The consts for this bit in the core::arch::x86_64 module are _MM_FLUSH_ZERO_ON, _MM_FLUSH_ZERO_OFF and _MM_FLUSH_ZERO_MASK.

The less important bit is the denormals are zero bit (bit 6 in the register). That bit affects the input arguments of floating-point arithmetic instructions, rather than the outputs. When the bit is set and a source argument of a floating-point arithmetic instruction is a denormal number, the instruction behaves as if that number were zero. On x86_32, setting this bit is only conditionally supported, because old CPUs didn't have this mode. Testing and clearing the bit is always supported if the MXCSR register exists.
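
To make the difference between the two bits concrete, here is a rough software model in Rust. It is only a sketch: the function names daz_input, ftz_output and mul_ftz_daz are made up for illustration and are not part of core::arch, the hardware applies these rules inside each SSE instruction rather than as separate steps, and the model ignores exception flags entirely.

```rust
/// Models the denormals are zero bit: a denormal *input* is treated as a
/// zero of the same sign. (Hypothetical helper, not a core::arch item.)
pub fn daz_input(x: f32) -> f32 {
    if x.is_subnormal() {
        0.0f32.copysign(x)
    } else {
        x
    }
}

/// Models the flush to zero bit: a denormal *result* is replaced by a
/// zero of the same sign. (Hypothetical helper, not a core::arch item.)
pub fn ftz_output(x: f32) -> f32 {
    if x.is_subnormal() {
        0.0f32.copysign(x)
    } else {
        x
    }
}

/// A multiplication with both modes enabled, in this simplified model:
/// inputs go through DAZ first, then the result goes through FTZ.
pub fn mul_ftz_daz(a: f32, b: f32) -> f32 {
    ftz_output(daz_input(a) * daz_input(b))
}
```

In this model, multiplying the smallest positive denormal by 1.0e30 yields zero because the input is zeroed before the multiplication, even though the exact product would be an ordinary normal number. Halving f32::MIN_POSITIVE yields zero because the denormal result is flushed.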

The crate does not have consts for the denormals are zero bit. This is probably an oversight, and this ticket asks to correct it. I suggest the following names, based on Intel's C interface, but I don't insist on them.

pub const _MM_DENORMALS_ZERO_MASK: u32 = 0x0040;
pub const _MM_DENORMALS_ZERO_ON: u32 = 0x0040;
pub const _MM_DENORMALS_ZERO_OFF: u32 = 0x0000;

(Please double-check the above values before committing.)
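
One way to double-check the values is to derive them from the bit positions given earlier (bit 6 for denormals are zero, bit 15 for flush to zero) and let the compiler compare. The sketch below redeclares the constants locally so that it compiles on any target; the _MM_DENORMALS_ZERO_* names are only the ones proposed in this issue, and the flush to zero values are copied here for comparison rather than imported.

```rust
// Proposed in this issue; not assumed to be exported by the crate.
pub const _MM_DENORMALS_ZERO_MASK: u32 = 0x0040;
pub const _MM_DENORMALS_ZERO_ON: u32 = 0x0040;
pub const _MM_DENORMALS_ZERO_OFF: u32 = 0x0000;

// Local copies of the existing flush to zero values, for comparison.
pub const _MM_FLUSH_ZERO_MASK: u32 = 0x8000;
pub const _MM_FLUSH_ZERO_ON: u32 = 0x8000;

// Compile-time checks against the bit positions named in the text.
const _: () = assert!(_MM_DENORMALS_ZERO_MASK == 1 << 6);
const _: () = assert!(_MM_FLUSH_ZERO_MASK == 1 << 15);
// ON must lie inside the mask, and OFF must clear it.
const _: () = assert!(_MM_DENORMALS_ZERO_ON & !_MM_DENORMALS_ZERO_MASK == 0);
const _: () = assert!(_MM_DENORMALS_ZERO_OFF & _MM_DENORMALS_ZERO_MASK == 0);
// The two modes must not share any bits.
const _: () = assert!(_MM_DENORMALS_ZERO_MASK & _MM_FLUSH_ZERO_MASK == 0);
```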

The C interface also has convenience macros for getting and setting the bit, so you may add those too. Personally I think they're superfluous, because functions that access this bit will most likely set or clear it together with the flush to zero bit, e.g.

_mm_setcsr(_mm_getcsr() | _MM_FLUSH_ZERO_ON | _MM_DENORMALS_ZERO_ON);
// XMM floating-point arithmetic computations here
_mm_setcsr(_mm_getcsr() & !_MM_FLUSH_ZERO_MASK & !_MM_DENORMALS_ZERO_MASK);
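
The bit arithmetic in the snippet above can be checked without touching the real register. The following sketch works on plain u32 values; enable_ftz_daz, disable_ftz_daz and MXCSR_DEFAULT are hypothetical names introduced here, and 0x1F80 is assumed as the power-on MXCSR value (all exceptions masked, round to nearest). The results are what one would pass to _mm_setcsr.

```rust
// Local copies of the constants discussed in this issue.
const _MM_FLUSH_ZERO_MASK: u32 = 0x8000;
const _MM_FLUSH_ZERO_ON: u32 = 0x8000;
const _MM_DENORMALS_ZERO_MASK: u32 = 0x0040;
const _MM_DENORMALS_ZERO_ON: u32 = 0x0040;

/// Assumed power-on MXCSR value (hypothetical name, for illustration).
pub const MXCSR_DEFAULT: u32 = 0x1F80;

/// Returns `csr` with both denormal-avoidance bits set and every other
/// bit (exception masks, rounding mode, sticky flags) left unchanged.
pub fn enable_ftz_daz(csr: u32) -> u32 {
    csr | _MM_FLUSH_ZERO_ON | _MM_DENORMALS_ZERO_ON
}

/// Returns `csr` with both bits cleared and every other bit unchanged.
pub fn disable_ftz_daz(csr: u32) -> u32 {
    csr & !_MM_FLUSH_ZERO_MASK & !_MM_DENORMALS_ZERO_MASK
}
```

Note that clearing the bits afterwards, as in the snippet, also turns the modes off for a caller that had already enabled them; saving the old register value and writing it back would avoid that.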
@Amanieu
Member

Amanieu commented May 2, 2020

I'm happy to accept a PR adding all of these. A good starting point would be here where all the other _MM_* constants and helper functions are defined. You'll also need to update the documentation on _mm_setcsr.

@bjorn3
Member

bjorn3 commented May 2, 2020

Doesn't LLVM assume that it works with the default floating point environment?

@matthiascy

What is the current status? I couldn't find the DAZ flag in the code.

@Amanieu
Member

Amanieu commented May 24, 2022

The current status is still:

I'm happy to accept a PR adding all of these. A good starting point would be here where all the other _MM_* constants and helper functions are defined. You'll also need to update the documentation on _mm_setcsr.

matthiascy added a commit to matthiascy/stdarch that referenced this issue Feb 7, 2023