Incorrect encoding of binary subnormal floats #1049
Comments
Since it’s only 32 bits, it’s entirely possible to do an exhaustive comparison against a trusted floating-point implementation. Might be a worthwhile thing to do with all these operations...
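A rough sketch of what such a comparison could look like, in Python rather than in this project's own language. Here `encode_manual` is a hypothetical hand-written encoder standing in for the implementation under test, and Python's `struct` module plays the role of the trusted reference. Neither name comes from the project. NaNs are skipped because their payload bits are not expected to survive.

```python
import math
import struct

def decode_ref(bits: int) -> float:
    """Trusted reference: reinterpret a 32-bit pattern as an IEEE 754 binary32."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def encode_manual(x: float) -> int:
    """Hypothetical encoder under test; x must be exactly representable in binary32."""
    if math.isnan(x):
        return 0x7FC00000  # canonical quiet NaN
    sign = 0x80000000 if math.copysign(1.0, x) < 0 else 0
    a = abs(x)
    if math.isinf(a):
        return sign | 0x7F800000
    if a == 0.0:
        return sign
    m, e = math.frexp(a)  # a == m * 2**e with 0.5 <= m < 1
    exp = e - 1           # unbiased exponent of the form 1.f * 2**exp
    if exp < -126:
        # Subnormal: exponent field is 0 and the value is fraction * 2**-149.
        return sign | int(math.ldexp(a, 149))
    fraction = int(math.ldexp(m, 24)) - (1 << 23)  # drop the implicit leading 1
    return sign | ((exp + 127) << 23) | fraction

def mismatches(bit_patterns):
    """Return the bit patterns that do not survive decode followed by encode."""
    bad = []
    for bits in bit_patterns:
        x = decode_ref(bits)
        if math.isnan(x):
            continue  # NaN payloads are not expected to round-trip
        if encode_manual(x) != bits:
            bad.append(bits)
    return bad

# Strided sample of the positive subnormal range (exponent field 0).
print(mismatches(range(1, 0x00800000, 4099)))
```

Passing `range(2**32)` would make the check exhaustive. In pure Python that takes a long time, so a strided range like the one above is a cheaper first pass.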
So
I'm pretty sure I have a fix. You have to be careful, though. The property you've stated isn't quite correct because NaNs are always round-tripped to a canonical quiet NaN.
Yeah, I just realized that all of the non-subnormal mismatches are for NaN values; I wasn't sure whether or not those were supposed to be all mapped to the same value. I guess the real property you want to test is whether those two functions are inverses when you compose them the other way around.
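To make the two compositions concrete, here is a minimal Python sketch. It assumes `decode` and `encode` are stand-ins built on `struct`, not this project's functions. The bits-first property has to be weakened for NaN inputs, while the value-first property holds for any value that is exactly representable in 32 bits, as long as all NaNs are treated as equal.

```python
import math
import struct

def decode(bits: int) -> float:
    """Stand-in for bits -> float: reinterpret a 32-bit pattern as binary32."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def encode(x: float) -> int:
    """Stand-in for float -> bits; x should be exactly representable in binary32."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def is_nan_bits(bits: int) -> bool:
    """Exponent field all ones and a non-zero fraction means NaN."""
    return (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0

def bits_roundtrip_holds(bits: int) -> bool:
    """encode(decode(bits)) == bits, weakened to 'still a NaN' for NaN inputs."""
    out = encode(decode(bits))
    if is_nan_bits(bits):
        return is_nan_bits(out)
    return out == bits

def value_roundtrip_holds(x: float) -> bool:
    """decode(encode(x)) == x, with all NaNs equal and the sign of zero checked."""
    y = decode(encode(x))
    if math.isnan(x):
        return math.isnan(y)
    return y == x and math.copysign(1.0, y) == math.copysign(1.0, x)

print(bits_roundtrip_holds(0x00000001), value_roundtrip_holds(float("nan")))
```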
I'm currently running the following before I check in a fix.
So far it's about 10% in, no failures yet.
some subnormal values were being incorrectly encoded. Fixes #1049
Consider
This, I believe, correctly represents the smallest positive 32-bit subnormal floating-point value. However, if we encode this back into bits, we get something else entirely.
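For reference, the arithmetic behind that value can be sketched in Python, independently of this project's code. In IEEE 754 binary32, a subnormal has an exponent field of 0 and a value of fraction × 2^-149, so the smallest positive subnormal is the bit pattern `0x00000001`. The helper names below are hypothetical and exist only for illustration.

```python
import math

def split_fields(bits: int):
    """Split a 32-bit pattern into (sign, exponent field, fraction field)."""
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x007FFFFF

def subnormal_value(fraction: int) -> float:
    """Value of a positive subnormal binary32: fraction * 2**-149."""
    return math.ldexp(fraction, -149)

def encode_subnormal(x: float) -> int:
    """Bit pattern of a positive subnormal binary32 value (exponent field is 0)."""
    return int(math.ldexp(x, 149))

sign, exponent, fraction = split_fields(0x00000001)
smallest = subnormal_value(fraction)

# A correct encoder maps the value back to the pattern it came from.
print(hex(encode_subnormal(smallest)))
```

Because the exponent field stays 0 for every subnormal, a correct encoding is just the fraction itself.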