Q_rsqrt
Here is the Q_rsqrt() code:
    inline float Q_rsqrt( float number )
    {
        float x = 0.5f * number;
        float y;

        // compute approximate inverse square root
    #if defined(DAEMON_USE_ARCH_INTRINSICS_i686_sse)
        // SSE rsqrt relative error bound: 3.7 * 10^-4
        _mm_store_ss( &y, _mm_rsqrt_ss( _mm_load_ss( &number ) ) );
    #else
        y = Util::bit_cast<float>( 0x5f3759df - ( Util::bit_cast<uint32_t>( number ) >> 1 ) );
        y *= ( 1.5f - ( x * y * y ) ); // initial iteration
        // relative error bound after the initial iteration: 1.8 * 10^-3
    #endif
        y *= ( 1.5f - ( x * y * y ) ); // second iteration for higher precision
        return y;
    }
If I comment out the second iteration, this way:

    inline float Q_rsqrt( float number )
    {
        float x = 0.5f * number;
        float y;

        // compute approximate inverse square root
    #if defined(DAEMON_USE_ARCH_INTRINSICS_i686_sse)
        // SSE rsqrt relative error bound: 3.7 * 10^-4
        _mm_store_ss( &y, _mm_rsqrt_ss( _mm_load_ss( &number ) ) );
    #else
        y = Util::bit_cast<float>( 0x5f3759df - ( Util::bit_cast<uint32_t>( number ) >> 1 ) );
        y *= ( 1.5f - ( x * y * y ) ); // initial iteration
        // relative error bound after the initial iteration: 1.8 * 10^-3
    #endif
        // y *= ( 1.5f - ( x * y * y ) ); // second iteration for higher precision
        return y;
    }
I jump from 8fps to 10fps (+25%) with r_VBOmodel 0 using the branch and the test layout (177 visible models) from:
and I see no visual difference.
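To put a number on what the second iteration buys, here is a minimal sketch that measures the worst relative error of the bit-trick path after one and after two Newton-Raphson steps. The names `rsqrt_approx` and `max_relative_error` are hypothetical and not part of the engine, `std::memcpy` stands in for `Util::bit_cast`, and the input sweep range is an arbitrary assumption.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: bit-trick initial guess followed by `iterations`
// Newton-Raphson steps (same formula as the non-SSE branch of Q_rsqrt).
inline float rsqrt_approx(float number, int iterations) {
    const float x = 0.5f * number;
    std::uint32_t bits;
    std::memcpy(&bits, &number, sizeof bits);   // stand-in for Util::bit_cast
    bits = 0x5f3759df - (bits >> 1);
    float y;
    std::memcpy(&y, &bits, sizeof y);
    for (int i = 0; i < iterations; ++i) {
        y *= 1.5f - x * y * y;                  // one Newton-Raphson step
    }
    return y;
}

// Hypothetical helper: largest relative error against 1/sqrt(v) over a
// geometric sweep of positive inputs (range chosen arbitrarily).
inline double max_relative_error(int iterations) {
    double worst = 0.0;
    for (float v = 0.01f; v < 10000.0f; v *= 1.001f) {
        const double exact = 1.0 / std::sqrt(static_cast<double>(v));
        const double err = std::fabs(rsqrt_approx(v, iterations) - exact) / exact;
        if (err > worst) worst = err;
    }
    return worst;
}
```

With one iteration the result should stay under the 1.8 * 10^-3 bound quoted in the code comment; with two iterations the worst error should be several orders of magnitude smaller, which is precision a renderer may not need.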
Here is the code in ioq3:
    float Q_rsqrt( float number )
    {
        floatint_t t;
        float x2, y;
        const float threehalfs = 1.5F;

        x2 = number * 0.5F;
        t.f = number;
        t.i = 0x5f3759df - ( t.i >> 1 ); // what the fuck?
        y = t.f;
        y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
        // y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed

        return y;
    }
The second iteration is commented out there, and its comment says it can be removed.
So, I doubt we would break compatibility with Quake 3 by removing the second iteration.
Also, I wonder whether it is a mistake that the second iteration also runs on the SSE code path, where the bit-trick approximation is not used at all.