Code optimization and cleanup #1

Open
technosaurus opened this issue Feb 28, 2015 · 1 comment
@technosaurus
Owner

Lots of room for improvement.

  1. Floating-point constants should be written as 0.0f (float) rather than 0.0 (double), so float arithmetic is not promoted to double
  2. Slow math ops like sin, cos and pow should be offloaded to lookup tables where possible
    a. ) one version with init code, to reduce binary size at the cost of startup time
    b. ) another version with static const lookup tables, for faster startup at the cost of size
    c. ) some areas just need the math simplified for easier calculation:
       - multiplying by a precalculated 1/float is faster than dividing by a float
       - some expressions need their ops rearranged so constants can be merged and separated from variables
  3. Unroll some loops into return/initialization statements (fewer memcpy lookalikes)
  4. Functions should take pointers instead of using globals and some_func(void)
@technosaurus technosaurus self-assigned this Feb 28, 2015
@technosaurus technosaurus added this to the 1.0 milestone Feb 28, 2015
@technosaurus
Owner Author

static inline float Requantize_Pow_43(unsigned x) returns x^(4/3).
This could be simplified to 16*(x/8)^(4/3)
or 256*(x/64)^(4/3),
which means the lookup table could be reduced in size.
However, pow(x, 4.0f/3.0f) can be replaced by cbrt((x*x)*(x*x)) to reduce the time by about half,
and the two steps can be combined using a variation of the fast inverse square root trick:

/* Description: returns x^(4/3)
 * same as cbrt((x*x)*(x*x)), but optimized for the limited cases we handle (integers 0-8209)
 */
static inline float pow43opt2(float x) {
  if (x < 2) return x;            // 0 and 1 are their own 4/3 power
  x *= x; x *= x;                 // x now holds x^4
  float a3, x2 = x + x;           // x2 = 2 * x^4
  union {float f; unsigned i;} u = {x};
  u.i = u.i/3 + 0x2a517d3c;       // bit-level first guess, ~cbrt(x)
  int accuracy_iterations = 2;    // reduce for speed, increase for precision
  while (accuracy_iterations--) { // Lancaster iterations (Halley's method for cube roots)
    a3 = u.f*u.f*u.f;
    u.f *= (a3 + x2) / (a3 + a3 + x);
  }
  return u.f;
}
