
Performance Improvements #32

Closed
jbuckmccready opened this issue Apr 1, 2020 · 3 comments
Labels: enhancement (New feature or request)

Comments

@jbuckmccready
Collaborator

Related to investigating the radix sort mentioned in #30, I found a few other ways to improve performance in my C++ usage.

In the quicksort function, when the sub-array is small enough (I found 8 to be a good cutoff), switch to insertion sort; this resulted in about a 3% improvement when indexing.

// still stop sorting if within same node
if (left / NodeSize >= right / NodeSize) {
  return;
}

if (right - left < 8) {
  // faster than continuing with the quick sort
  insertionSort(values, boxes, indices, left, right);
  return;
}
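
For reference, a minimal sketch of what that insertionSort helper could look like, assuming the same parallel-array layout as the snippet above (one hilbert value and index per item, four box coordinates per item); the element types are illustrative and may differ from the actual C++ port:

#include <cstddef>
#include <cstdint>
#include <vector>

inline void insertionSort(std::vector<std::uint32_t> &values,
                          std::vector<double> &boxes,
                          std::vector<std::size_t> &indices,
                          std::size_t left, std::size_t right) {
  for (std::size_t i = left + 1; i <= right; ++i) {
    // save the current item (value, index, and its four box coordinates)
    std::uint32_t value = values[i];
    std::size_t index = indices[i];
    double box[4] = {boxes[4 * i], boxes[4 * i + 1],
                     boxes[4 * i + 2], boxes[4 * i + 3]};

    // shift larger values (with their boxes/indices) one slot to the right
    std::size_t j = i;
    while (j > left && values[j - 1] > value) {
      values[j] = values[j - 1];
      indices[j] = indices[j - 1];
      for (std::size_t k = 0; k < 4; ++k) {
        boxes[4 * j + k] = boxes[4 * (j - 1) + k];
      }
      --j;
    }

    // place the saved item into its sorted position
    values[j] = value;
    indices[j] = index;
    for (std::size_t k = 0; k < 4; ++k) {
      boxes[4 * j + k] = box[k];
    }
  }
}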

When searching/querying the index I allow existing containers to be passed in for both the results and the traversal stack. This was a huge performance boost (20-30%) since I often perform the search in tight loops, which otherwise involves many allocations/deallocations for both containers.
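
As an interface sketch only (the class and method names here are hypothetical, not the actual API of the C++ port), the overload and the reuse pattern look roughly like this:

#include <array>
#include <cstddef>
#include <vector>

// Hypothetical interface -- illustrative only, not the real API.
class StaticSpatialIndex {
public:
  // Writes hits into a caller-owned results vector and reuses a caller-owned
  // stack for the tree traversal, so repeated queries perform no per-call heap
  // allocations once both vectors have grown to their working capacity.
  void query(double minX, double minY, double maxX, double maxY,
             std::vector<std::size_t> &results,
             std::vector<std::size_t> &stack) const;
};

// Typical tight-loop usage: the containers are created once and reused.
void queryMany(const StaticSpatialIndex &index,
               const std::vector<std::array<double, 4>> &queryBoxes) {
  std::vector<std::size_t> results;
  std::vector<std::size_t> stack;
  for (const auto &b : queryBoxes) {
    results.clear(); // clear() keeps the allocated capacity
    stack.clear();
    index.query(b[0], b[1], b[2], b[3], results, stack);
    // ... consume results here ...
  }
}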

@mourner
Owner

mourner commented Apr 1, 2020

Thanks for the suggestions!

  • I tried insertionSort on master and it didn't seem to show any measurable improvement (although JS benchmarks are pretty volatile in general). Probably not worth adding.
  • For bbox search, we can try preallocating the traversal stack once for the index and not doing any pops/pushes since its size is bounded by (nodeSize - 1) * treeHeight, which is pretty small (see the sketch after this list). I vaguely recall trying something like that in the past but maybe I should look into it again.
  • Providing an existing results array might be a good idea, although the benefit should be considered against complicating the API (in JS avoiding this array creation might not be a big hit).
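
A quick sketch of how that bound could be computed for preallocation. The height calculation is only an estimate, assuming a flatbush-style tree that packs up to nodeSize children per node at each level:

#include <cstddef>

// Rough upper bound on the traversal stack size, using the
// (nodeSize - 1) * treeHeight estimate from the comment above.
std::size_t maxStackSize(std::size_t numItems, std::size_t nodeSize) {
  // treeHeight ~= ceil(log(numItems) / log(nodeSize))
  std::size_t treeHeight = 1;
  for (std::size_t n = numItems; n > nodeSize; n = (n + nodeSize - 1) / nodeSize) {
    ++treeHeight;
  }
  return (nodeSize - 1) * treeHeight;
}

// Preallocated once per index, e.g.:
//   std::vector<std::size_t> stack;
//   stack.reserve(maxStackSize(numItems, nodeSize));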

@mourner
Owner

mourner commented Apr 1, 2020

@jbuckmccready I pushed the second idea (preallocating the stack) to https://github.com/mourner/flatbush/tree/preallocate-stack but it doesn't seem to show any perf improvements. 🤷‍♂

@jbuckmccready
Collaborator Author

  • Yeah, the insertion sort benefit is small but definitely measurable in the C++ code.
  • I decided not to store the traversal stack as a member variable so that querying stays thread safe after indexing, but it does make the API kind of messy to have an overload that takes a std::vector by reference.
  • I changed the filter function to act as a visitor that populates an existing results array (it both returns true/false and builds the results through a captured reference). One possibility is to add a visitQuery or visitSearch method that does not return an array and lets a passed-in function do the work (a rough sketch follows below) - I don't think any of this will be very meaningful for performance in JavaScript, and it would somewhat clutter the API.
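
A rough sketch of that visitor-style shape (the class name, method name, and callback type are illustrative assumptions, not an actual method of either library):

#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical visitor-style query: the callback is invoked once per hit and
// returns true to continue the search or false to stop early.
class StaticSpatialIndex {
public:
  void visitQuery(double minX, double minY, double maxX, double maxY,
                  const std::function<bool(std::size_t)> &visitor) const;
};

// Usage: the lambda captures the caller's results container, so the query
// itself never creates or returns an array.
void collectHits(const StaticSpatialIndex &index,
                 double minX, double minY, double maxX, double maxY,
                 std::vector<std::size_t> &results) {
  index.visitQuery(minX, minY, maxX, maxY, [&results](std::size_t item) {
    results.push_back(item);
    return true; // keep searching
  });
}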

For the preallocated stack, I think in many JIT/garbage-collected languages it's not an issue because the buffer gets reused or lifted out of the loop, or the garbage collector is simply well optimized for repeated small block allocations. It may be more impactful when there are many other allocations between the search calls adding pressure to the garbage collector, but I doubt it will be as significant as in a non-garbage-collected language.

Sorry nothing I found was of much use! 😬
