Reapply: Implement optimized 64-bit varint decoding functions.
Summary: This fixes the issue introduced in the previous iteration of this diff (_mm_extract_... functions live in intrin.h in MSVC, not immintrin.h). Otherwise, it's unchanged.

Original commit message:

Implements optimized decoding of 64-bit varints using the BMI2 pext instruction (when available) or a shift/shuffle/mask SSE strategy when not.

Microbenchmarks show that the new versions are substantially faster for small (1-2 byte encodings) or large (6-10 byte encodings) varints, while being similar for the middle range. This is the sort of performance profile we want given the empirical distributions of varints; they tend to be heavily biased towards the very small or very large parts of their range.

I think the microbenchmarks likely under-sell the improvement, though. The current (unrolled) implementation relies on lots of branching, and benefits from a microbenchmark setup where it can hog the entire branch predictor. The new variants only branch on the "is small" and "is overflow" checks, and are otherwise straight-line code.

Reviewed By: vitaut

Differential Revision: D38510907

fbshipit-source-id: 451cbd2da6634fd826f9721444a2dd4141c2afac
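To make the pext idea concrete, here is a minimal sketch of how a varint of up to 8 bytes might be decoded without a per-byte branch. It is not the code from this diff: `decodeVarintSketch`, `pext64`, and `pextPortable` are hypothetical names, the sketch assumes a little-endian host, it only uses the pext path when at least 8 input bytes are readable, and it falls back to a plain loop for 9-10 byte encodings instead of implementing the overflow check or the SSE variant described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__BMI2__)
#include <immintrin.h> // _pext_u64 (GCC/Clang with -mbmi2)
#endif

// Portable stand-in for pext: gathers the bits of `src` selected by `mask`
// into the low bits of the result, preserving their order.
inline uint64_t pextPortable(uint64_t src, uint64_t mask) {
  uint64_t out = 0;
  for (uint64_t bit = 1; mask != 0; bit <<= 1) {
    uint64_t lowest = mask & (~mask + 1); // isolate lowest set mask bit
    if (src & lowest) {
      out |= bit;
    }
    mask &= mask - 1; // clear that mask bit
  }
  return out;
}

inline uint64_t pext64(uint64_t src, uint64_t mask) {
#if defined(__BMI2__)
  return _pext_u64(src, mask);
#else
  return pextPortable(src, mask);
#endif
}

// Hypothetical helper (not folly's API): decodes one unsigned varint from
// p[0..len). On success, stores the value and the number of bytes consumed.
inline bool decodeVarintSketch(
    const uint8_t* p, size_t len, uint64_t& value, size_t& consumed) {
  if (len == 0) {
    return false;
  }
  // "Is small" check: a single byte with no continuation bit.
  if (p[0] < 0x80) {
    value = p[0];
    consumed = 1;
    return true;
  }
  if (len >= 8) {
    uint64_t word;
    std::memcpy(&word, p, sizeof(word)); // assumes little-endian byte order
    // A clear high bit marks the final byte of the encoding.
    uint64_t stop = ~word & 0x8080808080808080ULL;
    if (stop != 0) {
      uint64_t lowest = stop & (~stop + 1);
      // All bits up to and including the final byte; wraps to all ones
      // when the final byte is byte 7.
      uint64_t keep = (lowest << 1) - 1;
      // Drop the continuation bits and pack the 7-bit groups together.
      value = pext64(word & keep, 0x7f7f7f7f7f7f7f7fULL);
      // Count the kept bytes: sum one marker bit per byte into the top byte.
      consumed = static_cast<size_t>(
          ((keep & 0x0101010101010101ULL) * 0x0101010101010101ULL) >> 56);
      return true;
    }
  }
  // Fallback loop for short buffers and 9-10 byte encodings.
  // Unlike the real change, this does not reject overflowing input.
  uint64_t result = 0;
  for (size_t i = 0; i < len && i < 10; ++i) {
    uint64_t b = p[i];
    result |= (b & 0x7f) << (7 * i);
    if (b < 0x80) {
      value = result;
      consumed = i + 1;
      return true;
    }
  }
  return false;
}
```

For example, the bytes `0xAC 0x02` followed by six padding bytes decode to 300 with two bytes consumed. After the initial "is small" test, the 8-byte path has no data-dependent branches beyond the check that a terminating byte exists, which is the property the commit message credits for the improvement outside of microbenchmarks.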
1 parent 77649c5, commit 110807c
Showing 3 changed files with 442 additions and 81 deletions.