go/mysql: performance optimizations in protocol encoding
This employs a couple of tricks that, combined, proved fruitful:

* Swapping to binary.LittleEndian.Put* on the basic calls gets us a free boost while removing code. The main win from this swap is the slice boundary check, resulting in a massive boost. In `writeLenEncInt` I kept the writes inlined and added my own boundary checking, since swapping it out there resulted in a very minor performance regression from the current results. I assume this comes from the extra coercion needed to the uint* type, plus another reslice.
* Reslicing the byte slice early so all future operations work on 0-based indexes rather than pos+ indexing. This seemed to be a pretty sizeable win: the additions needed on every later operation to determine the index get swapped out for constants.
* The read path employs the same early reslicing, but already has explicit bounds checks.
* Rewrite and specialize writeZeroes for the known constants in the MySQL protocol, and add a more generic algorithm that works in chunks of 4 bytes.

One interesting observation from `writeZeroes`: the specialized versions get highly optimized, I assume because no branching is necessary at all. The inlined zerofill can be highly optimized by the compiler.
See: https://godbolt.org/z/E68heoddc

```
$ benchstat {old,new}.txt
goos: darwin
goarch: arm64
pkg: vitess.io/vitess/go/mysql
                                  │   old.txt    │               new.txt               │
                                  │    sec/op    │    sec/op     vs base               │
EncWriteInt/16-bit-10               0.4685n ± 0%   0.3604n ± 0%  -23.07% (p=0.000 n=10)
EncWriteInt/16-bit-lenencoded-10     2.049n ± 0%    2.096n ± 0%   +2.32% (p=0.000 n=10)
EncWriteInt/24-bit-lenencoded-10     1.987n ± 0%    2.099n ± 0%   +5.66% (p=0.000 n=10)
EncWriteInt/32-bit-10               0.7819n ± 0%   0.3994n ± 3%  -48.91% (p=0.000 n=10)
EncWriteInt/64-bit-10               1.4080n ± 0%   0.5075n ± 1%  -63.95% (p=0.000 n=10)
EncWriteInt/64-bit-lenencoded-10     3.126n ± 0%    2.219n ± 1%  -29.03% (p=0.000 n=10)
EncWriteZeroes/4-bytes-10           2.5030n ± 0%   0.5842n ± 2%  -76.66% (p=0.000 n=10)
EncWriteZeroes/10-bytes-10          4.3815n ± 0%   0.6735n ± 1%  -84.63% (p=0.000 n=10)
EncWriteZeroes/23-bytes-10           8.458n ± 0%    2.157n ± 6%  -74.50% (p=0.000 n=10)
EncWriteZeroes/55-bytes-10           20.88n ± 10%   12.31n ± 1%  -41.03% (p=0.000 n=10)
EncReadInt/16-bit-10                 2.050n ± 0%    2.182n ± 1%   +6.44% (p=0.000 n=10)
EncReadInt/24-bit-10                 2.034n ± 0%    2.066n ± 5%   +1.55% (p=0.000 n=10)
EncReadInt/64-bit-10                 2.819n ± 1%    2.194n ± 0%  -22.16%
geomean                              2.500n         1.392n       -44.33%
```
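For readers without the diff at hand, the techniques described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: the function signatures, the choice of 4 and 10 as specialized zero-fill sizes, and the `_ = data[n]` bounds-check hints are assumptions made for the example. Only `binary.LittleEndian.PutUint16/32/64` are real standard-library calls.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// writeUint32 (hypothetical signature) delegates to the standard library,
// which performs one bounds check up front and then writes all four bytes.
func writeUint32(data []byte, pos int, value uint32) int {
	binary.LittleEndian.PutUint32(data[pos:], value)
	return pos + 4
}

// writeLenEncInt (hypothetical signature) writes a MySQL length-encoded
// integer. The slice is resliced once so every later index is a constant,
// and each branch does a single explicit bounds check before writing.
func writeLenEncInt(data []byte, pos int, i uint64) int {
	data = data[pos:] // early reslice: indexes below are 0-based constants
	switch {
	case i < 251:
		data[0] = byte(i)
		return pos + 1
	case i < 1<<16:
		_ = data[2] // one bounds check covers the three writes below
		data[0] = 0xfc
		data[1] = byte(i)
		data[2] = byte(i >> 8)
		return pos + 3
	case i < 1<<24:
		_ = data[3]
		data[0] = 0xfd
		data[1] = byte(i)
		data[2] = byte(i >> 8)
		data[3] = byte(i >> 16)
		return pos + 4
	default:
		_ = data[8]
		data[0] = 0xfe
		binary.LittleEndian.PutUint64(data[1:], i)
		return pos + 9
	}
}

// writeZeroes (hypothetical signature) zero-fills n bytes starting at pos.
// The sizes 4 and 10 are assumed here as examples of "known constants";
// they compile to branch-free stores. Other sizes fall back to a loop that
// clears 4 bytes at a time and then the remaining tail.
func writeZeroes(data []byte, pos int, n int) int {
	data = data[pos : pos+n]
	switch n {
	case 4:
		binary.LittleEndian.PutUint32(data, 0)
	case 10:
		binary.LittleEndian.PutUint64(data, 0)
		binary.LittleEndian.PutUint16(data[8:], 0)
	default:
		for len(data) >= 4 {
			binary.LittleEndian.PutUint32(data, 0)
			data = data[4:]
		}
		for i := range data {
			data[i] = 0
		}
	}
	return pos + n
}

func main() {
	buf := make([]byte, 32)
	for i := range buf {
		buf[i] = 0xff
	}
	pos := writeLenEncInt(buf, 0, 300)
	pos = writeUint32(buf, pos, 0xdeadbeef)
	pos = writeZeroes(buf, pos, 10)
	fmt.Printf("wrote %d bytes: %x\n", pos, buf[:pos])
}
```

Whether the compiler actually eliminates the per-write bounds checks depends on the Go version and architecture, so any variant along these lines should be confirmed with a benchmark and benchstat, as was done for the numbers above.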