Reduce runtime of Go Encode() implementation by 20% #566
This is a performance optimization change for Encode().
As-is, Encode() always performs a string concatenation (by "+"-ing strings together). This call is unnecessary because we know the structure of location codes ahead of time, and it is slow because string concatenation costs more than slicebytetostring.
To avoid the concatenation, I made the initially allocated byte slice represent the entire final code and immediately fill in the separator. Later, when we format the code, we no longer need to concatenate the Separator because it is already in place.
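As a rough illustration of the difference (not the code in this PR), the sketch below contrasts the two shapes. The function names `joinConcat` and `joinPrefilled`, the constants, and the sample code string are hypothetical; the only things taken from the description are the "+" separator and the 15 non-separator characters.

```go
package main

import "fmt"

const (
	sepPos  = 8  // index of the separator within a full-length code
	codeLen = 16 // 15 code characters plus the separator
)

// joinConcat mimics the old shape: build the halves separately,
// then glue them together with "+" string concatenation.
func joinConcat(head, tail []byte) string {
	return string(head) + "+" + string(tail)
}

// joinPrefilled mimics the new shape: one buffer sized for the whole
// code, with the separator written up front, converted to a string once.
// It assumes len(head) == 8 and len(tail) == 7 (a full-length code).
func joinPrefilled(head, tail []byte) string {
	var buf [codeLen]byte
	buf[sepPos] = '+'
	copy(buf[:sepPos], head)
	copy(buf[sepPos+1:], tail)
	return string(buf[:])
}

func main() {
	head := []byte("849VCWC8") // hypothetical sample digits
	tail := []byte("R9CCCCC")
	fmt.Println(joinConcat(head, tail) == joinPrefilled(head, tail))
}
```

Both functions return the same string; the second does a single byte-slice-to-string conversion instead of a concatenation.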
I introduce a special case for the location pair after the separator rather than computing those bytes in the loop; otherwise the loop would need an if-statement to skip the separator. As a bonus, computing this pair first lets us remove the two unnecessary divides that formerly happened at the end of the final loop iteration. (I am unsure whether this last point is noticeable; I could not detect a significant difference.)
I haven't bothered optimizing the case for codes that need padding before the separator (or that do not use the full 15 non-separator characters). Since the benchmark does not cover such cases, I assumed they are far less common.
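A minimal sketch of that loop structure, under simplifying assumptions: `encodePairs` is a hypothetical helper, it handles only the five latitude/longitude pairs (no grid refinement digits), and it takes `lat` and `lng` as non-negative integers already scaled to the finest pair resolution (each below 20^5). It is not the actual Encode() body.

```go
package main

import "fmt"

const (
	alphabet = "23456789CFGHJMPQRVWX" // Open Location Code digit set
	sepPos   = 8
)

// encodePairs writes five base-20 digit pairs into a buffer whose
// separator is already in place.
func encodePairs(lat, lng int) string {
	var buf [sepPos + 3]byte // 8 digits, separator, 2 digits
	buf[sepPos] = '+'

	// Special case: the pair after the separator, handled outside the
	// loop so the loop never has to check for the separator position.
	buf[sepPos+1] = alphabet[lat%20]
	buf[sepPos+2] = alphabet[lng%20]

	// Remaining four pairs, least significant first. Dividing at the
	// top of each iteration means no divide is left over after the
	// last pair has been written.
	for i := sepPos - 2; i >= 0; i -= 2 {
		lat /= 20
		lng /= 20
		buf[i] = alphabet[lat%20]
		buf[i+1] = alphabet[lng%20]
	}
	return string(buf[:])
}

func main() {
	fmt.Println(encodePairs(20, 0))
}
```

With the pair after the separator written first, the loop body is branch-free and steps through positions 6, 4, 2, and 0 only.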
Performance difference on my machine:
Before:
After:
I've attached cpu profiles of before/after as well.