This repository has been archived by the owner on Nov 15, 2023. It is now read-only.
Speed up timestamp generation when logging #9933
Merged
This PR speeds up timestamp generation when logging: instead of regenerating the timestamp from scratch on every log line, we now cache part of it in thread-local storage (TLS).
Here's the performance of the original implementation as measured on my machine:
And here's the performance of the new one:
So the overhead was cut down to less than 10%.
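For illustration, here's a minimal sketch of the caching scheme, not the actual code from this PR: each thread keeps the last-seen second and its formatted prefix in a `thread_local!`, and only the cheap sub-second part is formatted on every call. The names (`CACHE`, `timestamp`) and the plain Unix-seconds prefix are assumptions for the sketch; a real logger would format a human-readable date for the cached part.

```rust
use std::cell::RefCell;
use std::time::{SystemTime, UNIX_EPOCH};

thread_local! {
    // Per-thread cache: (last seen second, formatted prefix for it).
    // u64::MAX guarantees a miss on the first call.
    static CACHE: RefCell<(u64, String)> = RefCell::new((u64::MAX, String::new()));
}

/// Returns a timestamp like "1700000000.123456". The seconds prefix is
/// regenerated at most once per second per thread; only the microseconds
/// suffix is formatted on every call.
fn timestamp() -> String {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before Unix epoch");
    let secs = now.as_secs();
    CACHE.with(|cache| {
        let mut cache = cache.borrow_mut();
        if cache.0 != secs {
            // Cache miss: the second changed, so redo the expensive part.
            // (Here it's just an integer; a real logger would format a
            // full date string, which is the costly step being cached.)
            cache.0 = secs;
            cache.1 = secs.to_string();
        }
        format!("{}.{:06}", cache.1, now.subsec_micros())
    })
}
```

Since the cache is thread-local, no locking is needed, and each logging thread pays the full formatting cost at most once per second.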
Seems like this was pretty cheap in the first place; does this actually matter?
This will not help in the normal case where we're not logging much, but whenever someone turns on a spammy trace log (which does happen on our staking ops' nodes) the cost of logging starts to balloon. Yes, we shouldn't be logging this much; but when we do, it shouldn't bog down the CPU as much as it currently does.
Even though a single timestamp is very cheap to generate, at this scale we're suffering death by a thousand cuts: we do it so often that, added together, logging can take a whopping 30% of the total CPU time. This is one of the easiest and most straightforward improvements we can make: timestamp generation accounted for ~6.6% of the total CPU used in this particular profiling run, which should now be cut down to less than ~1%.