types: intern AccAddress.String() to gut wasted expensive recomputations #8694
Given that AccAddress is a []byte type, and its .String() method is quite expensive, continuously invoking the method doesn't easily provide a way for the result to be memoized. In memory profiles from benchmarking OneBankSendTxPerBlock and run for a while, it showed >2GiB burnt and SendCoins consuming a bunch of memory: >2GiB.

This change introduces a simple global cache with a map to intern AccAddress values, so that we quickly look up the expensively computed values. With it, the prior memory profiles are erased, and benchmarks show improvements:

```shell
$ benchstat before.txt after.txt
name                     old time/op    new time/op    delta
OneBankSendTxPerBlock-8    1.94ms ± 9%    1.92ms ±11%   -1.34%  (p=0.011 n=58+57)

name                     old alloc/op   new alloc/op   delta
OneBankSendTxPerBlock-8     428kB ± 0%     365kB ± 0%  -14.67%  (p=0.000 n=58+54)

name                     old allocs/op  new allocs/op  delta
OneBankSendTxPerBlock-8     6.34k ± 0%     5.90k ± 0%   -7.06%  (p=0.000 n=58+57)
```

Fixes #8693
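For readers skimming the thread, the interning idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Address` type, `expensiveEncode`, `addrMu`, and `addrCache` are hypothetical names, and hex encoding stands in for the real, more expensive encoding only to keep the example dependency-free.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"sync"
)

// Address is a hypothetical stand-in for AccAddress: a []byte type whose
// String() result is costly to compute.
type Address []byte

var (
	addrMu    sync.Mutex                // guards addrCache
	addrCache = make(map[string]string) // raw bytes (as string key) -> encoded form
)

// expensiveEncode is a placeholder for the real encoding (assumption: hex is
// used here purely for illustration).
func expensiveEncode(a Address) string {
	return hex.EncodeToString(a)
}

// String returns the cached encoding when present, otherwise computes it
// once and stores it for later calls.
func (a Address) String() string {
	if len(a) == 0 {
		return ""
	}
	addrMu.Lock()
	defer addrMu.Unlock()
	// The Go compiler avoids allocating for string(a) when it is used
	// directly as a map lookup key.
	if s, ok := addrCache[string(a)]; ok {
		return s
	}
	s := expensiveEncode(a)
	addrCache[string(a)] = s // this insert does copy the key bytes
	return s
}

func main() {
	a := Address{0xde, 0xad, 0xbe, 0xef}
	fmt.Println(a.String(), a.String()) // second call is served from the cache
}
```

Note that in this sketch the map only grows; nothing is ever removed from it.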
Codecov Report
@@ Coverage Diff @@
## master #8694 +/- ##
=======================================
Coverage 61.37% 61.37%
=======================================
Files 670 670
Lines 38279 38286 +7
=======================================
+ Hits 23492 23499 +7
Misses 12329 12329
Partials 2458 2458
|
It caches that result so that we’ll never have to even invoke .Empty().
Think of this change as basically running the code as it was before, but
this time under a cache to avoid recomputations.
…On Wed, Feb 24, 2021 at 4:24 PM Alessio Treglia ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In types/address.go
<#8694 (comment)>:
> // String implements the Stringer interface.
-func (aa AccAddress) String() string {
+func (aa AccAddress) String() (str string) {
+ addMu.Lock()
+ defer addMu.Unlock()
+
+ if str, ok := addrStrMemoize[string(aa)]; ok {
+ return str
+ }
+
+ // Otherwise cache it for later memoization.
+ defer func() {
+ addrStrMemoize[string(aa)] = str
+ }()
+
Shouldn't the following block come always first:
if aa.Empty() {
return ""
}
|
// because the Go compiler recognizes the special case of map[string([]byte)] when used exactly
// in that pattern. See https://github.com/cosmos/cosmos-sdk/issues/8693.
var addMu sync.Mutex
var addrStrMemoize = make(map[string]string)
Should there be some sort of eviction of the memoized addresses
That can come later on, for now this suffices. The eviction mechanism would
be perhaps LRU based with a memory bound. Progressively we can get to the
final solution if need be :-) I plan on running nodes under live traffic
with continuous profiling so we’ll figure out what needs to be fixed.
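As a rough illustration of what an LRU-based eviction mechanism could look like, here is a minimal sketch. It is not part of this PR: `lruCache`, `newLRUCache`, and `entry` are hypothetical names, and the bound is a maximum entry count used as a simple proxy for a memory bound.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// entry is the value stored in each list element.
type entry struct {
	key, val string
}

// lruCache is a hypothetical bounded cache: once more than max entries are
// stored, the least recently used entry is evicted.
type lruCache struct {
	mu    sync.Mutex
	max   int
	order *list.List               // front = most recently used
	items map[string]*list.Element // key -> element in order
}

func newLRUCache(max int) *lruCache {
	return &lruCache{max: max, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the cached value and marks the entry as recently used.
func (c *lruCache) Get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).val, true
}

// Put stores a value and evicts the least recently used entry if the cache
// has grown past its bound.
func (c *lruCache) Put(key, val string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).val = val
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, val: val})
	if c.order.Len() > c.max {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

// Len reports the number of cached entries.
func (c *lruCache) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.order.Len()
}

func main() {
	c := newLRUCache(2)
	c.Put("a", "1")
	c.Put("b", "2")
	c.Get("a")      // "a" becomes most recently used
	c.Put("c", "3") // evicts "b", the least recently used entry
	_, ok := c.Get("b")
	fmt.Println(ok, c.Len())
}
```

A byte-based bound would need to track key and value sizes instead of the entry count; the structure would otherwise stay the same.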
…On Thu, Feb 25, 2021 at 1:16 AM Marko ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In types/address.go
<#8694 (comment)>:
> @@ -236,8 +237,28 @@ func (aa AccAddress) Bytes() []byte {
return aa
}
+// AccAddress.String() is expensive and if unoptimized dominantly showed up in profiles,
+// yet has no mechanisms to trivially cache the result given that AccAddress is a []byte type.
+// With the string interning below, we are able to achieve zero cost allocations for string->[]byte
+// because the Go compiler recognizes the special case of map[string([]byte)] when used exactly
+// in that pattern. See #8693.
+var addMu sync.Mutex
+var addrStrMemoize = make(map[string]string)
Should there be some sort of eviction of the memoized addresses
|
I would like to second @marbar3778's comment: introducing a global cache with a mutex may lead to memory problems. Also, as a small optimization, we could take the mutex only when we set a value (here each entry is updated at most once).
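One way to read the locking suggestion above is to serve lookups under a shared read lock and take the exclusive lock only when inserting. The sketch below assumes that reading; `cachedString`, `encode`, `cacheMu`, and `cache` are hypothetical names, and hex encoding is only a placeholder for the expensive conversion. A plain Go map still needs some synchronization on reads, which is why a `sync.RWMutex` is used rather than dropping the lock entirely.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"sync"
)

var (
	cacheMu sync.RWMutex              // read lock for lookups, write lock for inserts
	cache   = make(map[string]string) // raw bytes (as string key) -> encoded form
)

// encode is a placeholder for the expensive conversion (assumption: hex is
// used only for illustration).
func encode(b []byte) string {
	return hex.EncodeToString(b)
}

// cachedString looks up the encoding under a read lock and takes the write
// lock only on a cache miss.
func cachedString(b []byte) string {
	cacheMu.RLock()
	s, ok := cache[string(b)]
	cacheMu.RUnlock()
	if ok {
		return s
	}

	// Computed outside any lock: concurrent misses for the same key may
	// duplicate this work, but they all store the same value.
	s = encode(b)

	cacheMu.Lock()
	cache[string(b)] = s
	cacheMu.Unlock()
	return s
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = cachedString([]byte{1, 2, 3})
		}()
	}
	wg.Wait()
	fmt.Println(cachedString([]byte{1, 2, 3}))
}
```

Whether this beats a single `sync.Mutex` depends on contention and hit rate, so it would need to be benchmarked before being adopted.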
Robert, I think a fully fledged mechanism without progressive
benchmarking is a premature approach that’ll burn time and resources. We
start by eliminating big problems, keep profiling away the top issues, and
with live traffic keep isolating. I mentioned that I am progressively working
on more changes, but this suffices for now.
…On Thu, Feb 25, 2021 at 3:32 AM Robert Zaremba ***@***.***> wrote:
I would like to second on @marbar3778 <https://github.com/marbar3778>
comment - introducing a global cache with mutex may lead to memory problems.
I don't think this should be committed without an eviction mechanism. Here,
the memory can grow substantially.
Thoughts?
cc: @AmauryM <https://github.com/amaurym> , @aaronc
<https://github.com/aaronc>
Also, as a small optimization, we could take the mutex only when we set a
value (here each entry is updated at most once).
|
Totally agree with an iterative approach. However, I'm not convinced by this change: we are doing a microbenchmark without looking at the bigger impact it can cause (unmanageable memory growth).
To be more specific: IMHO micro-optimization of |
I agree with @robert-zaremba here. Before making this change, I'd like to see some profiling on how this map grows on e.g. Gaia for a couple of days. If it grows in an unexpected manner, I prefer to revert this change, have slightly sub-optimized code, but not introducing memory leaks on live chains. |
I am definitely fine with a revert, but it is incorrect to call this a “micro
optimization”: this is clearly a profile-guided change to pop off
something that was consuming RAM. This isn’t about CPU time; it is about
bringing RAM usage down from >2GiB to a level where it disappears from the
profile, and that’s what profile-guided development entails. So again, this
isn’t a CPU micro-optimization but a RAM consumption problem, as I showed in the issue.
…On Thu, Feb 25, 2021 at 10:27 AM Amaury ***@***.***> wrote:
I agree with @robert-zaremba <https://github.com/robert-zaremba> here.
Before making this change, I'd like to see some profiling on how this map
grows on e.g. Gaia for a couple of days. If it grows in an unexpected
manner, I prefer to revert this change, have slightly sub-optimized code,
but not introducing memory leaks on live chains.
|