Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

types: intern AccAddress.String() to gut wasted expensive recomputations #8694

Merged

Conversation

odeke-em
Copy link
Collaborator

Given that AccAddress is a []byte type, and its .String() method is
quite expensive, continuously invoking the method doesn't easily provide
a way for the result to be memoized. In memory profiles from
benchmarking OneBankSendTxPerBlock and run for a while, it showed >2GiB burnt
and SendCoins consuming a bunch of memory: >2GiB.

This change introduces a simple global cache with a map to intern
AccAddress values, so that we quickly look up the expensively computed
values. With it, the prior memory profiles are erased, and benchmarks
show improvements:

$ benchstat before.txt after.txt
name                     old time/op    new time/op    delta
OneBankSendTxPerBlock-8    1.94ms ± 9%    1.92ms ±11%   -1.34%  (p=0.011 n=58+57)

name                     old alloc/op   new alloc/op   delta
OneBankSendTxPerBlock-8     428kB ± 0%     365kB ± 0%  -14.67%  (p=0.000 n=58+54)

name                     old allocs/op  new allocs/op  delta
OneBankSendTxPerBlock-8     6.34k ± 0%     5.90k ± 0%   -7.06%  (p=0.000 n=58+57)

Fixes #8693


Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (see CONTRIBUTING.md)
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the module structure standards.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/)
  • Added relevant godoc comments.
  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer
  • Review Codecov Report in the comment section below once CI passes

Given that AccAddress is a []byte type, and its .String() method is
quite expensive, continuously invoking the method doesn't easily provide
a way for the result to be memoized. In memory profiles from
benchmarking OneBankSendTxPerBlock and run for a while, it showed >2GiB burnt
and SendCoins consuming a bunch of memory: >2GiB.

This change introduces a simple global cache with a map to intern
AccAddress values, so that we quickly look up the expensively computed
values. With it, the prior memory profiles are erased, and benchmarks
show improvements:

```shell
$ benchstat before.txt after.txt
name                     old time/op    new time/op    delta
OneBankSendTxPerBlock-8    1.94ms ± 9%    1.92ms ±11%   -1.34%  (p=0.011 n=58+57)

name                     old alloc/op   new alloc/op   delta
OneBankSendTxPerBlock-8     428kB ± 0%     365kB ± 0%  -14.67%  (p=0.000 n=58+54)

name                     old allocs/op  new allocs/op  delta
OneBankSendTxPerBlock-8     6.34k ± 0%     5.90k ± 0%   -7.06%  (p=0.000 n=58+57)
```

Fixes #8693
@codecov
Copy link

codecov bot commented Feb 25, 2021

Codecov Report

Merging #8694 (57a217e) into master (ba74a7c) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #8694   +/-   ##
=======================================
  Coverage   61.37%   61.37%           
=======================================
  Files         670      670           
  Lines       38279    38286    +7     
=======================================
+ Hits        23492    23499    +7     
  Misses      12329    12329           
  Partials     2458     2458           
Impacted Files Coverage Δ
types/address.go 66.92% <100.00%> (+0.93%) ⬆️

@odeke-em
Copy link
Collaborator Author

odeke-em commented Feb 25, 2021 via email

// because the Go compiler recognizes the special case of map[string([]byte)] when used exactly
// in that pattern. See https://github.com/cosmos/cosmos-sdk/issues/8693.
var addMu sync.Mutex
var addrStrMemoize = make(map[string]string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be some sort of eviction of the memoized addresses

@odeke-em
Copy link
Collaborator Author

odeke-em commented Feb 25, 2021 via email

@odeke-em odeke-em merged commit 929b62c into master Feb 25, 2021
@odeke-em odeke-em deleted the types-intern-AccAddress.String-to-fix-wasted-computations branch February 25, 2021 11:01
@robert-zaremba
Copy link
Collaborator

I would like to second on @marbar3778 comment - introducing a global cache with mutex may lead to memory problems.
I don't think this should be committed without eviction mechanism. Here, the memory can growth substantially.
Thoughts?
cc: @AmauryM , @aaronc

Also, for small optimization - we could move mutex is only when we set a value. (here the entry is updated at most once).

@odeke-em
Copy link
Collaborator Author

odeke-em commented Feb 25, 2021 via email

@robert-zaremba
Copy link
Collaborator

robert-zaremba commented Feb 25, 2021

Totally agree for iterative approach. However I'm not convinced that this is change - we are doing a microbenchmark without looking at the bigger impact this can cause (not managable memory growth).

@robert-zaremba
Copy link
Collaborator

To be more specific: IMHO micro-optimization of 0.02 ms without more data (both how it impacts the overall performance and storage growth) is not worth taking a risk of potential memory leak.

@amaury1093
Copy link
Contributor

I agree with @robert-zaremba here. Before making this change, I'd like to see some profiling on how this map grows on e.g. Gaia for a couple of days. If it grows in an unexpected manner, I prefer to revert this change, have slightly sub-optimized code, but not introducing memory leaks on live chains.

@odeke-em
Copy link
Collaborator Author

odeke-em commented Feb 25, 2021 via email

@robert-zaremba robert-zaremba mentioned this pull request Feb 26, 2021
9 tasks
@robert-zaremba
Copy link
Collaborator

robert-zaremba commented Feb 26, 2021

it clearly is RAM taking down from >2GiB down to a lower value that disappear

RAM optimization is definitely interesting. @odeke-em , how did you benchmark this? In #8717 I changed map to a lru structure and would be interesting to repeat some flow to find a good cache size parameter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants