map: reduce allocations for batch operations #1513

Closed · wants to merge 2 commits into from
Conversation

@florianl (Contributor) commented Jul 13, 2024

Reduce allocations for batch operations by:

  • skipping unmarshaling when the returned attr.Count is 0
  • allowing partial unmarshaling when attr.Count is less than the expected count

This should help to resolve #1080.
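The two bullet points above can be sketched as a small helper. This is an illustrative stand-in, not the library's actual code: the function name, the fixed uint32 element type, and the plain byte-slice input are all assumptions made for the example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// unmarshalBatch decodes up to count fixed-size elements from buf into out,
// mirroring the idea in this PR: if the kernel reports zero returned entries,
// skip unmarshaling entirely; if it reports fewer than requested, decode only
// the prefix that was actually filled.
func unmarshalBatch(buf []byte, count int, out []uint32) (int, error) {
	if count == 0 {
		// Early return: nothing was returned, so no decoding work
		// (and no allocations) is needed.
		return 0, nil
	}
	if count < len(out) {
		// Partial unmarshal: only the first count elements were filled.
		out = out[:count]
	}
	r := bytes.NewReader(buf)
	if err := binary.Read(r, binary.LittleEndian, out); err != nil {
		return 0, err
	}
	return len(out), nil
}

func main() {
	buf := []byte{1, 0, 0, 0, 2, 0, 0, 0} // two little-endian uint32 values
	out := make([]uint32, 4)              // caller asked for up to 4 elements
	n, err := unmarshalBatch(buf, 2, out) // kernel "returned" only 2
	fmt.Println(n, err, out[:n])          // prints: 2 <nil> [1 2]
}
```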

benchstat old.txt new.txt
name                                        old time/op    new time/op    delta
Iterate/Hash/MapIterator-16                   1.15ms ± 4%    1.10ms ± 9%   -4.27%  (p=0.035 n=10+10)
Iterate/Hash/MapIteratorDelete-16             1.71ms ± 3%    1.71ms ± 3%     ~     (p=0.971 n=10+10)
Iterate/Hash/BatchLookup-16                   2.63µs ± 8%    2.53µs ±10%     ~     (p=0.123 n=10+10)
Iterate/Hash/BatchLookupAndDelete-16          85.5µs ± 2%    86.7µs ± 2%   +1.38%  (p=0.011 n=10+10)
Iterate/Hash/BatchDelete-16                   77.7µs ± 3%    78.0µs ± 2%     ~     (p=0.579 n=10+10)
Iterate/PerCPUHash/MapIterator-16             3.12ms ±13%    3.22ms ± 5%     ~     (p=0.515 n=10+8)
Iterate/PerCPUHash/MapIteratorDelete-16       6.32ms ± 8%    6.14ms ±17%     ~     (p=0.579 n=10+10)
Iterate/PerCPUHash/BatchLookup-16             2.11ms ±19%    0.04ms ±19%  -97.92%  (p=0.000 n=10+10)
Iterate/PerCPUHash/BatchLookupAndDelete-16    3.14ms ± 7%    3.15ms ± 8%     ~     (p=0.796 n=10+10)
Iterate/PerCPUHash/BatchDelete-16              373µs ± 3%     382µs ± 4%   +2.37%  (p=0.028 n=9+10)

name                                        old alloc/op   new alloc/op   delta
Iterate/Hash/MapIterator-16                   24.1kB ± 0%    24.1kB ± 0%     ~     (p=0.101 n=10+10)
Iterate/Hash/MapIteratorDelete-16               107B ± 0%      107B ± 0%     ~     (all equal)
Iterate/Hash/BatchLookup-16                     136B ± 0%      136B ± 0%     ~     (all equal)
Iterate/Hash/BatchLookupAndDelete-16            144B ± 0%      144B ± 0%     ~     (all equal)
Iterate/Hash/BatchDelete-16                    24.0B ± 0%     24.0B ± 0%     ~     (all equal)
Iterate/PerCPUHash/MapIterator-16              152kB ± 0%     152kB ± 0%     ~     (p=0.959 n=10+10)
Iterate/PerCPUHash/MapIteratorDelete-16        281kB ± 0%     281kB ± 0%     ~     (p=0.055 n=10+9)
Iterate/PerCPUHash/BatchLookup-16              155kB ± 0%     131kB ± 0%  -15.45%  (p=0.000 n=10+10)
Iterate/PerCPUHash/BatchLookupAndDelete-16     156kB ± 0%     156kB ± 0%     ~     (p=0.734 n=9+10)
Iterate/PerCPUHash/BatchDelete-16              24.0B ± 0%     24.0B ± 0%     ~     (all equal)

name                                        old allocs/op  new allocs/op  delta
Iterate/Hash/MapIterator-16                    1.00k ± 0%     1.00k ± 0%     ~     (all equal)
Iterate/Hash/MapIteratorDelete-16               4.00 ± 0%      4.00 ± 0%     ~     (all equal)
Iterate/Hash/BatchLookup-16                     5.00 ± 0%      5.00 ± 0%     ~     (all equal)
Iterate/Hash/BatchLookupAndDelete-16            5.00 ± 0%      5.00 ± 0%     ~     (all equal)
Iterate/Hash/BatchDelete-16                     1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Iterate/PerCPUHash/MapIterator-16              2.00k ± 0%     2.00k ± 0%     ~     (all equal)
Iterate/PerCPUHash/MapIteratorDelete-16        3.00k ± 0%     3.00k ± 0%     ~     (all equal)
Iterate/PerCPUHash/BatchLookup-16              1.01k ± 0%     0.01k ± 0%  -99.40%  (p=0.000 n=10+10)
Iterate/PerCPUHash/BatchLookupAndDelete-16     1.01k ± 0%     1.01k ± 0%     ~     (all equal)
Iterate/PerCPUHash/BatchDelete-16               1.00 ± 0%      1.00 ± 0%     ~     (all equal)

Signed-off-by: Florian Lehner <dev@der-flo.net>
@florianl florianl requested a review from a team as a code owner July 13, 2024 16:00
@lmb lmb changed the title map: reduce allocations map: reduce allocations for batch operations Jul 15, 2024
@lmb (Collaborator) left a comment

That is an impressive speed up!

}

// UnmarshalWithLimit unmarshals the buffer up to limit into the provided value.
func (b Buffer) UnmarshalWithLimit(data any, limit int) error {
Collaborator:
Could we instead have an API like:

// Shrink reduces the size of the buffer.
//
// Panics if size is less than 0 or if it is larger than the actual size.
func (b Buffer) Shrink(size int)

Then we can just do Buffer.Shrink(count * size) unconditionally and don't need to check for count == 0 elsewhere.

Not exactly sure what to do for zero copy buffers? Nothing?
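A minimal sketch of the proposed Shrink semantics, assuming the buffer wraps a plain byte slice (the real sysenc.Buffer may also wrap zero-copy memory, which this sketch does not model). Note the sketch uses a pointer receiver so the shrink persists, whereas the signature quoted above uses a value receiver:

```go
package main

import "fmt"

// Buffer is an illustrative stand-in for sysenc.Buffer.
type Buffer struct {
	buf []byte
}

// Shrink reduces the visible size of the buffer, as proposed above.
// It panics if size is negative or larger than the current size.
func (b *Buffer) Shrink(size int) {
	if size < 0 || size > len(b.buf) {
		panic("sysenc: invalid Shrink size")
	}
	b.buf = b.buf[:size]
}

func main() {
	b := &Buffer{buf: make([]byte, 16)}
	// Called unconditionally after a batch syscall, e.g. count*size
	// with count=2 and size=4; count == 0 would simply shrink to empty.
	b.Shrink(2 * 4)
	fmt.Println(len(b.buf)) // prints: 8
}
```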

Contributor Author:

Buffer.Shrink() sounds like a permanent change to the buffer, and I'm not sure about the benefits. It could free some memory a bit earlier (depending on when the GC kicks in).
The early count == 0 check here doesn't look expensive to me. With Buffer.Shrink(0 * size) we would call sysenc.Unmarshal() with a buf of size 0; here I would prefer the early return and avoid special-case handling of len(buf) == 0 in sysenc.Unmarshal().

@florianl (Contributor Author) commented Aug 2, 2024

I reverted the proposed changes, implemented the (Buffer) Shrink(..) option as suggested, and compared it against current main (old.txt):

$ benchstat old.txt new.txt 
name                                        old time/op    new time/op    delta
Iterate/Hash/MapIterator-16                   1.15ms ± 4%    1.13ms ± 5%    ~     (p=0.190 n=9+9)
Iterate/Hash/MapIteratorDelete-16             1.73ms ± 2%    1.73ms ± 3%    ~     (p=0.796 n=10+10)
Iterate/Hash/BatchLookup-16                   2.62µs ±10%    2.64µs ±12%    ~     (p=0.853 n=10+10)
Iterate/Hash/BatchLookupAndDelete-16          85.1µs ± 2%    85.9µs ± 2%    ~     (p=0.247 n=10+10)
Iterate/Hash/BatchDelete-16                   77.3µs ± 3%    75.4µs ± 2%  -2.49%  (p=0.004 n=10+10)
Iterate/PerCPUHash/MapIterator-16             3.21ms ±12%    3.22ms ± 7%    ~     (p=0.912 n=10+10)
Iterate/PerCPUHash/MapIteratorDelete-16       6.18ms ±10%    6.23ms ±13%    ~     (p=0.780 n=9+10)
Iterate/PerCPUHash/BatchLookup-16             2.33ms ± 8%    2.17ms ±16%    ~     (p=0.243 n=9+10)
Iterate/PerCPUHash/BatchLookupAndDelete-16    3.18ms ±11%    3.06ms ± 5%    ~     (p=0.218 n=10+10)
Iterate/PerCPUHash/BatchDelete-16              368µs ± 3%     365µs ± 4%    ~     (p=0.529 n=10+10)

name                                        old alloc/op   new alloc/op   delta
Iterate/Hash/MapIterator-16                   24.1kB ± 0%    24.1kB ± 0%    ~     (p=0.306 n=10+10)
Iterate/Hash/MapIteratorDelete-16               107B ± 0%      107B ± 0%    ~     (all equal)
Iterate/Hash/BatchLookup-16                     136B ± 0%      136B ± 0%    ~     (all equal)
Iterate/Hash/BatchLookupAndDelete-16            144B ± 0%      144B ± 0%    ~     (all equal)
Iterate/Hash/BatchDelete-16                    24.0B ± 0%     24.0B ± 0%    ~     (all equal)
Iterate/PerCPUHash/MapIterator-16              152kB ± 0%     152kB ± 0%  -0.00%  (p=0.003 n=9+10)
Iterate/PerCPUHash/MapIteratorDelete-16        281kB ± 0%     281kB ± 0%    ~     (p=0.926 n=10+10)
Iterate/PerCPUHash/BatchLookup-16              155kB ± 0%     155kB ± 0%    ~     (p=0.468 n=10+10)
Iterate/PerCPUHash/BatchLookupAndDelete-16     156kB ± 0%     156kB ± 0%    ~     (p=0.134 n=10+10)
Iterate/PerCPUHash/BatchDelete-16              24.0B ± 0%     24.0B ± 0%    ~     (all equal)

name                                        old allocs/op  new allocs/op  delta
Iterate/Hash/MapIterator-16                    1.00k ± 0%     1.00k ± 0%    ~     (all equal)
Iterate/Hash/MapIteratorDelete-16               4.00 ± 0%      4.00 ± 0%    ~     (all equal)
Iterate/Hash/BatchLookup-16                     5.00 ± 0%      5.00 ± 0%    ~     (all equal)
Iterate/Hash/BatchLookupAndDelete-16            5.00 ± 0%      5.00 ± 0%    ~     (all equal)
Iterate/Hash/BatchDelete-16                     1.00 ± 0%      1.00 ± 0%    ~     (all equal)
Iterate/PerCPUHash/MapIterator-16              2.00k ± 0%     2.00k ± 0%    ~     (all equal)
Iterate/PerCPUHash/MapIteratorDelete-16        3.00k ± 0%     3.00k ± 0%    ~     (all equal)
Iterate/PerCPUHash/BatchLookup-16              1.01k ± 0%     1.01k ± 0%    ~     (all equal)
Iterate/PerCPUHash/BatchLookupAndDelete-16     1.01k ± 0%     1.01k ± 0%    ~     (all equal)
Iterate/PerCPUHash/BatchDelete-16               1.00 ± 0%      1.00 ± 0%    ~     (all equal)

From this I have learned that the Unmarshal() function called in (Buffer) Unmarshal() causes allocations even when there are 0 elements to unmarshal. Therefore, I suggest dropping the map: allow partial unmarshaling commit and just going with the early return via map: skip unmarshal if attr.Count == 0. WDYT @lmb ?

Collaborator:

I guess that is some weird interaction between resizing the buffer and

if dataBuf := unsafeBackingMemory(data); len(dataBuf) == len(buf) {
?

Is the special case for count == 0 actually important in practice? I'd imagine we only hit it at the end of a batch iteration?

Contributor Author:

Yeah - count == 0 looks like a special case that might be hit at the end of batch operations, depending on the batch size and the number of elements in the map.
For the moment, I think it is best to close this.

@florianl florianl closed this Aug 7, 2024
Successfully merging this pull request may close these issues.

map: don't unmarshal unnecessary items on batch lookup