-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve MurmurHash string hash memory footprint #5028
Improve MurmurHash string hash memory footprint #5028
Conversation
@@ -202,20 +202,20 @@ public static int StringHash(string s) | |||
{ | |||
unchecked | |||
{ | |||
var sChar = s.ToCharArray(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the biggest offending code. MurmurHash is called once for every DataEnvelope that are going to be sent as a gossip. For a system that are trying to gossip thousands of gossips, this can add up quite significantly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before
Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|
Murmur_string_hash | 60.87 ns | 1.240 ns | 2.389 ns | 0.0229 | - | - | 96 B |
Jenkins_string_hash | 221.14 ns | 0.767 ns | 0.680 ns | 0.0153 | - | - | 64 B |
Murmur_binary_hash | 166.72 ns | 2.177 ns | 2.036 ns | - | - | - | - |
Jenkins_binary_hash | 370.40 ns | 0.709 ns | 0.554 ns | - | - | - | - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After
BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19041.985 (2004/May2020Update/20H1)
AMD Ryzen 7 1700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=5.0.203
[Host] : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT
DefaultJob : .NET Core 3.1.15 (CoreCLR 4.700.21.21202, CoreFX 4.700.21.21402), X64 RyuJIT
Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|
Murmur_string_hash | 40.39 ns | 0.082 ns | 0.077 ns | - | - | - | - |
Jenkins_string_hash | 218.54 ns | 0.467 ns | 0.390 ns | 0.0153 | - | - | 64 B |
Murmur_binary_hash | 166.06 ns | 2.224 ns | 2.080 ns | - | - | - | - |
Jenkins_binary_hash | 372.00 ns | 2.132 ns | 1.664 ns | - | - | - | - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is definitely a big improvement.
@@ -336,7 +378,7 @@ public MultiVersionVector(ImmutableDictionary<UniqueAddress, long> nodeVersions) | |||
|
|||
public override bool IsEmpty => Versions.IsEmpty; | |||
public override int Count => Versions.Count; | |||
public override IEnumerator<KeyValuePair<UniqueAddress, long>> VersionEnumerator => Versions.GetEnumerator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the second memory offender. Calling ImmutableDictionary GetEnumerator seemed to create a shallow copy of some of its internal data structure, consuming memory everytime it is called.
Fix by converting this to a Tuple IEnumerator and our own much simpler IEnumerator implementation.
There is a third memory offender, the serialization byte array allocation, but I have yet found a good way to fix this. |
@Arkatufus can you do a before and after snapshot on this? |
Would a dotMemory screenshot be enough? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about the breaking API change trade-offs?
@@ -40,7 +40,7 @@ public int Compare(Address x, Address y) | |||
if (result != 0) return result; | |||
result = string.CompareOrdinal(x.System, y.System); | |||
if (result != 0) return result; | |||
result = string.CompareOrdinal(x.Host ?? "", y.Host ?? ""); | |||
result = string.CompareOrdinal(x.Host ?? string.Empty, y.Host ?? string.Empty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -257,7 +257,7 @@ public SingleVersionVector(UniqueAddress node, long version) | |||
|
|||
public override bool IsEmpty => false; | |||
public override int Count => 1; | |||
public override IEnumerator<KeyValuePair<UniqueAddress, long>> VersionEnumerator => new Enumerator(Node, Version); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a breaking API change or is this a subtype of an internal
type above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, both of these are breaking changes.
I thought about doing this in #4990 - but I decided to avoid the breaking API change by exposing an internal API that should be used for this.
I don't think the risks are that high of this breaking change having a significant negative impact as DData is either primarily used directly or inside of cluster sharding and these APIs aren't called directly... but it's a judgement call.
Well, this is highly annoying. I can't reproduce the bug using the original reproduction code anymore, memory consuption stays between 100-200 Mb no matter what I do. |
Is the frequent gossip still occurring?
…Sent from my iPhone
On May 21, 2021, at 6:34 PM, Gregorius Soedharmo ***@***.***> wrote:
Well, this is highly annoying. I can't reproduce the bug using the original reproduction code anymore, memory consuption stays between 100-200 Mb no matter what I do.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
Yes, its still occuring, the console is scrolling endlessly, but the memory total didn't go above 200 Mb |
@Arkatufus let's get this PR cleaned up for merge so we can benefit from the improved memory footprint. We can still work on tracking down the gossip issues in another PR. |
# Conflicts: # src/contrib/cluster/Akka.DistributedData/VersionVector.cs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Closes #5023
Edit by @Aaronontheweb : Removed the reference to #5022 since that has other issues still affecting it