Extend Dictionary concurrent access detection to Remove() #18524
If I wanted to test this, I would use reflection in the test, e.g. to add a loop in entries. |
I could try to add a test case for CI. What do you think? |
@dotnet-bot test this please |
@MarcoRossignoli a test would be great. It is a bit hacky to use reflection, but we don't want to regress this as it's such an impactful issue. Would you mind adding the same test for the existing cases? It should be trivial once you have done one. |
Sure! |
@danmosemsft do you want to wait for the tests before merging this? |
I think if you're able to write them now, there is no reason to merge this without them. |
Ok! Working on it! |
Can I ask why the change from _entries to the local entries? I was looking at implementing RemoveAll and found both direct and local access in dictionary and wasn't sure which was preferable. Clearly local is better in this case but why? |
@Wraith2 I understand that it produces better code gen with the current JIT. We only do tricks like that where it's really hot, as it's not a good idea to code around the JIT in general. @AndyAyersMS is this correct, and do you expect the JIT to be able to do this itself in future? |
@MarcoRossignoli understood, and it looks like the fix I'd put into my working version. I just wondered at the choice of change since I'm trying to feel my own way around what I should and shouldn't change. TryInsert for example mixes a local for entries with direct access to buckets while FindEntry has locals for both, I wouldn't want to introduce performance degradations in something as central as Dictionary by not understanding the choice if there was one. It might be that it matters depending on the number of accesses or where they are etc. This seemed like a good place to ask. |
I agree... |
In code like this, where access to a class member like
... this.entries ...
... code that makes calls ...
... this.entries ... (probably unmodified from above)
the jit is going to assume that the this [...]
So generally speaking I would expect caching a frequently accessed member in a local would be a good idea perf wise, though it is very important to measure and be sure. Sometimes the jit has a lot of competition for things that can go in registers and adding one more "long lifetime" resource to the mix can have unexpected consequences.
Mixing locally cached and direct member access in one method is confusing (though sometimes necessary if the member value can indeed change). But I'd avoid mixing usage unless it's really needed for correctness, and if it really is necessary, make sure to comment on this in detail. Unfortunately there's no easy way to catch if you mix usages by accident other than careful review of the source and perhaps the generated assembly. |
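To make the pattern under discussion concrete, here is a minimal sketch of the two access styles. The type `FieldCachingSketch`, its members, and the `Transform` helper are hypothetical names invented for illustration; this is not the Dictionary source, and whether the cached form is actually faster depends on the JIT and should be measured.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type for illustration only; not the real Dictionary code.
public sealed class FieldCachingSketch
{
    private int[] _entries;

    public FieldCachingSketch(int[] entries) => _entries = entries;

    // Direct member access: every iteration reads this._entries, and the
    // loop body makes a call the JIT cannot see through (NoInlining), so
    // the field may have to be re-read after each call.
    public int SumDirect()
    {
        int sum = 0;
        for (int i = 0; i < _entries.Length; i++)
        {
            sum += Transform(_entries[i]);
        }
        return sum;
    }

    // Cached in a local: the field is read once. A local cannot be changed
    // by the callee, so the array reference can stay in a register.
    public int SumCached()
    {
        int[] entries = _entries;
        int sum = 0;
        for (int i = 0; i < entries.Length; i++)
        {
            sum += Transform(entries[i]);
        }
        return sum;
    }

    // Stands in for "code that makes calls"; kept out of line on purpose.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static int Transform(int value) => value * 2;
}
```

Both methods return the same result as long as nothing replaces `_entries` during the loop. If the field can change mid-method, the cached version would keep working on the old array, which is why mixing the two styles deserves a comment.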
@AndyAyersMS thanks a lot for the explanation! So in this case caching the values could improve perf, because the Remove methods do not replace the entries array; when they do change something, they only set some fields on a particular entry and return (we found the entry to "remove"). If I understood correctly, generally speaking CoreCLR will try to optimize "read-only loop code"... like |
For some popular combinations of types, the code for the methods will be prejitted ahead of time via crossgen. For other types, the code will be jitted the first time the method is called. To view the code the jit generates you can do a number of things:
|
@MarcoRossignoli it would be interesting to run the perf tests for dictionary before and after your change. They might also suggest the effect of extracting locals in this way. They are here -- I see only one Remove test, https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/performance-tests.md explains running corefx perf tests - you already know how to patch CoreFX with CoreCLR - this time it would of course be release built bits. |
@danmosemsft if I compile with my local CLR bits (updated, cleaned repos) I get a compile error:
all errors are on |
This happens as breaking changes propagate through the system. This one will fix itself once dotnet/corefx#30497 is merged. You can ignore this. To avoid this sort of build break, you would need to sync your coreclr repo back to the build that corefx is using: find the CoreCLR package version in corefx/dependencies.props and then find the hash that the package was built from at https://dotnet.myget.org/feed/dotnet-core/package/nuget/runtime.osx-x64.Microsoft.NETCore.Runtime.CoreCLR. We plan to have a better system for this. |
Ok, I'll wait for the merge.
Understood. It means I need to wait to "test my local updates", because on myget I can download only the compiled package, not the sources, right? Or is there a way? My props file (working on Windows):
As I said, I can wait, but it would be great to be able to "work" in this case too (until there is a better way). I don't want to waste precious spare time 😃. EDIT: Thanks a lot! |
Here is what the exact steps would be in this case:
|
Great! Thanks a lot! |
Test code:
[Benchmark]
[InlineData(20000)]
[InlineData(100000)]
[InlineData(1000000)]
public static void Remove_ValueType(long size)
{
    Dictionary<long?, long?> collection = new Dictionary<long?, long?>();
    long?[] items;
    items = new long?[size * 10];
    for (long i = 0; i < size * 10; ++i)
    {
        items[i] = i;
        collection.Add(items[i], items[i]);
    }
    foreach (var iteration in Benchmark.Iterations)
        using (iteration.StartMeasurement())
            for (long i = 1; i < size; ++i)
                collection.Remove(items[i]);
}

[Benchmark]
[InlineData(20000)]
[InlineData(100000)]
[InlineData(1000000)]
public static void Remove_ValueType_Out(long size)
{
    Dictionary<long?, long?> collection = new Dictionary<long?, long?>();
    long?[] items;
    items = new long?[size * 10];
    for (long i = 0; i < size * 10; ++i)
    {
        items[i] = i;
        collection.Add(items[i], items[i]);
    }
    foreach (var iteration in Benchmark.Iterations)
        using (iteration.StartMeasurement())
            for (long i = 1; i < size; ++i)
                collection.Remove(items[i], out long? value);
}
Before
After
@AndyAyersMS @danmosemsft @jkotas these are the perf test results. There seems to be no great difference. Let me know if the test code is correct (this is my first time with the "Microsoft xunit-performance framework"). What I've done:
I repeated this several times. |
@dotnet-bot test this please |
@MarcoRossignoli that sounds right as long as obviously you're patching with your coreclr bits each time. What I sometimes do is put in a Thread.Sleep(50) or something as a sanity-check to make sure I am successfully consuming the bits I expect (then re-measure without it!) |
Yep, sure, I did that (Thread.Sleep(50)) but omitted it from the list! |
@dotnet-bot test Windows_NT arm Cross Checked Innerloop Build and Test please EDIT: @jkotas are these CI tests different from the others? I tried to re-run them but it doesn't work. |
The ARM failures are @dotnet/jit-contrib is there a way to access the test logs? It seems unlikely that a change in Dictionary could break this test although I don't see the failure in other PR's. |
Those tests have now been removed from the tests.lst files (#18569) |
Thanks @CarolEidt |
Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>
Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
closes https://github.com/dotnet/corefx/issues/30023
I used Ben's idea dotnet/corefx@6df07c4
Do we need some tests? It's not easy to repro, but if you have some fast way to break it (maybe some old code used for FindEntry) I could test locally.
/cc @danmosemsft @stephentoub @benaadams
NB. A few years ago I lost hours tracking down this bug (dump/WinDbg) in a third-party lib: 100% CPU in FindEntry.
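
For readers unfamiliar with the failure mode, the sketch below shows it on a toy chained table: a corrupted `Next` link forms a cycle, and a lookup would spin forever unless the walk counts its steps. `ToyChain`, `Entry`, and `Find` are hypothetical names, and the step-counting check is an assumption about the general shape of the detection this PR extends to Remove(), not a copy of the Dictionary code.

```csharp
using System;

// Toy chained table for illustration only; names and layout are hypothetical.
public sealed class ToyChain
{
    public struct Entry
    {
        public int Key;
        public int Next; // index of the next entry in the chain, -1 ends it
    }

    private readonly Entry[] _entries;

    public ToyChain(Entry[] entries) => _entries = entries;

    // Walks the chain starting at 'head' and returns the index of 'key',
    // or -1 if the chain ends without a match.
    public int Find(int head, int key)
    {
        Entry[] entries = _entries; // cached in a local, as discussed above
        int steps = 0;
        for (int i = head; i >= 0; i = entries[i].Next)
        {
            if (entries[i].Key == key)
            {
                return i;
            }
            // A healthy chain visits each entry at most once, so reaching
            // entries.Length steps means the links form a cycle.
            if (steps >= entries.Length)
            {
                throw new InvalidOperationException(
                    "Chain is corrupted; concurrent modification is a likely cause.");
            }
            steps++;
        }
        return -1;
    }
}
```

Pointing the last entry's `Next` back at the head reproduces the loop: without the step counter, `Find` for a missing key would never return, which matches the 100% CPU symptom. A test can build such a cycle directly in the toy table, whereas the real Dictionary needs reflection to reach its private state.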