Random.Next is slow on checked runtime on ARM #52894
I took a look at it in PerfView and the hottest method is because it invokes double-to-string conversion all the time (which is slow mostly because of get_CurrentCulture) |
Tagging subscribers to this area: @tannergooding
This is a follow-up to the investigation in #52031.
We had a test that consistently ran overtime on Linux on ARM, whereas it was fine on all other platforms and architectures. Investigation showed that the culprit was calling Random.Next in a loop (on the checked runtime). Switching to Random.NextBytes changed the timings dramatically.
Here are two pieces of code with the timings I measured on Ubuntu 18.04 on an ARM64 machine on the Checked CoreCLR runtime:
int frameSize = 2 << 15;
byte[] message = new byte[frameSize * 10];
Random random = new(0);
for (int i = 0; i < message.Length; ++i)
{
    message[i] = (byte)random.Next(maxValue: 10);
}
Execution time: 4.8s
int frameSize = 2 << 15;
byte[] message = new byte[frameSize * 10];
new Random(0).NextBytes(message);
for (int i = 0; i < message.Length; ++i)
{
    message[i] %= 10;
}
Execution time: 0.02s
It might be beneficial to know what causes the first piece of code to run 200 times slower.
cc @danmoseley
P.S.: It may be worth mentioning that on the release runtime, the first piece takes 0.03s.
|
Is checked runtime basically a |
In the context of CoreLib I'd say it's Release with Asserts 🙂 |
I bet that it does. The other environments are generally faster and so the slow-down was not enough to make the test time out. |
Is this a case that the new string formatting pattern would improve in future? |
Yes, it would improve it somewhat if we added Debug.Assert overloads that accept the new builders. It would still be many times slower than outlining the condition manually like what Egor has done in #52918 |
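To make the "outlining the condition manually" idea concrete, here is a minimal sketch. The class and method names are hypothetical and are not taken from the runtime sources or from #52918; the sketch only illustrates the general pattern of testing the condition before any message formatting happens.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper type, for illustration only.
public static class SampleAsserts
{
    // Eager form: the message string is formatted on every call,
    // even when the condition holds and the message is never shown.
    public static void CheckEager(double result)
    {
        string message = $"Expected 0.0 <= {result} < 1.0";
        Debug.Assert(result >= 0.0 && result < 1.0, message);
    }

    // Outlined form: the condition is tested first, so the
    // double-to-string conversion only runs on the failure path.
    public static void CheckOutlined(double result)
    {
        if (!(result >= 0.0 && result < 1.0))
        {
            Debug.Fail($"Expected 0.0 <= {result} < 1.0");
        }
    }
}
```

In a hot loop such as the Random.Next loop from the issue, the outlined form would avoid the per-call formatting cost while still reporting the same message when the check fails.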
In theory, we could introduce a new Debug.Assert(result >= 0.0 && result < 1.0f, () => $"Expected 0.0 <= {result} < 1.0"); |
Just curious, is there a reason codegen can't special case |
That would require two things:
|
We could achieve essentially the same thing if we added overloads that took a special DebugAssertInterpolatedStringHandler. It would be preferred by the compiler, and we'd write it to short-circuit based on the bool condition. Thus when you wrote:
Debug.Assert(condition, $"{a} != {b}");
The compiler would generate something like:
var handler = new DebugAssertInterpolatedStringHandler(2, 4, condition, out bool success);
_ = success &&
    handler.AppendFormatted(a) &&
    handler.AppendLiteral(" != ") &&
    handler.AppendFormatted(b);
Debug.Assert(condition, handler);
It would be a little slower than manually outlining, but not much, and would automatically apply to all asserts of this form. And would be achieved entirely in library; no additional compiler or JIT work required. |
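A short-circuiting handler of this kind can be sketched in user code, assuming a compiler and runtime with interpolated string handler support (C# 10 / .NET 6 or later). The names LazyAssertHandler and LazyDebug are hypothetical, and throwing InvalidOperationException is a stand-in chosen so the behavior is easy to observe; this is not the library's implementation.

```csharp
#nullable enable
using System;
using System.Runtime.CompilerServices;
using System.Text;

// Hypothetical handler: it only allocates and appends when the condition is false.
[InterpolatedStringHandler]
public struct LazyAssertHandler
{
    private readonly StringBuilder? _builder;

    // The compiler passes the literal length, the number of holes, and the
    // 'condition' argument named by InterpolatedStringHandlerArgument below.
    // When 'shouldAppend' comes back false, the Append calls are skipped.
    public LazyAssertHandler(int literalLength, int formattedCount,
                             bool condition, out bool shouldAppend)
    {
        shouldAppend = !condition;
        _builder = shouldAppend ? new StringBuilder(literalLength) : null;
    }

    public void AppendLiteral(string value) => _builder!.Append(value);

    public void AppendFormatted<T>(T value) => _builder!.Append(value?.ToString());

    public string GetMessage() => _builder?.ToString() ?? string.Empty;
}

// Hypothetical assert entry point that accepts the handler.
public static class LazyDebug
{
    public static void Assert(
        bool condition,
        [InterpolatedStringHandlerArgument("condition")] LazyAssertHandler message)
    {
        if (!condition)
        {
            // Stand-in for Debug.Fail so the failure is observable in any build.
            throw new InvalidOperationException(message.GetMessage());
        }
    }
}
```

With these definitions, a call such as LazyDebug.Assert(a == b, $"{a} != {b}") binds to the handler overload, and the values are only converted to strings when the condition is false.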
@stephentoub that's pretty appealing (seems it would have saved an investigation and PR here) -- do you plan to open an issue? (when you're back..) |
I opened #53211. |