Implementation of Lemire's nearly divisionless method #79790

mla-alm · 2022-12-17T14:31:49Z

dotnet-issue-labeler · 2022-12-17T14:31:55Z

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

dnfadmin · 2022-12-17T14:32:03Z

All CLA requirements met.

ghost · 2022-12-17T15:43:26Z

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

Issue Details

#75395

Author:	mla-alm
Assignees:	-
Labels:	`area-System.Runtime`, `community-contribution`
Milestone:	-

danmoseley · 2022-12-18T19:00:03Z

Could you please add benchmark(s) to dotnet/performance so we protect your work when it goes in? Or is coverage good enough already?

mla-alm · 2022-12-21T17:49:55Z

Which numbers or intervals are picked for the benchmarks decides which algorithm looks "optimal". IMO it could be interesting to include the edge cases:
`Next(int.MinValue, 1)`
`NextInt64(long.MinValue, 1)`
However, these would actually point towards an implementation like swiftlang/swift@87b3f60.
For this implementation I guess the existing benchmarks are sufficient.
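
For readers unfamiliar with the technique named in the PR title, the following is a minimal sketch of Lemire's nearly divisionless bounded generation for 32-bit values. It is an illustration only, not the code merged in this PR: `LemireSketch`, `NextBounded`, and the `next32` delegate (standing in for the generator's raw 32-bit output) are hypothetical names.

```csharp
using System;

static class LemireSketch
{
    // Returns a uniformly distributed value in [0, range), assuming next32()
    // yields uniformly distributed 32-bit values. Names are illustrative only.
    public static uint NextBounded(Func<uint> next32, uint range)
    {
        if (range == 0)
        {
            throw new ArgumentOutOfRangeException(nameof(range));
        }

        // Multiply a 32-bit draw by the range: the high 32 bits of the
        // 64-bit product are the candidate result, the low 32 bits decide
        // whether the candidate has to be rejected.
        ulong product = (ulong)next32() * range;
        uint low = unchecked((uint)product);

        if (low < range)
        {
            // Only on this rare path is a division (remainder) needed.
            // threshold == (2^32 - range) % range == 2^32 % range
            uint threshold = unchecked(0u - range) % range;

            while (low < threshold)
            {
                product = (ulong)next32() * range;
                low = unchecked((uint)product);
            }
        }

        return (uint)(product >> 32);
    }
}
```

A call such as `LemireSketch.NextBounded(() => (uint)Random.Shared.NextInt64(0, 1L << 32), 10)` would produce a value in `[0, 10)`; the delegate here is just one possible stand-in for a raw 32-bit source.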

src/libraries/System.Private.CoreLib/src/System/Random.Xoshiro128StarStarImpl.cs

src/libraries/System.Runtime.Extensions/tests/System/Random.cs

src/libraries/System.Private.CoreLib/src/System/Random.Xoshiro128StarStarImpl.cs

src/libraries/System.Private.CoreLib/src/System/Random.Xoshiro256StarStarImpl.cs

stephentoub · 2023-01-31T16:17:30Z

/benchmark microbenchmarks aspnet-perf-win runtime --variable filter=System.Tests.Perf_Random*

pr-benchmarks · 2023-01-31T16:19:26Z

Benchmark started for microbenchmarks on aspnet-perf-win with runtime and arguments --variable filter=System.Tests.Perf_Random*. Logs: link

pr-benchmarks · 2023-01-31T16:54:35Z

microbenchmarks - aspnet-perf-win

| benchmark               | mean (microbenchmarks.base) | mean (microbenchmarks.pr) | ratio | allocated (microbenchmarks.base) | allocated (microbenchmarks.pr) | ratio |
| ----------------------- | --------------------------- | ------------------------- | ----- | -------------------------------- | ------------------------------ | ----- |
| ctor                    |                    80.83 ns |                  81.42 ns |  1.01 |                             72 B |                           72 B |  1.00 |
| ctor_seeded             |                    291.5 ns |                  292.1 ns |  1.00 |                            304 B |                          304 B |  1.00 |
| Next                    |                    7.939 ns |                  8.179 ns |  1.03 |                              0 B |                            0 B |       |
| Next_int                |                    8.980 ns |                  9.050 ns |  1.01 |                              0 B |                            0 B |       |
| Next_int_int            |                    9.059 ns |                  9.905 ns |  1.09 |                              0 B |                            0 B |       |
| Next_int_int_unseeded   |                    7.311 ns |                  2.102 ns |  0.29 |                              0 B |                            0 B |       |
| Next_int_unseeded       |                    7.443 ns |                  2.261 ns |  0.30 |                              0 B |                            0 B |       |
| Next_long               |                    37.09 ns |                  35.41 ns |  0.95 |                              0 B |                            0 B |       |
| Next_long_long          |                    37.65 ns |                  39.01 ns |  1.04 |                              0 B |                            0 B |       |
| Next_long_long_unseeded |                    8.304 ns |                  2.659 ns |  0.32 |                              0 B |                            0 B |       |
| Next_long_unseeded      |                    4.268 ns |                  2.549 ns |  0.60 |                              0 B |                            0 B |       |
| Next_unseeded           |                    1.844 ns |                  1.746 ns |  0.95 |                              0 B |                            0 B |       |
| NextBytes               |                      5.3 μs |                    5.6 μs |  1.05 |                              0 B |                            0 B |       |
| NextBytes_span          |                      5.5 μs |                    5.6 μs |  1.01 |                              0 B |                            0 B |       |
| NextBytes_span_unseeded |                    119.1 ns |                  118.8 ns |  1.00 |                              0 B |                            0 B |       |
| NextBytes_unseeded      |                    119.1 ns |                  119.0 ns |  1.00 |                              0 B |                            0 B |       |
| NextDouble              |                    7.980 ns |                  7.980 ns |  1.00 |                              0 B |                            0 B |       |
| NextDouble_unseeded     |                    1.991 ns |                  1.961 ns |  0.98 |                              0 B |                            0 B |       |
| NextSingle              |                    7.995 ns |                  7.999 ns |  1.00 |                              0 B |                            0 B |       |
| NextSingle_unseeded     |                    2.000 ns |                  1.977 ns |  0.99 |                              0 B |                            0 B |       |

stephentoub

The implementation and 64-bit perf look good. I'm building locally for 32-bit to check that.

stephentoub · 2023-01-31T17:47:13Z

This Xoshiro128 implementation is only used on 32-bit. I saw some measurements were done, but I believe those were on 64-bit... have we measured how this performs on 32-bit?

True, I have only tested on 64-bit.

On 32-bit, I get numbers like this:

| Method                  | Toolchain             | Mean      | Error     | StdDev    | Median    | Min       | Max       | Ratio |
| ----------------------- | --------------------- | --------- | --------- | --------- | --------- | --------- | --------- | ----- |
| Next_int_unseeded       | \main_x86\corerun.exe | 13.130 ns | 0.4286 ns | 0.4936 ns | 12.949 ns | 12.475 ns | 14.104 ns |  1.00 |
| Next_int_unseeded       | \pr_x86\corerun.exe   |  5.612 ns | 0.0956 ns | 0.0847 ns |  5.602 ns |  5.489 ns |  5.813 ns |  0.43 |
| Next_int_int_unseeded   | \main_x86\corerun.exe | 12.036 ns | 0.1935 ns | 0.1616 ns | 11.969 ns | 11.864 ns | 12.342 ns |  1.00 |
| Next_int_int_unseeded   | \pr_x86\corerun.exe   |  5.296 ns | 0.0605 ns | 0.0565 ns |  5.288 ns |  5.208 ns |  5.423 ns |  0.44 |
| Next_long_unseeded      | \main_x86\corerun.exe | 11.874 ns | 0.2669 ns | 0.2496 ns | 11.850 ns | 11.462 ns | 12.305 ns |  1.00 |
| Next_long_unseeded      | \pr_x86\corerun.exe   | 14.409 ns | 0.1976 ns | 0.1650 ns | 14.360 ns | 14.121 ns | 14.758 ns |  1.22 |
| Next_long_long_unseeded | \main_x86\corerun.exe | 15.087 ns | 0.2617 ns | 0.2320 ns | 15.085 ns | 14.771 ns | 15.666 ns |  1.00 |
| Next_long_long_unseeded | \pr_x86\corerun.exe   | 18.001 ns | 0.2120 ns | 0.1983 ns | 17.967 ns | 17.711 ns | 18.399 ns |  1.19 |

So, at least according to these benchmarks, this PR represents a potential regression on 32-bit, in particular for NextInt64.

Can you measure on 32-bit as well, @mla-alm?

mla-alm · 2023-01-31T19:34:28Z

I am running on Linux, so I cannot measure on 32-bit.

The benchmark is defined as `public long Next_long_long_unseeded() => _randomUnseeded.NextInt64(100, 10000);`
So on 32-bit, the 'main' run of this benchmark is in reality only exercising the case where the range fits in an int.
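
The point about the benchmark arguments can be illustrated with a small sketch. It assumes, as the comment above suggests, that the 'main' `NextInt64` takes a 32-bit shortcut whenever the requested range fits in an int. `RangeDispatchSketch` and `PathFor` are hypothetical names, and this is not the actual CoreLib code.

```csharp
using System;

static class RangeDispatchSketch
{
    // Hypothetical helper: reports which path a NextInt64(minValue, maxValue)
    // call would take if ranges that fit in an int are special-cased.
    public static string PathFor(long minValue, long maxValue)
    {
        if (minValue > maxValue)
        {
            throw new ArgumentOutOfRangeException(nameof(minValue));
        }

        // The subtraction may wrap for very wide ranges; reinterpreting the
        // wrapped result as ulong still gives the true width of the range.
        ulong exclusiveRange = unchecked((ulong)(maxValue - minValue));

        return exclusiveRange <= int.MaxValue ? "32-bit path" : "64-bit path";
    }
}
```

Under that assumption, `PathFor(100, 10000)` reports the 32-bit path because the range is only 9900 wide, whereas an edge case such as `PathFor(long.MinValue, 1)` reports the 64-bit path.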

Maybe it is better to be conservative and leave the implementation (at least the NextInt64) as is in Xoshiro128?

jeffhandley · 2023-02-11T02:24:51Z

> Maybe it is better to be conservative and leave the implementation (at least the NextInt64) as is in Xoshiro128?

@stephentoub What do you think about that idea?

stephentoub · 2023-02-15T23:03:11Z

Yeah, let's leave the Xoshiro128's NextInt64 unmodified for now, other than a comment indicating why it's not getting the same treatment as the others.

mla-alm · 2023-02-16T17:22:04Z

Xoshiro128's NextInt64 is back to how it was before, and a comment has been added.

src/libraries/System.Private.CoreLib/src/System/Random.Xoshiro128StarStarImpl.cs

stephentoub

Thanks!

danmoseley · 2023-02-18T14:59:08Z

@mla-alm Thank you! Perhaps you'd be interested in another contribution somewhere?

EgorBo · 2023-02-21T15:26:49Z

Improvements on x64 dotnet/perf-autofiling-issues#13151

mla-alm · 2023-03-06T15:27:47Z

Hi @danmoseley.
Sure I might. Do you have any suggestions?
Otherwise I might look around, when I find the time.

EgorBo · 2023-03-28T16:51:19Z

Improvements on win-x64:

[Perf] Windows/x64: 11 Improvements on 2/18/2023 3:00:06 AM perf-autofiling-issues#13255

danmoseley · 2023-03-28T18:47:06Z

@mla-alm

See your perf win confirmed above!

In terms of what to take on next: any issue that's labeled help wanted (or possibly others).

https://github.com/dotnet/runtime/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22
https://github.com/dotnet/aspnetcore/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22

If you have a particular area you're interested in, you could ask the area owners:

https://github.com/dotnet/runtime/blob/main/docs/area-owners.md

Or ask here or in discussions.

We are always happy to have new regular contributors.

mla-alm added 3 commits December 15, 2022 20:10

Lemire implementation

c5dadb4

Cleanup

71b3bb3

Article reference

405c976

ghost added the community-contribution label Dec 17, 2022

mla-alm mentioned this pull request Dec 17, 2022

Try improved method for generating bounded random integers #75395

Closed

Fix

a3c263f

teo-tsirpanis added the area-System.Runtime label Dec 17, 2022

Fixes

96b5e91

This was referenced Dec 17, 2022

Precondition failure: File has not had execution verified #79439

Closed

[wasm] Library tests failing during linking for AOT - SIGKILL #79569

Closed

mla-alm added 3 commits December 18, 2022 08:03

Comment out implementation specific tests in Xoshiro_AlgorithmBehavesAsExpected

600b9d5

Fix

e5c4536

Merge remote-tracking branch 'origin/main' into mla-alm/lemire

8a7dbdf

build-analysis bot mentioned this pull request Dec 18, 2022

Tracking issue for CI build timeouts #76454

Closed

runfoapp bot mentioned this pull request Dec 19, 2022

Infra improvements for Helix #68176

Closed

mla-alm added 2 commits December 21, 2022 18:41

Reenable sufficient checks for Xoshiro_AlgorithmBehavesAsExpected

2dd1a78

Merge remote-tracking branch 'origin/main' into mla-alm/lemire

1b3c19b

Fix

f2a11d0

This was referenced Dec 22, 2022

Build fails with "eng/common/tools.sh: line 474: 537 Segmentation fault" #76759

Closed

emcc received SIGKILL #79874

Closed

danmoseley reviewed Dec 22, 2022

src/libraries/System.Private.CoreLib/src/System/Random.Xoshiro128StarStarImpl.cs (outdated)

danmoseley reviewed Dec 22, 2022

src/libraries/System.Runtime.Extensions/tests/System/Random.cs

danmoseley reviewed Dec 22, 2022

src/libraries/System.Private.CoreLib/src/System/Random.Xoshiro128StarStarImpl.cs (outdated)

danmoseley reviewed Dec 22, 2022

src/libraries/System.Private.CoreLib/src/System/Random.Xoshiro256StarStarImpl.cs (outdated)

Add third party notice

a68e011

Merge remote-tracking branch 'origin/mla-alm/lemire' into mla-alm/lemire

4370688

stephentoub reviewed Jan 31, 2023

mla-alm added 2 commits February 16, 2023 18:18

Reverting NextInt64 on Xoshiro128

ef38e9d

Merge branch 'dotnet:main' into mla-alm/lemire

33654d2

mla-alm added 2 commits February 16, 2023 18:31

Adjust test

95b4b42

Merge remote-tracking branch 'origin/mla-alm/lemire' into mla-alm/lemire

b28062c

build-analysis bot mentioned this pull request Feb 16, 2023

OSX infra issue - prereq check for 'pkg-config' missing #82240

Closed

Merge branch 'dotnet:main' into mla-alm/lemire

6304f68

stephentoub reviewed Feb 17, 2023

src/libraries/System.Private.CoreLib/src/System/Random.Xoshiro128StarStarImpl.cs (outdated)

Update src/libraries/System.Private.CoreLib/src/System/Random.Xoshiro128StarStarImpl.cs

608a801

stephentoub approved these changes Feb 17, 2023

danmoseley mentioned this pull request Feb 17, 2023

Improvements of Random.GetItems<T> performance #82286

Open

build-analysis bot mentioned this pull request Feb 17, 2023

[QUIC] Missing libmsquic on tested platforms #81901

Closed

stephentoub merged commit 5ec100a into dotnet:main Feb 18, 2023

lewing mentioned this pull request Feb 21, 2023

[Perf] Linux/x64: 5 Improvements on 2/18/2023 3:00:06 AM dotnet/perf-autofiling-issues#13323

Open

mla-alm deleted the mla-alm/lemire branch March 19, 2023 05:52

ghost locked as resolved and limited conversation to collaborators Apr 27, 2023
