
[Benchmarks][stdlib] Adding an extra benchmark for set isDisjoint for disjoint sets of different size #39265


Merged: 1 commit merged into swiftlang:main from set-disjoint-benchmarks on Sep 14, 2021

Conversation

@LucianoPAlmeida (Contributor)

As with the improvement in #39263, we can't really see much of a gain because the sets are the same size.
So this adds a new, smaller disjoint set and a benchmark for 0% overlap (disjoint) using those sets. It also makes the benchmark run both a.isDisjoint(b) and b.isDisjoint(a), to give a better overview of the behavior with sets of different sizes.
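
For illustration only, here is a minimal, self-contained Swift sketch of the kind of measurement described above. It is not the code added by this PR: the set sizes, the blackHole helper, and the Foundation-based timing loop are placeholders rather than the Swift benchmark suite's actual API.

```swift
import Foundation

// Two disjoint sets of different sizes (element ranges chosen so they never overlap).
let larger = Set(0..<400)         // placeholder sizes, not the suite's constants
let smaller = Set(1_000..<1_100)

// A trivial sink so the optimizer cannot discard the result.
@inline(never) func blackHole(_ value: Bool) { _ = value }

func measureIsDisjoint(iterations: Int) -> TimeInterval {
    let start = Date()
    for _ in 0..<iterations {
        // Exercise both directions, since the cost can depend on which operand is smaller.
        blackHole(larger.isDisjoint(with: smaller))
        blackHole(smaller.isDisjoint(with: larger))
    }
    return Date().timeIntervalSince(start)
}

print("elapsed (s):", measureIsDisjoint(iterations: 10_000))
```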

@LucianoPAlmeida (Contributor, Author)

@swift-ci Please benchmark

@swift-ci (Contributor)

Performance (x86_64): -O

Regression OLD NEW DELTA RATIO
Set.isDisjoint.Empty.Box 108 260 +140.7% 0.42x
Set.isDisjoint.Box25 366 861 +135.2% 0.43x (?)
Set.isDisjoint.Int50 268 607 +126.5% 0.44x (?)
Set.isDisjoint.Int25 268 603 +125.0% 0.44x (?)
Set.isDisjoint.Int0 323 681 +110.8% 0.47x
Set.isDisjoint.Empty.Int 110 225 +104.5% 0.49x
Set.isDisjoint.Box0 672 1359 +102.2% 0.49x
Set.isDisjoint.Int100 268 537 +100.4% 0.50x
DictionaryOfAnyHashableStrings_insert 3318 5698 +71.7% 0.58x (?)
Set.isDisjoint.Box.Empty 158 260 +64.6% 0.61x
Set.isDisjoint.Int.Empty 142 227 +59.9% 0.63x
DictionaryKeysContainsNative 22 26 +18.2% 0.85x (?)
DictionaryBridgeToObjC_BulkAccess 148 160 +8.1% 0.93x (?)
 
Added MIN MAX MEAN MAX_RSS
Set.isDisjoint.Smaller.Box0 937 1005 969
Set.isDisjoint.Smaller.Int0 475 479 477

Code size: -O

Regression OLD NEW DELTA RATIO
SetTests.o 127149 128877 +1.4% 0.99x

Performance (x86_64): -Osize

Regression OLD NEW DELTA RATIO
Set.isDisjoint.Empty.Box 112 251 +124.1% 0.45x
Set.isDisjoint.Empty.Int 110 240 +118.2% 0.46x
Set.isDisjoint.Int0 323 680 +110.5% 0.48x
Set.isDisjoint.Box0 674 1365 +102.5% 0.49x
Set.isDisjoint.Int100 268 529 +97.4% 0.51x
Set.isDisjoint.Int25 348 616 +77.0% 0.56x
Set.isDisjoint.Int50 340 600 +76.5% 0.57x
Set.isDisjoint.Int.Empty 144 244 +69.4% 0.59x
Set.isDisjoint.Box25 509 861 +69.2% 0.59x
Set.isDisjoint.Box.Empty 155 248 +60.0% 0.63x
FlattenListFlatMap 4449 6688 +50.3% 0.67x (?)
FlattenListLoop 1702 2543 +49.4% 0.67x (?)
 
Added MIN MAX MEAN MAX_RSS
Set.isDisjoint.Smaller.Box0 849 870 856
Set.isDisjoint.Smaller.Int0 427 456 437

Code size: -Osize

Regression OLD NEW DELTA RATIO
SetTests.o 106476 108041 +1.5% 0.99x

Performance (x86_64): -Onone

Regression OLD NEW DELTA RATIO
Set.isDisjoint.Empty.Int 577 1254 +117.3% 0.46x
Set.isDisjoint.Empty.Box 613 1284 +109.5% 0.48x (?)
Set.isDisjoint.Int0 818 1699 +107.7% 0.48x
Set.isDisjoint.Box0 2664 5501 +106.5% 0.48x
Set.isDisjoint.Int100 895 1722 +92.4% 0.52x
Set.isDisjoint.Int50 1083 1960 +81.0% 0.55x
Set.isDisjoint.Int25 1122 1986 +77.0% 0.56x
Set.isDisjoint.Int.Empty 740 1252 +69.2% 0.59x
Set.isDisjoint.Box.Empty 779 1275 +63.7% 0.61x (?)
Set.isDisjoint.Box25 2426 3855 +58.9% 0.63x
StringToDataLargeUnicode 6800 7350 +8.1% 0.93x (?)
StringToDataMedium 6350 6850 +7.9% 0.93x (?)
 
Improvement OLD NEW DELTA RATIO
ArrayAppendGenericStructs 2140 1370 -36.0% 1.56x (?)
 
Added MIN MAX MEAN MAX_RSS
Set.isDisjoint.Smaller.Box0 3384 3424 3398
Set.isDisjoint.Smaller.Int0 1055 1057 1056

Code size: -swiftlibs

Benchmark Check Report
How to read the data
The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@lorentey (Member) left a comment

Looks good!

One of these days we'll need to take a long, hard look at how an innocent addition like this can trigger such large swings in unchanged code paths within the same module.

Argh, my bad, I missed that this changes those paths. (Reviewing PRs on my phone is not a good idea. 🙈)

@lorentey (Member) left a comment

We'll need to leave the existing benchmarks as is -- changing them would interfere with performance tracking.

(The new code is great for the new benchmarks, though!)

@LucianoPAlmeida (Contributor, Author)

> Argh, my bad, I missed that this changes those paths. (Reviewing PRs on my phone is not a good idea. 🙈)

No worries, thanks for the review!

@LucianoPAlmeida (Contributor, Author)

> We'll need to leave the existing benchmarks as is -- changing them would interfere with performance tracking.
>
> (The new code is great for the new benchmarks, though!)

Just fixed!
I created a new benchmark function that runs disjoint sets of different sizes, so let me know if this looks good :)

@LucianoPAlmeida (Contributor, Author)

@swift-ci Please benchmark

@swift-ci (Contributor)

Performance (x86_64): -O

Regression OLD NEW DELTA RATIO
ArrayAppendGenericStructs 1330 2490 +87.2% 0.53x (?)
FlattenListFlatMap 3956 6662 +68.4% 0.59x (?)
DictionaryOfAnyHashableStrings_insert 3388 5684 +67.8% 0.60x
FlattenListLoop 1629 2547 +56.4% 0.64x (?)
Set.isDisjoint.Box25 355 515 +45.1% 0.69x (?)
Set.isDisjoint.Int25 262 344 +31.3% 0.76x (?)
DictionaryKeysContainsNative 21 25 +19.0% 0.84x (?)
StringRemoveDupes 275 299 +8.7% 0.92x (?)
 
Added MIN MAX MEAN MAX_RSS
Set.isDisjoint.Smaller.Box0 954 954 954
Set.isDisjoint.Smaller.Int0 411 420 414

Code size: -O

Regression OLD NEW DELTA RATIO
SetTests.o 127149 130893 +2.9% 0.97x

Performance (x86_64): -Osize

Regression OLD NEW DELTA RATIO
FlattenListLoop 1693 2547 +50.4% 0.66x (?)
ArrayAppendGenericStructs 2150 2430 +13.0% 0.88x (?)
String.data.Medium 96 108 +12.5% 0.89x (?)
FlattenListFlatMap 6179 6766 +9.5% 0.91x (?)
Set.isDisjoint.Box.Empty 155 169 +9.0% 0.92x (?)
 
Improvement OLD NEW DELTA RATIO
ObjectiveCBridgeStubFromNSDate 7440 6450 -13.3% 1.15x (?)
 
Added MIN MAX MEAN MAX_RSS
Set.isDisjoint.Smaller.Box0 835 836 836
Set.isDisjoint.Smaller.Int0 419 426 422

Code size: -Osize

Regression OLD NEW DELTA RATIO
SetTests.o 106476 109901 +3.2% 0.97x

Performance (x86_64): -Onone

Regression OLD NEW DELTA RATIO
ArrayAppendGenericStructs 1370 2060 +50.4% 0.67x (?)
 
Added MIN MAX MEAN MAX_RSS
Set.isDisjoint.Smaller.Box0 3410 3504 3450
Set.isDisjoint.Smaller.Int0 1055 1064 1060

Code size: -swiftlibs

Benchmark Check Report
How to read the data
The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@LucianoPAlmeida (Contributor, Author)

@swift-ci Please smoke test

@LucianoPAlmeida (Contributor, Author)

@lorentey Just out of curiosity, about this note in the "how to read" section: "Unexpected regressions which are marked with '(?)' are probably noise." What exactly is this noise, and what generally causes such results?

@LucianoPAlmeida (Contributor, Author)

@lorentey, friendly ping :)
Is there any other change needed, or is this good to land?

@lorentey (Member)

> What exactly is this noise, and what generally causes such results?

It's a heuristic that puts the mark on (almost) all results that overlap with the old run. Here is the logic behind it:

        # Indication of dubious changes: when result's MIN falls inside the
        # (MIN, MAX) interval of result they are being compared with.
        self.is_dubious = (old.min < new.min and new.min < old.max) or (
            new.min < old.min and old.min < new.max
        )

I believe this will tend to shame benchmarks that have high variability. (I'm not nearly good enough with statistics to tell if it's a good one, but it seems to be working relatively well!)
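
To make that overlap condition concrete, here is a hedged Swift transliteration (the real check lives in the benchmark suite's Python comparison script quoted above; the Sample type and the numbers below are made up for illustration):

```swift
struct Sample {
    let min: Int
    let max: Int
}

// Mirrors the Python check above: a change is "dubious" (marked "(?)") when one
// run's minimum falls inside the (min, max) interval of the other run.
func isDubious(old: Sample, new: Sample) -> Bool {
    (old.min < new.min && new.min < old.max) ||
        (new.min < old.min && old.min < new.max)
}

// Made-up numbers: the old run spans 100...130 and the new run spans 120...160,
// so the intervals overlap and the result would be flagged as probable noise.
print(isDubious(old: Sample(min: 100, max: 130), new: Sample(min: 120, max: 160))) // true
```

In that example the two measurement ranges overlap, so even a sizable delta between the reported values would be treated as likely noise.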

@lorentey (Member) left a comment

Looks good, thank you for working on this.

> Regression OLD NEW DELTA RATIO
> Set.isDisjoint.Box25 355 515 +45.1% 0.69x (?)
> Set.isDisjoint.Int25 262 344 +31.3% 0.76x (?)

I wonder if the nondeterministic hashing config was lost from the benchmarking environment. This could also be an instruction cache artifact from the functions getting slightly bumped in the emitted binary. Well, it's an investigation for another day. :-/

@lorentey merged commit e1e3824 into swiftlang:main on Sep 14, 2021
@LucianoPAlmeida deleted the set-disjoint-benchmarks branch on September 14, 2021 20:07
@LucianoPAlmeida (Contributor, Author)

> It's a heuristic that puts the mark on (almost) all results that overlap with the old run. Here is the logic behind it:

Ah interesting, thank you for the explanation!

> I believe this will tend to shame benchmarks that have high variability. (I'm not nearly good enough with statistics to tell if it's a good one, but it seems to be working relatively well!)

Variability would be across multiple runs with the same input values? I guess that, given that most of the methods being benchmarked are deterministic, variability across multiple runs may come from hardware state, e.g. variation in CPU cache hits and misses between runs...
Sorry for so many questions and pings; I guess benchmarking gets very complicated at some point because there are so many variables :)

@lorentey (Member) commented Sep 15, 2021

> Variability would be across multiple runs with the same input values?

Yep -- the benchmarks are measured over multiple iterations, scaled so that each measurement takes some fixed amount of time. I believe the time interval is set small to discourage/prevent context switches in the middle of a benchmark, but measuring over multiple iterations also reduces some of the noise by averaging it out.

Then iirc these measurements are repeated multiple times to collect a nice sample. The min/max/avg/stddev values reported come from these samples.

(Beware I'm sure I got at least some of these details wrong -- the truth is in the code. 😅 )
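
As a rough illustration of that scheme (this is not the actual Swift benchmark harness; the sample counts, iteration counts, and workload below are made up), a measurement loop along these lines yields the MIN/MAX/MEAN style numbers shown in the reports above:

```swift
import Dispatch

// Hypothetical workload standing in for a benchmark body.
@inline(never) func workload() {
    let a = Set(0..<400)
    let b = Set(1_000..<1_100)
    _ = a.isDisjoint(with: b)
}

// Take `numSamples` samples; each sample times `iterationsPerSample` runs of the
// workload and records the average cost per iteration in nanoseconds.
func sample(numSamples: Int, iterationsPerSample: Int) -> (min: Double, max: Double, mean: Double) {
    var samples: [Double] = []
    for _ in 0..<numSamples {
        let start = DispatchTime.now().uptimeNanoseconds
        for _ in 0..<iterationsPerSample { workload() }
        let end = DispatchTime.now().uptimeNanoseconds
        samples.append(Double(end - start) / Double(iterationsPerSample))
    }
    return (samples.min()!, samples.max()!, samples.reduce(0, +) / Double(samples.count))
}

print(sample(numSamples: 20, iterationsPerSample: 1_000))
```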

@LucianoPAlmeida (Contributor, Author)

Interesting to know, thanks :)
