Zero gc labels lookup #486

Open
wants to merge 2 commits into
base: main

brian-brazil
Contributor

@njhill @franz1981 @Falland

I've finally had time to dig into the various PRs (#445, #459, #460). I've taken the benchmarks from #460, used #445 as a base, and eliminated the remaining allocations from it for lookups with one to four labels (five labels still allocates, see the results below). #460 still performs notably better; however, I remain uncomfortable with a bespoke hashtable implementation, and Java's profiling tools aren't giving me enough to figure out why this approach is slower.

As a side effect, it's also simpler to add other numbers of labels in the future (not that even 4 should be common in the first place).
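
For readers unfamiliar with the technique, the following is a minimal sketch of an allocation-free lookup using a thread-local scratch key, written for Java 8 or later. It is not the code in this PR: `PooledLabelLookup`, `Key`, `MAX_POOLED` and the `labels` overloads shown here are hypothetical names chosen for illustration. The idea is that each thread reuses one mutable key per label count, so a lookup that hits an existing child allocates nothing, and only the first sighting of a label combination copies the values.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical sketch: class and method names are illustrative, not the PR's.
final class PooledLabelLookup<C> {

  /** Map key over label values; only the thread-local scratch copies are ever mutated. */
  static final class Key {
    final String[] values;

    Key(String[] values) {
      this.values = values;
    }

    @Override
    public int hashCode() {
      return Arrays.hashCode(values);
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof Key && Arrays.equals(values, ((Key) o).values);
    }
  }

  // Assumed upper bound on pooled label counts, for illustration only.
  private static final int MAX_POOLED = 4;

  // One reusable scratch key per thread for each label count 1..MAX_POOLED.
  private final ThreadLocal<Key[]> scratch = ThreadLocal.withInitial(() -> {
    Key[] keys = new Key[MAX_POOLED + 1];
    for (int n = 1; n <= MAX_POOLED; n++) {
      keys[n] = new Key(new String[n]);
    }
    return keys;
  });

  private final ConcurrentMap<Key, C> children = new ConcurrentHashMap<>();
  private final Supplier<C> factory;

  PooledLabelLookup(Supplier<C> factory) {
    this.factory = factory;
  }

  // Fixed-arity overloads avoid the array that a varargs call would allocate.
  C labels(String v1) {
    Key key = scratch.get()[1];
    key.values[0] = v1;
    return getOrCreate(key);
  }

  C labels(String v1, String v2) {
    Key key = scratch.get()[2];
    key.values[0] = v1;
    key.values[1] = v2;
    return getOrCreate(key);
  }

  int size() {
    return children.size();
  }

  private C getOrCreate(Key scratchKey) {
    C child = children.get(scratchKey);
    if (child != null) {
      return child; // hot path: existing child found without allocating
    }
    // Slow path: store a private copy so the scratch key can be reused.
    Key stored = new Key(scratchKey.values.clone());
    C created = factory.get();
    C previous = children.putIfAbsent(stored, created);
    return previous != null ? previous : created;
  }
}
```

Adding another label count in this sketch means adding one more overload and raising `MAX_POOLED`. One trade-off to be aware of: the scratch key keeps references to the most recent label values until the next lookup on that thread overwrites them.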

franz1981 and others added 2 commits December 18, 2018 16:25
Introduced pooling of label Names to reduce garbage
in the hot path, updated benchmarks to measure it,
improved SimpleCollector creation when labels are used
with no label names or with a single element.
Introduced a new ArrayList implementation with
faster hashCode/equals to allow faster lookups.

Signed-off-by: Francesco Nigro <nigro.fra@gmail.com>
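
As an illustration of what a list with faster hashCode/equals could look like, here is a hedged sketch. It is not the implementation from the commit above: `FastStringList` is a hypothetical name, and this version is immutable with a precomputed hash, which is a simplification. It keeps the standard `List` contract, so it stays interchangeable with `Arrays.asList(...)` keys, while comparing backing arrays directly when both sides are the same type.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.RandomAccess;

// Hypothetical sketch: an immutable, array-backed list of label values.
final class FastStringList extends AbstractList<String> implements RandomAccess {
  private final String[] items;
  private final int hash;

  FastStringList(String... items) {
    this.items = items.clone();
    // Arrays.hashCode uses the same formula as the List.hashCode contract,
    // so this stays consistent with other List implementations.
    this.hash = Arrays.hashCode(this.items);
  }

  @Override
  public String get(int index) {
    return items[index];
  }

  @Override
  public int size() {
    return items.length;
  }

  @Override
  public int hashCode() {
    return hash; // computed once in the constructor
  }

  @Override
  public boolean equals(Object o) {
    if (o == this) {
      return true;
    }
    if (o instanceof FastStringList) {
      FastStringList other = (FastStringList) o;
      // Fast path: cheap hash check, then a direct array comparison.
      return hash == other.hash && Arrays.equals(items, other.items);
    }
    return super.equals(o); // generic element-by-element List comparison
  }
}
```

Because the hash is cached and the same-type comparison skips iterator creation, a map lookup keyed by such a list does less work per call than one keyed by a general-purpose list.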
@brian-brazil
Contributor Author

Existing:

Benchmark                                                                  Mode  Cnt     Score   Error   Units
LabelsToChildLookupBenchmark.baseline                                      avgt         11.091           ns/op
LabelsToChildLookupBenchmark.baseline:·gc.alloc.rate                       avgt          0.001          MB/sec
LabelsToChildLookupBenchmark.baseline:·gc.alloc.rate.norm                  avgt         ≈ 10⁻⁵            B/op
LabelsToChildLookupBenchmark.baseline:·gc.count                            avgt            ≈ 0          counts
LabelsToChildLookupBenchmark.fiveLabels                                    avgt         75.175           ns/op
LabelsToChildLookupBenchmark.fiveLabels:·gc.alloc.rate                     avgt       1636.069          MB/sec
LabelsToChildLookupBenchmark.fiveLabels:·gc.alloc.rate.norm                avgt        128.000            B/op
LabelsToChildLookupBenchmark.fiveLabels:·gc.churn.PS_Eden_Space            avgt       1794.610          MB/sec
LabelsToChildLookupBenchmark.fiveLabels:·gc.churn.PS_Eden_Space.norm       avgt        140.404            B/op
LabelsToChildLookupBenchmark.fiveLabels:·gc.churn.PS_Survivor_Space        avgt          0.031          MB/sec
LabelsToChildLookupBenchmark.fiveLabels:·gc.churn.PS_Survivor_Space.norm   avgt          0.002            B/op
LabelsToChildLookupBenchmark.fiveLabels:·gc.count                          avgt          3.000          counts
LabelsToChildLookupBenchmark.fiveLabels:·gc.time                           avgt          3.000              ms
LabelsToChildLookupBenchmark.fourLabels                                    avgt         52.120           ns/op
LabelsToChildLookupBenchmark.fourLabels:·gc.alloc.rate                     avgt       1021.469          MB/sec
LabelsToChildLookupBenchmark.fourLabels:·gc.alloc.rate.norm                avgt         56.000            B/op
LabelsToChildLookupBenchmark.fourLabels:·gc.churn.PS_Eden_Space            avgt       1003.297          MB/sec
LabelsToChildLookupBenchmark.fourLabels:·gc.churn.PS_Eden_Space.norm       avgt         55.004            B/op
LabelsToChildLookupBenchmark.fourLabels:·gc.churn.PS_Survivor_Space        avgt          0.062          MB/sec
LabelsToChildLookupBenchmark.fourLabels:·gc.churn.PS_Survivor_Space.norm   avgt          0.003            B/op
LabelsToChildLookupBenchmark.fourLabels:·gc.count                          avgt          2.000          counts
LabelsToChildLookupBenchmark.fourLabels:·gc.time                           avgt          2.000              ms
LabelsToChildLookupBenchmark.oneLabel                                      avgt         36.335           ns/op
LabelsToChildLookupBenchmark.oneLabel:·gc.alloc.rate                       avgt       1267.213          MB/sec
LabelsToChildLookupBenchmark.oneLabel:·gc.alloc.rate.norm                  avgt         48.000            B/op
LabelsToChildLookupBenchmark.oneLabel:·gc.churn.PS_Eden_Space              avgt        998.861          MB/sec
LabelsToChildLookupBenchmark.oneLabel:·gc.churn.PS_Eden_Space.norm         avgt         37.835            B/op
LabelsToChildLookupBenchmark.oneLabel:·gc.churn.PS_Survivor_Space          avgt          0.031          MB/sec
LabelsToChildLookupBenchmark.oneLabel:·gc.churn.PS_Survivor_Space.norm     avgt          0.001            B/op
LabelsToChildLookupBenchmark.oneLabel:·gc.count                            avgt          2.000          counts
LabelsToChildLookupBenchmark.oneLabel:·gc.time                             avgt          2.000              ms
LabelsToChildLookupBenchmark.threeLabels                                   avgt         41.139           ns/op
LabelsToChildLookupBenchmark.threeLabels:·gc.alloc.rate                    avgt       1296.388          MB/sec
LabelsToChildLookupBenchmark.threeLabels:·gc.alloc.rate.norm               avgt         56.000            B/op
LabelsToChildLookupBenchmark.threeLabels:·gc.churn.PS_Eden_Space           avgt       1006.037          MB/sec
LabelsToChildLookupBenchmark.threeLabels:·gc.churn.PS_Eden_Space.norm      avgt         43.458            B/op
LabelsToChildLookupBenchmark.threeLabels:·gc.churn.PS_Survivor_Space       avgt          0.062          MB/sec
LabelsToChildLookupBenchmark.threeLabels:·gc.churn.PS_Survivor_Space.norm  avgt          0.003            B/op
LabelsToChildLookupBenchmark.threeLabels:·gc.count                         avgt          2.000          counts
LabelsToChildLookupBenchmark.threeLabels:·gc.time                          avgt          1.000              ms
LabelsToChildLookupBenchmark.twoLabels                                     avgt         44.241           ns/op
LabelsToChildLookupBenchmark.twoLabels:·gc.alloc.rate                      avgt       1031.930          MB/sec
LabelsToChildLookupBenchmark.twoLabels:·gc.alloc.rate.norm                 avgt         48.000            B/op
LabelsToChildLookupBenchmark.twoLabels:·gc.churn.PS_Eden_Space             avgt       1004.470          MB/sec
LabelsToChildLookupBenchmark.twoLabels:·gc.churn.PS_Eden_Space.norm        avgt         46.723            B/op
LabelsToChildLookupBenchmark.twoLabels:·gc.churn.PS_Survivor_Space         avgt          0.031          MB/sec
LabelsToChildLookupBenchmark.twoLabels:·gc.churn.PS_Survivor_Space.norm    avgt          0.001            B/op
LabelsToChildLookupBenchmark.twoLabels:·gc.count                           avgt          2.000          counts
LabelsToChildLookupBenchmark.twoLabels:·gc.time                            avgt          2.000              ms

Now:

-wi 1 -i 1 -f 1 -t 2 -prof gc

Benchmark                                                                 Mode  Cnt    Score   Error   Units
LabelsToChildLookupBenchmark.baseline                                     avgt        11.743           ns/op
LabelsToChildLookupBenchmark.baseline:·gc.alloc.rate                      avgt         0.001          MB/sec
LabelsToChildLookupBenchmark.baseline:·gc.alloc.rate.norm                 avgt        ≈ 10⁻⁵            B/op
LabelsToChildLookupBenchmark.baseline:·gc.count                           avgt           ≈ 0          counts
LabelsToChildLookupBenchmark.fiveLabels                                   avgt        68.673           ns/op
LabelsToChildLookupBenchmark.fiveLabels:·gc.alloc.rate                    avgt       562.481          MB/sec
LabelsToChildLookupBenchmark.fiveLabels:·gc.alloc.rate.norm               avgt        40.000            B/op
LabelsToChildLookupBenchmark.fiveLabels:·gc.churn.PS_Eden_Space           avgt       501.133          MB/sec
LabelsToChildLookupBenchmark.fiveLabels:·gc.churn.PS_Eden_Space.norm      avgt        35.637            B/op
LabelsToChildLookupBenchmark.fiveLabels:·gc.churn.PS_Survivor_Space       avgt         1.336          MB/sec
LabelsToChildLookupBenchmark.fiveLabels:·gc.churn.PS_Survivor_Space.norm  avgt         0.095            B/op
LabelsToChildLookupBenchmark.fiveLabels:·gc.count                         avgt         2.000          counts
LabelsToChildLookupBenchmark.fiveLabels:·gc.time                          avgt         2.000              ms
LabelsToChildLookupBenchmark.fourLabels                                   avgt        48.986           ns/op
LabelsToChildLookupBenchmark.fourLabels:·gc.alloc.rate                    avgt         0.001          MB/sec
LabelsToChildLookupBenchmark.fourLabels:·gc.alloc.rate.norm               avgt        ≈ 10⁻⁴            B/op
LabelsToChildLookupBenchmark.fourLabels:·gc.count                         avgt           ≈ 0          counts
LabelsToChildLookupBenchmark.oneLabel                                     avgt        34.478           ns/op
LabelsToChildLookupBenchmark.oneLabel:·gc.alloc.rate                      avgt         0.001          MB/sec
LabelsToChildLookupBenchmark.oneLabel:·gc.alloc.rate.norm                 avgt        ≈ 10⁻⁴            B/op
LabelsToChildLookupBenchmark.oneLabel:·gc.count                           avgt           ≈ 0          counts
LabelsToChildLookupBenchmark.threeLabels                                  avgt        41.966           ns/op
LabelsToChildLookupBenchmark.threeLabels:·gc.alloc.rate                   avgt         0.001          MB/sec
LabelsToChildLookupBenchmark.threeLabels:·gc.alloc.rate.norm              avgt        ≈ 10⁻⁴            B/op
LabelsToChildLookupBenchmark.threeLabels:·gc.count                        avgt           ≈ 0          counts
LabelsToChildLookupBenchmark.twoLabels                                    avgt        43.499           ns/op
LabelsToChildLookupBenchmark.twoLabels:·gc.alloc.rate                     avgt         0.001          MB/sec
LabelsToChildLookupBenchmark.twoLabels:·gc.alloc.rate.norm                avgt        ≈ 10⁻⁴            B/op
LabelsToChildLookupBenchmark.twoLabels:·gc.count                          avgt           ≈ 0          counts

@njhill

njhill commented Nov 22, 2019

@brian-brazil I've opened #514, see what you think. The thread-local pooling logic in this PR could easily be encapsulated as an implementation of ConcurrentChildMap.
