Testing out a different HashSet implementation #5

hmottestad · 2023-10-30T11:45:00Z

Before

Benchmark                                                            Mode  Cnt     Score     Error  Units
BasicProcessingAlgorithmsBenchmark.compactDatagovbeDcat              avgt    5  2136.886 ± 117.612  ms/op
BasicProcessingAlgorithmsBenchmark.compactDatagovbeDcatEmptyContext  avgt    5  1254.016 ±  45.229  ms/op
BasicProcessingAlgorithmsBenchmark.expandDatagovbeDcatFromCompact    avgt    5   376.505 ±  21.036  ms/op
BasicProcessingAlgorithmsBenchmark.expandDatagovbeDcatFromFlatten    avgt    5   382.821 ±  14.753  ms/op
BasicProcessingAlgorithmsBenchmark.flattenDatagovbeDcat              avgt    5   830.187 ±  74.280  ms/op
BasicProcessingAlgorithmsBenchmark.flattenDatagovbeDcatFromCompact   avgt    5  4268.813 ± 318.006  ms/op
ToRdfLargeFilesBenchmark.datagovbeDcat                               avgt    5  1432.005 ±  62.780  ms/op
ToRdfSmallFilesBenchmark.csiro                                       avgt    5     0.079 ±   0.002  ms/op
ToRdfSmallFilesBenchmark.difiDataset                                 avgt    5     1.042 ±   0.009  ms/op
ToRdfSmallFilesBenchmark.geonorge                                    avgt    5     0.008 ±   0.003  ms/op
ToRdfSmallFilesBenchmark.schemaExample1                              avgt    5     0.869 ±   0.020  ms/op
ToRdfSmallFilesBenchmark.schemaExample2                              avgt    5     0.854 ±   0.022  ms/op
ToRdfSmallFilesBenchmark.schemaExample3                              avgt    5     0.888 ±   0.042  ms/op
ToRdfSmallFilesBenchmark.schemaExample4                              avgt    5     0.890 ±   0.022  ms/op
ToRdfSmallFilesBenchmark.schemaExtBib                                avgt    5     1.059 ±   0.027  ms/op
ToRdfSmallFilesBenchmark.schemaExtHealthLifeSci                      avgt    5     2.724 ±   0.024  ms/op
ToRdfSmallFilesBenchmark.schemaExtMeta                               avgt    5     0.948 ±   0.196  ms/op

# Benchmark: no.hasmac.jsonld.benchmark.OOMBenchmark.datagovbeDcatToRdf

# Run progress: 0.59% complete, ETA 1 days, 06:39:54
# Fork: 1 of 1
Iteration   1: 1919.005 ms/op
Iteration   2: 822.951 ms/op
Iteration   3: 760.541 ms/op
Iteration   4: 896.373 ms/op
Iteration   5: 894.446 ms/op
Iteration   6: 757.024 ms/op
Iteration   7: 762.095 ms/op
Iteration   8: Terminating due to java.lang.OutOfMemoryError: Java heap space

After

Benchmark                                                            Mode  Cnt     Score     Error  Units
BasicProcessingAlgorithmsBenchmark.compactDatagovbeDcat              avgt    5  2121.251 ±  39.567  ms/op
BasicProcessingAlgorithmsBenchmark.compactDatagovbeDcatEmptyContext  avgt    5  1210.684 ± 122.861  ms/op
BasicProcessingAlgorithmsBenchmark.expandDatagovbeDcatFromCompact    avgt    5   385.342 ±  16.798  ms/op
BasicProcessingAlgorithmsBenchmark.expandDatagovbeDcatFromFlatten    avgt    5   406.715 ±  18.744  ms/op
BasicProcessingAlgorithmsBenchmark.flattenDatagovbeDcat              avgt    5   839.252 ±  22.937  ms/op
BasicProcessingAlgorithmsBenchmark.flattenDatagovbeDcatFromCompact   avgt    5  4554.356 ± 206.368  ms/op
ToRdfLargeFilesBenchmark.datagovbeDcat                               avgt    5  1136.119 ±  88.865  ms/op
ToRdfSmallFilesBenchmark.csiro                                       avgt    5     0.085 ±   0.001  ms/op
ToRdfSmallFilesBenchmark.difiDataset                                 avgt    5     1.210 ±   0.006  ms/op
ToRdfSmallFilesBenchmark.geonorge                                    avgt    5     0.009 ±   0.001  ms/op
ToRdfSmallFilesBenchmark.schemaExample1                              avgt    5     0.880 ±   0.008  ms/op
ToRdfSmallFilesBenchmark.schemaExample2                              avgt    5     0.867 ±   0.005  ms/op
ToRdfSmallFilesBenchmark.schemaExample3                              avgt    5     0.909 ±   0.004  ms/op
ToRdfSmallFilesBenchmark.schemaExample4                              avgt    5     0.894 ±   0.002  ms/op
ToRdfSmallFilesBenchmark.schemaExtBib                                avgt    5     1.075 ±   0.003  ms/op
ToRdfSmallFilesBenchmark.schemaExtHealthLifeSci                      avgt    5     2.923 ±   0.025  ms/op
ToRdfSmallFilesBenchmark.schemaExtMeta                               avgt    5     0.931 ±   0.002  ms/op

# Benchmark: no.hasmac.jsonld.benchmark.OOMBenchmark.datagovbeDcatToRdf

# Run progress: 0.59% complete, ETA 1 days, 06:38:39
# Fork: 1 of 1
Iteration   1: 1798.994 ms/op
Iteration   2: 1029.074 ms/op
Iteration   3: 920.315 ms/op
Iteration   4: 880.087 ms/op
Iteration   5: 843.231 ms/op
Iteration   6: 810.543 ms/op
Iteration   7: 801.672 ms/op
Iteration   8: 811.611 ms/op
Iteration   9: Terminating due to java.lang.OutOfMemoryError: Java heap space

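ToRdfLargeFilesBenchmark.datagovbeDcat drops from 1432 to 1136 ms/op (about 21% less time per operation, or roughly 1.26x the throughput), but some of the ToRdfSmallFilesBenchmarks are slower, e.g. difiDataset goes from 1.042 to 1.210 ms/op (about 16% more time).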
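
For anyone re-checking the before/after numbers, here is a minimal sketch of the arithmetic. `RelativeChange` and `percentChange` are hypothetical helper names, not part of this project or of JMH. Since the scores are average time per operation (ms/op), a negative result means less time per operation.

```java
/** Hypothetical helper for comparing two JMH avgt scores (ms/op). */
final class RelativeChange {

    private RelativeChange() {
    }

    /**
     * Percentage change in time per operation going from {@code before} to {@code after}.
     * Negative values mean the "after" run needs less time per operation.
     */
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    /** Throughput ratio: how many times more operations per unit of time after the change. */
    static double speedup(double before, double after) {
        return before / after;
    }
}
```

With the scores from the tables, `percentChange(1432.005, 1136.119)` comes out at roughly -20.7 and `speedup(1432.005, 1136.119)` at roughly 1.26, while `percentChange(1.042, 1.210)` for difiDataset is roughly +16.1. The error bars reported by JMH are not taken into account here.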

hmottestad · 2023-10-30T11:46:57Z

There is also a risk of making some other use cases dramatically slower because of the use of Object2ObjectArrayMap, which scales terribly as the number of entries grows (lookups are a linear scan) but is much more efficient when there are only 1 or 2 items.

hmottestad · 2023-12-01T09:07:14Z

To continue with this branch we would need to create a hybrid map implementation that starts off using the Object2ObjectArrayMap and swaps it out for a more scalable map once more than 2-3 items have been inserted.
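
A rough sketch of what such a hybrid map could look like, under some loudly stated assumptions: `HybridMap`, `THRESHOLD` and `isHashBacked` are hypothetical names, the cut-over point of 3 is only a guess based on the "2-3 items" above, and plain arrays plus `java.util.HashMap` stand in for Object2ObjectArrayMap and whatever scalable map would actually be chosen. It only covers `get`, `put` and `size`, not the full `Map` interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/**
 * Hypothetical sketch of a hybrid map: a linear scan over small arrays while the map is tiny,
 * switching to a HashMap once more than THRESHOLD distinct keys have been inserted.
 */
final class HybridMap<K, V> {

    // Assumed cut-over point, not a measured value.
    private static final int THRESHOLD = 3;

    private Object[] keys = new Object[THRESHOLD];
    private Object[] values = new Object[THRESHOLD];
    private int size;

    // Stays null while the map is still in array mode.
    private Map<K, V> large;

    @SuppressWarnings("unchecked")
    V get(Object key) {
        if (large != null) {
            return large.get(key);
        }
        for (int i = 0; i < size; i++) {
            if (Objects.equals(keys[i], key)) {
                return (V) values[i];
            }
        }
        return null;
    }

    @SuppressWarnings("unchecked")
    V put(K key, V value) {
        if (large != null) {
            return large.put(key, value);
        }
        // Overwrite in place if the key is already present.
        for (int i = 0; i < size; i++) {
            if (Objects.equals(keys[i], key)) {
                V old = (V) values[i];
                values[i] = value;
                return old;
            }
        }
        if (size == THRESHOLD) {
            // The arrays are full: copy everything into a hash map and drop the arrays.
            large = new HashMap<>();
            for (int i = 0; i < size; i++) {
                large.put((K) keys[i], (V) values[i]);
            }
            keys = null;
            values = null;
            return large.put(key, value);
        }
        keys[size] = key;
        values[size] = value;
        size++;
        return null;
    }

    int size() {
        return large != null ? large.size() : size;
    }

    /** True once the map has switched from array mode to the hash map. */
    boolean isHashBacked() {
        return large != null;
    }
}
```

One thing this sketch ignores is iteration order: the array mode naturally keeps insertion order, whereas `HashMap` does not, so a `LinkedHashMap` might be the safer replacement if any caller depends on ordering. The switch is also one-way; the map never shrinks back to array mode.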

hmottestad added 4 commits October 29, 2023 21:58

decent performance improvement

2fe82bd

some more performance improvement

1cfd996

more performance improvements

c470abc

more performance improvements

4bf417f

hmottestad marked this pull request as ready for review October 30, 2023 11:45
