@mrk-andreev
Contributor

What changes were proposed in this pull request?

Add benchmarks for all codepaths of initCap, namely, paths that call:

  • execBinaryICU
  • execBinary
  • execLowercase
  • execICU

Why are the changes needed?

Requested in Jira ticket SPARK-49490.

Does this PR introduce any user-facing change?

No

How was this patch tested?

The benchmark was tested locally by performing a manual run.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Oct 16, 2024
@mrk-andreev
Contributor Author

Results of a local run: InitCapBenchmark-local.txt

Sample

Running benchmark: InitCap evaluation [wc=1000, wl=16, capitalized=false]
  Running case: execICU
  Stopped after 8978 iterations, 2000 ms
  Running case: execBinaryICU
  Stopped after 6235 iterations, 2000 ms
  Running case: execBinary
  Stopped after 28374 iterations, 2000 ms
  Running case: execLowercase
  Stopped after 8839 iterations, 2000 ms

OpenJDK 64-Bit Server VM 17.0.2+8-86 on Linux 5.15.0-122-generic
Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
InitCap evaluation [wc=1000, wl=16, capitalized=false]:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------------------------------
execICU                                                             0              0           0     432768.3           0.0       1.0X
execBinaryICU                                                       0              0           0     285450.1           0.0       0.7X
execBinary                                                          0              0           0    1494256.8           0.0       3.5X
execLowercase                                                       0              0           0     415082.4           0.0       1.0X
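
The Relative column can be cross-checked against the Rate(M/s) column. The sketch below assumes that Relative is each case's rate divided by the rate of the first case, which is consistent with the sample above. The object and method names are illustrative and not part of Spark.

```scala
import java.util.Locale

object RelativeColumn {
  // Rate(M/s) values copied from the sample output above.
  val rates: Seq[(String, Double)] = Seq(
    "execICU" -> 432768.3,
    "execBinaryICU" -> 285450.1,
    "execBinary" -> 1494256.8,
    "execLowercase" -> 415082.4
  )

  // Assumption: Relative = rate of the case / rate of the first (baseline) case,
  // printed with one decimal place and an "X" suffix.
  def relative(cases: Seq[(String, Double)]): Seq[(String, String)] = {
    val baseline = cases.head._2
    cases.map { case (name, rate) =>
      name -> ("%.1f".formatLocal(Locale.ROOT, rate / baseline) + "X")
    }
  }
}
```

Under that assumption, the Relative values remain informative even when the Best/Avg Time columns round to 0.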

Open questions

  1. Should we place the benchmark code in the same package, 'unsafe,' or at the 'SQL level'? If it's in 'unsafe,' should we extract the shared code for benchmarks into a shared library?
  2. The benchmark output expects each measurement to be at least 1 ms, but this isn't the case here. Should we align the rounding to the first non-zero digit after the decimal point?
  3. How detailed do we expect the benchmarks to be? Do we want several axes of variation, or should we stick to a single set of default parameters?
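
To make question 2 concrete, here is a minimal sketch of rounding to the first non-zero digit after the decimal point. `formatMs` and `maxDecimals` are hypothetical names; this is not how Spark's Benchmark class currently formats its output.

```scala
import java.util.Locale

object SubMillisFormat {
  // Hypothetical helper, not part of Spark's Benchmark class.
  // Values >= 1 ms (and exact zero) keep the current whole-number format;
  // smaller values get just enough decimals to reach the first non-zero digit.
  def formatMs(ms: Double, maxDecimals: Int = 6): String = {
    require(ms >= 0.0, "duration must be non-negative")
    if (ms == 0.0 || ms >= 1.0) {
      "%.0f".formatLocal(Locale.ROOT, ms)
    } else {
      // Decimals needed to reach the first non-zero digit, e.g. 0.07 -> 2.
      val needed = math.ceil(-math.log10(ms)).toInt
      // Cap the precision so extremely small values do not widen the table.
      val decimals = math.min(math.max(needed, 1), maxDecimals)
      s"%.${decimals}f".formatLocal(Locale.ROOT, ms)
    }
  }
}
```

With this sketch, a 0.07 ms measurement would print as 0.07 instead of 0, while measurements of 1 ms or more would print as they do today.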

@HyukjinKwon
Member

Can we include the benchmark result files too? See also "Testing with GitHub Actions workflow" at https://spark.apache.org/developer-tools.html

Member

@MaxGekk MaxGekk left a comment

Should we place the benchmark code in the same package, 'unsafe,' or at the 'SQL level'

Let's place the benchmark at the SQL level for now.

@mrk-andreev
Contributor Author

Let's place the benchmark at the SQL level for now.

Done

Can we include the benchmark result files too?

Done

Member

@MaxGekk MaxGekk left a comment

Could you generate benchmark results for JDK 21 too?

Member

Let's bump the number of iterations to see seconds in Best/Avg Time.

Contributor Author

I adjusted the word count for my Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz, but encountered issues with local evaluation. This led to a remote evaluation on an Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, where the performance was noticeably less impressive.

@MaxGekk
Member

MaxGekk commented Oct 22, 2024

@uros-db @mihailom-db @viktorluc-db Could you review this PR, please?

Contributor

@uros-db uros-db left a comment

We already have benchmarks for collations.

Please see: CollationBenchmark

Member

Please fix the indentation here; see https://github.com/databricks/scala-style-guide?tab=readme-ov-file#spacing-and-indentation. Alternatively, place the parameters on the same line.

Contributor Author

Fixed

Member

Could you benchmark more collations? See, for example:

Seq("UTF8_BINARY", "UTF8_LCASE", "UNICODE", "UNICODE_CI")

Contributor Author

@mrk-andreev mrk-andreev Oct 29, 2024

Extended with

for (collationName <- List("he_ISR", "UNICODE", "UNICODE_CI")) {
    val collationId = CollationFactory.collationNameToId(collationName)
    assert(CollationFactory.fetchCollation(collationId).collator != null)
    val caseName = s"execICU[collationName=${collationName}]"
    benchmark.addCase(caseName)(_ => InitCap.execICU(text, collationId))
}

The primary requirement for collationId in InitCap.execICU is that CollationFactory.fetchCollation(collationId).collator must not be null; otherwise, the function will throw an NPE.
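
The non-null requirement can be illustrated with stand-in types. This is only a sketch: `StubCollation` and the registry below are hypothetical, not Spark's CollationFactory, and which entries carry a null collator is an assumption made for illustration.

```scala
object CollatorGuardSketch {
  // Stand-in for a collation entry; NOT Spark's CollationFactory.Collation.
  final case class StubCollation(name: String, collator: AnyRef)

  // Hypothetical registry. Which entries have a null collator is assumed here.
  private val registry: Map[String, StubCollation] = Seq(
    StubCollation("UTF8_BINARY", null),
    StubCollation("UTF8_LCASE", null),
    StubCollation("UNICODE", new Object),
    StubCollation("UNICODE_CI", new Object)
  ).map(c => c.name -> c).toMap

  // Keep only the names whose collator is present, so that an ICU-only
  // code path is never handed a collation with a null collator.
  def icuCapable(names: Seq[String]): Seq[String] =
    names.filter(n => registry.get(n).exists(c => Option(c.collator).isDefined))
}
```

Filtering the candidate names this way before calling benchmark.addCase would play the same role as the assert in the loop above, but it skips unsupported collations instead of failing.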

Contributor Author

Updated

  • InitCapBenchmark-results.txt
  • InitCapBenchmark-jdk21-results.txt

@mrk-andreev mrk-andreev force-pushed the SPARK-49490 branch 2 times, most recently from 31632c9 to 6b1d79e Compare November 3, 2024 16:44
@MaxGekk
Member

MaxGekk commented Nov 7, 2024

@mrk-andreev Could you integrate your benchmark into CollationBenchmark, please, as @uros-db pointed out in #48501 (review)? Otherwise we might forget to re-run your benchmark while benchmarking collation-related code.

@mrk-andreev mrk-andreev force-pushed the SPARK-49490 branch 2 times, most recently from 5bf2fba to 6c336c2 Compare November 12, 2024 21:20
@mrk-andreev
Contributor Author

@mrk-andreev Could you integrate your benchmark into CollationBenchmark, please, as @uros-db pointed out in #48501 (review)? Otherwise we might forget to re-run your benchmark while benchmarking collation-related code.

@MaxGekk, done.

Member

@MaxGekk MaxGekk left a comment

Also cc @stevomitric, who is working on the same benchmarks in #48804.

Comment on lines 196 to 198
Member

Could you fix the indentation here?

Contributor Author

Fixed. My bad

Member

Could you increase the number of iterations to get a non-zero Stdev and make the benchmark more reliable?

Contributor Author

Sorry. Fixed.

I re-ran only the initCap-related benchmarks.

Contributor Author

As we can see, this benchmark is sensitive to CPU clock speed. During my latest measurements on an Intel(R) Xeon(R) Platinum 8252C CPU @ 3.80GHz (AWS m5zn.xlarge), the stdev for some measurements, along with others, dropped to zero or one.

I suggest adding more decimal places to the results in a separate PR.

Contributor

@stevomitric stevomitric Nov 15, 2024

We recently merged a fix for these benchmarks here #48804, so this regression is outdated.

Could you please sync with master and re-run the benchmarks so that we don't commit outdated results?

Contributor Author

Fixed

@mrk-andreev
Contributor Author

cc: @MaxGekk

Related work

This is not related to my code changes but rather to the benchmarks we are modifying. It might be worth starting a separate thread in the dev mailing list or creating an additional ticket in Jira, which I would be happy to handle.

Blackhole

I would like to point out that the current implementation of org.apache.spark.benchmark.Benchmark::addCase does not use any form of Blackhole (Blackhole in JMH), which could lead to dead-code elimination. However, I have not observed this issue in the existing tests. This is likely due to the complexity and side effects of the code being benchmarked, which prevents such elimination.

Would it be a good idea to consider adding this as a feature in the future?

Context

org.apache.spark.benchmark.Benchmark::addCase

  def addCase(name: String, numIters: Int = 0)(f: Int => Unit): Unit = {
    addTimerCase(name, numIters) { timer =>
      timer.startTiming()
      f(timer.iteration)
      timer.stopTiming()
    }
  }
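
To illustrate the idea, here is a minimal sketch of a Blackhole-like sink that could wrap a benchmark body before it is passed to addCase. `Sink` and `consuming` are hypothetical names. The volatile write is only a heuristic barrier against dead-code elimination; it is not JMH's Blackhole implementation and not part of Spark.

```scala
object BlackholeSketch {
  // Hypothetical minimal sink, loosely inspired by the idea behind JMH's Blackhole.
  final class Sink {
    // Volatile write: makes the benchmarked result observable, so the JIT
    // is less likely to discard the computation that produced it.
    @volatile private var last: Any = null
    private var count: Long = 0L

    def consume(value: Any): Unit = {
      last = value
      count += 1
    }

    // Number of values consumed so far (useful as a sanity check).
    def consumed: Long = count
  }

  // Adapts a value-producing body to the `Int => Unit` shape that addCase expects,
  // routing every result through the sink.
  def consuming[A](sink: Sink)(body: Int => A): Int => Unit =
    i => sink.consume(body(i))
}
```

A case would then be registered by passing `BlackholeSketch.consuming(sink)(_ => InitCap.execICU(text, collationId))` as the body of addCase, so each result reaches the sink instead of being dropped.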

Async-profiler

I suggest adding Async Profiler, a low-overhead sampling profiler, to all benchmark runs. This will help us identify the causes of performance degradation.

Would it also be worth considering adding this as a feature in the future?

@mrk-andreev
Contributor Author

Hi @MaxGekk, @stevomitric,

Does this PR need any additional changes? Are there any blockers we should address? Let me know how I can help to move it forward!

Member

@MaxGekk MaxGekk left a comment

LGTM except for a few minor comments.

Member

nit: the enclosing braces are redundant:

Suggested change
- s"collation unit benchmarks - initCap using impl ${implName}",
+ s"collation unit benchmarks - initCap using $implName",

Contributor Author

Fixed

Member

nit:

Suggested change
- benchmark.addCase(s"$collationType") { _ =>
+ benchmark.addCase(collationType) { _ =>

Contributor Author

Fixed

Member

It is a collation ID, and types should begin with an uppercase letter.

Contributor Author

Thank you. Fixed

@MaxGekk
Member

MaxGekk commented Nov 21, 2024

+1, LGTM. Merging to master.
Thank you, @mrk-andreev, and @stevomitric and @uros-db for the review.

@MaxGekk MaxGekk closed this in 95faa02 Nov 21, 2024