
Very slow startup: can this be mitigated? #211

Open
rschwiebert opened this issue Oct 10, 2024 · 3 comments

@rschwiebert

rschwiebert commented Oct 10, 2024

After noticing that the first computation on multik arrays seemed slow, I was surprised to find that the following toy script

import org.jetbrains.kotlinx.multik.api.linalg.dot
import org.jetbrains.kotlinx.multik.api.mk
import org.jetbrains.kotlinx.multik.api.ndarray
import kotlin.random.Random


fun main() {
    repeat(10) {
        val a = mk.ndarray(listOf(listOf(Random.nextDouble())))
        val b = mk.ndarray(listOf(listOf(Random.nextDouble())))
        val start = System.nanoTime()
        a dot b
        println(System.nanoTime() - start)
    }
}

reliably produced timings on this order (in nanoseconds) when run in IntelliJ:

12440996083
58417
18541
14458
15042
16833
14041
15042
36708
67042

Before I knew the magnitude of the difference between the first item and the rest, I was writing it off as some sort of caching effect. But now it really seems like some sort of lazy initialization, JIT compilation, or something else that has nothing to do with the computation at hand.

Can this be mitigated? I tried both 0.2.2 and 0.2.3 with the same results. My laptop is an M1 Mac, but I'm ultimately interested in running this on Ubuntu distros.

Thanks in advance for your advice.

New info:
When I run the same test on an Ubuntu instance, the delay does seem to be orders of magnitude smaller (only 0.2s). Maybe this has something to do with my laptop's architecture.

@devcrocod
Collaborator

Hi, thank you for the report. It's really interesting stuff.

Your assessment is correct: the first execution is indeed longer than later ones. The exact numbers and the size of the gap may not be entirely accurate, since many factors can influence them.

There are two main reasons why the first computation is slower. And yes, you're right, it is lazy initialization, which works as follows:

  1. The implementation engine (Kotlin or native) is located. This takes little computational time, but there is some overhead associated with it.
  2. With the Kotlin engine, the computation proceeds immediately afterward. With the native engine, the native library is loaded first (System.load), which takes the majority of the time during this call. After that, a JNI call occurs, which incurs slightly more overhead on its first invocation.
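
To make the effect of lazy initialization concrete, here is a toy model using only the Kotlin standard library. It is not multik's code: the ToyEngine class, its fields, and the 200 ms sleep are all made up to stand in for a one-time setup cost such as loading a native library.

```kotlin
import kotlin.time.measureTime

// Toy stand-in for an engine whose expensive setup is deferred until first use.
// NOT multik's implementation; the names and the 200 ms delay are invented.
class ToyEngine {
    var initCount = 0
        private set

    // `by lazy` runs this initializer exactly once, on first access.
    private val scale: Double by lazy {
        Thread.sleep(200) // pretend to locate the engine and load a library
        initCount++
        1.0
    }

    // The first call pays for the lazy setup; later calls reuse the cached value.
    fun dot(x: Double, y: Double): Double = x * y * scale
}

fun main() {
    val engine = ToyEngine()
    repeat(3) { i ->
        val elapsed = measureTime { engine.dot(2.0, 3.0) }
        println("call $i: $elapsed")
    }
}
```

Under these assumptions, only the first call should show the setup delay, which mirrors the shape of the timings reported above.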

It is not possible to eliminate or speed up the native library loading process; this behavior is inherent to the JVM itself. You can use the pure Kotlin implementation, multik-kotlin, which has no native library. However, the computations will be slower in this case.

You can also load the native library ahead of time by calling mk.math or mk.linalg before your main computations.
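
A warm-up along these lines might look like the sketch below. It assumes the same multik dependency and imports as the script in the original post. The helper name warmUp is made up, and the throwaway 1x1 dot product is just one way to touch the linear algebra engine before the timed work begins.

```kotlin
import org.jetbrains.kotlinx.multik.api.linalg.dot
import org.jetbrains.kotlinx.multik.api.mk
import org.jetbrains.kotlinx.multik.api.ndarray
import kotlin.random.Random
import kotlin.time.Duration
import kotlin.time.measureTime

// Hypothetical helper: runs one throwaway dot product so that engine lookup and
// (for the native engine) library loading happen here, not in the timed code.
fun warmUp(): Duration = measureTime {
    val tiny = mk.ndarray(listOf(listOf(1.0)))
    tiny dot tiny
}

fun main() {
    println("warm-up (includes one-time setup): ${warmUp()}")

    // Same loop as the original script, now run after the warm-up.
    repeat(10) {
        val a = mk.ndarray(listOf(listOf(Random.nextDouble())))
        val b = mk.ndarray(listOf(listOf(Random.nextDouble())))
        println(measureTime { a dot b })
    }
}
```

This does not make loading faster. It only moves the one-time cost to a point you choose, such as application startup.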

If it is critical for you to measure performance, including the first computation, I recommend using JMH or kotlinx-benchmark. These measure the specific computation more accurately and reduce the impact of external factors, such as other processes.

PS: Kotlin has a convenient function measureTime - https://kotlinlang.org/docs/time-measurement.html#measure-time
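
As a rough illustration of measureTime, the sketch below times the same block several times and reports the first call separately from the rest. The timeRuns helper and the summing workload are placeholders invented for this example; a real measurement would substitute the multik call of interest, and a benchmark harness remains the better tool for precise numbers.

```kotlin
import kotlin.time.Duration
import kotlin.time.measureTime

// Hypothetical helper: times `block` `runs` times and returns each duration in order.
fun timeRuns(runs: Int, block: () -> Unit): List<Duration> =
    List(runs) { measureTime(block) }

fun main() {
    // Placeholder workload; swap in the computation you actually care about.
    val timings = timeRuns(10) { (1..100_000).sumOf { it.toLong() } }

    // Reporting the first run separately keeps one-time costs out of the steady-state figure.
    println("first call: ${timings.first()}")
    println("fastest later call: ${timings.drop(1).min()}")
}
```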

@hakanai

hakanai commented Jan 2, 2025

> You can use a pure kotlin implementation, multik-kotlin, without a native library. However, the computations will be slower in this case.

Wouldn't be so sure about that.

* DEFAULT (DefaultEngineType - org.jetbrains.kotlinx.multik.default.DefaultEngine)
  Render time: 1m 25.686367600s

* KOTLIN (KEEngineType - org.jetbrains.kotlinx.multik.kotlin.KEEngine)
  Render time: 1m 10.005497300s

* NATIVE (NativeEngineType - org.jetbrains.kotlinx.multik.openblas.JvmNativeEngine)
  Render time: 1m 26.825962600s

@rschwiebert
Author

rschwiebert commented Jan 2, 2025

> Wouldn't be so sure about that.
>
> * DEFAULT (DefaultEngineType - org.jetbrains.kotlinx.multik.default.DefaultEngine)
>   Render time: 1m 25.686367600s
>
> * KOTLIN (KEEngineType - org.jetbrains.kotlinx.multik.kotlin.KEEngine)
>   Render time: 1m 10.005497300s
>
> * NATIVE (NativeEngineType - org.jetbrains.kotlinx.multik.openblas.JvmNativeEngine)
>   Render time: 1m 26.825962600s

@hakanai I think you must have accidentally included the library load time in whatever timing you did. I bet if you do a "warm up" computation (which will force the libraries to load) and then time your computation, the default and native numbers will drop below the Kotlin one.

When I tested that way, the prediction above was borne out. I was definitely seeing 20s or so of load time for native libraries, and after excluding that they outperformed the pure Kotlin engine.
