Java APIs to fetch CUDA runtime info [skip ci] #8465

sperlingxx · 2021-06-09T09:52:52Z

Closes #8084 #6363

Signed-off-by: sperlingxx <lovedreamf@gmail.com>

java/src/main/java/ai/rapids/cudf/CudaComputeMode.java

java/src/test/java/ai/rapids/cudf/RmmTest.java

java/src/main/java/ai/rapids/cudf/Cuda.java

revans2 · 2021-06-09T14:12:00Z

java/src/main/java/ai/rapids/cudf/CudaComputeMode.java

+public enum CudaComputeMode {
+  cudaComputeModeDefault(0),
+  cudaComputeModeExclusive(1),
+  cudaComputeModeProhibited(2),


These enums do not match the Java naming convention, which is UPPER_CASE_WITH_UNDERSCORES. Also, because all of them are under the CudaComputeMode class, we don't need to prefix each one with cudaComputeMode.

Could you please change them to

public enum CudaComputeMode {
  DEFAULT(0),
  EXCLUSIVE(1),
  PROHIBITED(2),
  EXCLUSIVE_PROCESS(3);
  ...

I applied the UPPER_CASE_WITH_UNDERSCORES style.

revans2 · 2021-06-09T14:15:13Z

java/src/main/java/ai/rapids/cudf/Cuda.java

+   */
+  public static CudaComputeMode getComputeMode() {
+    int nativeMode = Cuda.getNativeComputeMode();
+    switch (nativeMode) {


Generally, for other enums we put the mapping from the native value in the enum class itself. That way all of the info is in a single file and stays consistent. We can also make the code a lot smaller if we don't worry about performance. Because this is not going to be called in a tight loop where performance matters, I would much rather have smaller code with guaranteed consistency. You can use BinaryOp.fromNative as an example.

static BinaryOp fromNative(int nativeId) {
  for (BinaryOp type : OPS) {
    if (type.nativeId == nativeId) {
      return type;
    }
  }
  throw new IllegalArgumentException("Could not translate " + nativeId + " into a BinaryOp");
}

I changed how the native mapping is done. It now follows the style above.
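For readers following the thread, here is a minimal sketch of what the suggested pattern could look like when applied to the compute-mode enum. It is not the merged cudf source: the name `ComputeModeSketch` and the `MODES` cache are hypothetical, and the native ids are taken from the constants quoted in the review comment above.

```java
// Sketch only: illustrates the BinaryOp.fromNative pattern for the compute mode.
// ComputeModeSketch and MODES are hypothetical names, not the cudf API.
enum ComputeModeSketch {
  DEFAULT(0),
  EXCLUSIVE(1),
  PROHIBITED(2),
  EXCLUSIVE_PROCESS(3);

  // Cache the constants once; every call to values() allocates a fresh array.
  private static final ComputeModeSketch[] MODES = ComputeModeSketch.values();

  final int nativeId;

  ComputeModeSketch(int nativeId) {
    this.nativeId = nativeId;
  }

  /** Translate the integer returned by the native layer into an enum constant. */
  static ComputeModeSketch fromNative(int nativeId) {
    // A linear scan is fine here: this is not called in a tight loop.
    for (ComputeModeSketch mode : MODES) {
      if (mode.nativeId == nativeId) {
        return mode;
      }
    }
    throw new IllegalArgumentException(
        "Could not translate " + nativeId + " into a ComputeModeSketch");
  }
}
```

With this shape, the id-to-constant mapping lives next to the constants themselves, so adding a mode cannot leave a separate switch statement out of date.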

revans2 · 2021-06-09T14:24:07Z

java/src/test/java/ai/rapids/cudf/RmmTest.java

@@ -399,6 +399,14 @@ public void testPoolLimitNonPoolMode() {
        () -> Rmm.initialize(RmmAllocationMode.CUDA_DEFAULT, false, 1024, 2048));
  }

+  @Test
+  public void testGetCudaRuntimeInfo() {
+    Rmm.initialize(RmmAllocationMode.POOL, false, 1024);


What happens if we run these APIs without initializing the pool? I suspect that is how they are going to be used in the common case.

Also, why are the tests part of the RMM tests? They should have nothing to do with RMM.

I guess this initialization is used to set the GPU device for the later operations, per the current JNI implementation. But even so, moving these tests to Cuda would be better.

And I am wondering whether all the calls in CudaJni need to set the device first.

I moved this case to a newly created test class, CudaTest.

And it is not necessary to initialize RMM before running these APIs.

java/src/main/java/ai/rapids/cudf/CudaComputeMode.java

firestarman · 2021-06-10T06:13:20Z

java/src/test/java/ai/rapids/cudf/RmmTest.java

@@ -399,6 +399,14 @@ public void testPoolLimitNonPoolMode() {
        () -> Rmm.initialize(RmmAllocationMode.CUDA_DEFAULT, false, 1024, 2048));
  }

+  @Test
+  public void testGetCudaRuntimeInfo() {
+    Rmm.initialize(RmmAllocationMode.POOL, false, 1024);


I guess this initialization is used to set the GPU device for the later operations, per the current JNI implementation. But even so, moving these tests to Cuda would be better.

And I am wondering whether all the calls in CudaJni need to set the device first.

Signed-off-by: sperlingxx <lovedreamf@gmail.com>

firestarman · 2021-06-11T01:24:31Z

java/src/main/java/ai/rapids/cudf/CudaComputeMode.java

+  /**
+   * Compute-exclusive-thread mode
+   * Only one thread in one process will be able to use cudaSetDevice() with this device.
+   */


This mode is deprecated. Better to add the same comment here.

(base) liangcail@liangcail-ubuntu18:~/work/projects/on_github/spark-rapids$ nvidia-smi -c 1
Warning: Exclusive_Thread was deprecated! Setting Exclusive_Process instead.
Unable to set the compute mode for GPU 00000000:01:00.0: Insufficient Permissions
Terminating early due to previous errors.

Added warning message for the deprecation.
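The thread only says a warning message was added for the deprecated mode. As a hedged illustration of one way to make such a deprecation visible to both readers and tooling, the sketch below pairs a Javadoc warning with the standard `@Deprecated` annotation. This is an assumption about a possible approach, not what the PR necessarily did; `DeprecatedModeSketch` and `isDeprecated()` are hypothetical names.

```java
import java.lang.reflect.Field;

// Sketch only: DeprecatedModeSketch and isDeprecated() are hypothetical,
// not part of the cudf API discussed in this PR.
enum DeprecatedModeSketch {
  DEFAULT,

  /**
   * Compute-exclusive-thread mode.
   * WARNING: this mode is deprecated; nvidia-smi reports that it sets
   * Exclusive_Process instead.
   */
  @Deprecated
  EXCLUSIVE,

  PROHIBITED,
  EXCLUSIVE_PROCESS;

  /** Returns true if this constant carries the @Deprecated annotation. */
  boolean isDeprecated() {
    try {
      // Each enum constant is a public static field named after the constant.
      Field constant = DeprecatedModeSketch.class.getField(name());
      return constant.isAnnotationPresent(Deprecated.class);
    } catch (NoSuchFieldException e) {
      // Unreachable for a real enum constant.
      throw new AssertionError(e);
    }
  }
}
```

A caller could check `isDeprecated()` on the reported mode and log a warning, while the compiler flags any direct reference to `EXCLUSIVE` from other classes.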

Signed-off-by: sperlingxx <lovedreamf@gmail.com>

sperlingxx · 2021-06-11T02:28:25Z

@gpucibot merge

Java APIs to fetch CUDA runtime info

73ece9b

Signed-off-by: sperlingxx <lovedreamf@gmail.com>

sperlingxx requested a review from jlowe June 9, 2021 09:52

sperlingxx requested a review from a team as a code owner June 9, 2021 09:52

sperlingxx requested a review from firestarman June 9, 2021 09:53

github-actions bot added the Java (Affects Java cuDF API) label Jun 9, 2021

sperlingxx added the feature request (New feature or request) and non-breaking (Non-breaking change) labels Jun 9, 2021

remove redundant reinterpret_cast

2e63cdc

Signed-off-by: sperlingxx <lovedreamf@gmail.com>

abellina requested changes Jun 9, 2021

revans2 reviewed Jun 9, 2021

firestarman reviewed Jun 10, 2021

sperlingxx added 2 commits June 10, 2021 14:35

refine

a7c6a6c

Signed-off-by: sperlingxx <lovedreamf@gmail.com>

revert RmmTest

de71bdf

Signed-off-by: sperlingxx <lovedreamf@gmail.com>

revans2 approved these changes Jun 10, 2021

abellina approved these changes Jun 10, 2021

firestarman reviewed Jun 11, 2021

add warning on deprecated compute mode

e1d4320

Signed-off-by: sperlingxx <lovedreamf@gmail.com>

rapids-bot bot merged commit d3b440e into rapidsai:branch-21.08 Jun 11, 2021

sperlingxx deleted the jni_cuda_info branch June 11, 2021 02:28
