
Test that cuInit is not called when RAPIDS_NO_INITIALIZE is set #12545

Merged

Conversation

wence-
Contributor

@wence- wence- commented Jan 13, 2023

Description

As an alternate approach to the one tried in #12361, here we script GDB and check whether we hit a breakpoint in cuInit. When RAPIDS_NO_INITIALIZE is set in the environment, merely importing cudf should not call into the CUDA runtime/driver (i.e., cuInit should not be called). Conversely, to check that we are scripting GDB properly, when we create a cudf object we definitely should hit cuInit.
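A minimal sketch of the idea (illustrative only, not the PR's actual test code; the helper name, the exact GDB flags, and the output parsing are assumptions):

```python
import subprocess

# GDB command script: set a breakpoint on cuInit, run the program, quit.
# If the breakpoint fires, gdb prints a "Breakpoint 1, ..." line.
GDB_COMMANDS = "break cuInit\nrun\nquit\n"

def calls_cuinit(python_code, env=None):
    """Return True if running `python -c python_code` under gdb
    stops at the cuInit breakpoint."""
    proc = subprocess.run(
        ["gdb", "--batch", "-x", "/dev/stdin", "--args",
         "python", "-c", python_code],
        input=GDB_COMMANDS, capture_output=True, text=True, env=env,
    )
    return "Breakpoint 1" in proc.stdout
```

The test would then assert that `calls_cuinit("import cudf")` is False with RAPIDS_NO_INITIALIZE in the environment, and True when a cudf object is actually constructed.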

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@wence- wence- requested review from a team as code owners January 13, 2023 10:56
@wence- wence- requested review from shwina and charlesbluca January 13, 2023 10:56
@github-actions github-actions bot added the 'Python' (Affects Python cuDF API) label Jan 13, 2023
@wence- wence- added the 'tech debt' and 'non-breaking' (Non-breaking change) labels Jan 13, 2023
@wence- wence- force-pushed the wence/feature/test-no-cuInit-via-gdb branch from c37b961 to de5bfce on January 13, 2023 10:58
@wence-
Contributor Author

wence- commented Jan 13, 2023

This is an alternate approach to checking that we don't call cuInit, one that doesn't need to hook into dlsym (unlike #12361). I use GDB to set a breakpoint rather than running nvprof, because nvprof is not supported on devices with compute capability 8 and greater. On those devices one is supposed to use nsys instead, but I can't make that work correctly: both with and without RAPIDS_NO_INITIALIZE in the environment, I always see cuInit in the profiling stack (even though under GDB I do not hit the breakpoint when RAPIDS_NO_INITIALIZE is set):

$ nsys profile --stats=true --trace=cuda python -c "import cudf;"
Generating '/tmp/nsys-report-fedd.qdstrm'
[1/6] [========================100%] report1.nsys-rep
[2/6] [========================100%] report1.sqlite
[3/6] Executing 'cudaapisum' stats report

 Time (%)  Total Time (ns)  Num Calls  Avg (ns)  Med (ns)  Min (ns)  Max (ns)  StdDev (ns)           Name         
 --------  ---------------  ---------  --------  --------  --------  --------  -----------  ----------------------
     97.9           158717        768     206.7     169.0       114      2233        126.8  cuGetProcAddress      
      1.9             3151          2    1575.5    1575.5      1528      1623         67.2  cuInit                
      0.2              327          2     163.5     163.5       145       182         26.2  cuModuleGetLoadingMode

versus:

$ RAPIDS_NO_INITIALIZE=1 nsys profile --stats=true --trace=cuda python -c "import cudf;"
Generating '/tmp/nsys-report-0f63.qdstrm'
[1/6] [========================100%] report2.nsys-rep
[2/6] [========================100%] report2.sqlite
[3/6] Executing 'cudaapisum' stats report

 Time (%)  Total Time (ns)  Num Calls  Avg (ns)  Med (ns)  Min (ns)  Max (ns)  StdDev (ns)           Name         
 --------  ---------------  ---------  --------  --------  --------  --------  -----------  ----------------------
     98.3            75237        384     195.9     157.0       107      1771        116.5  cuGetProcAddress      
      1.5             1138          1    1138.0    1138.0      1138      1138          0.0  cuInit                
      0.2              181          1     181.0     181.0       181       181          0.0  cuModuleGetLoadingMode

@wence- wence- force-pushed the wence/feature/test-no-cuInit-via-gdb branch from de5bfce to ff4f5f5 on January 13, 2023 11:04
@github-actions github-actions bot added the conda label Jan 13, 2023
@wence- wence- force-pushed the wence/feature/test-no-cuInit-via-gdb branch from ff4f5f5 to 9da854e on January 13, 2023 11:06
@wence- wence- added the 'improvement' (Improvement / enhancement to an existing function) label Jan 13, 2023
@codecov

codecov bot commented Jan 13, 2023

Codecov Report

Base: 86.58% // Head: 85.70% // Project coverage decreases by 0.88% ⚠️

Coverage data is based on head (34a6b3a) compared to base (b6dccb3).
Patch has no changes to coverable lines.

Additional details and impacted files
@@               Coverage Diff                @@
##           branch-23.02   #12545      +/-   ##
================================================
- Coverage         86.58%   85.70%   -0.88%     
================================================
  Files               155      155              
  Lines             24368    24865     +497     
================================================
+ Hits              21098    21311     +213     
- Misses             3270     3554     +284     
Impacted Files Coverage Δ
python/cudf/cudf/_version.py 1.41% <0.00%> (-98.59%) ⬇️
python/cudf/cudf/core/buffer/spill_manager.py 72.50% <0.00%> (-7.50%) ⬇️
python/cudf/cudf/core/buffer/spillable_buffer.py 91.07% <0.00%> (-1.78%) ⬇️
python/cudf/cudf/utils/dtypes.py 77.85% <0.00%> (-1.61%) ⬇️
python/cudf/cudf/options.py 86.11% <0.00%> (-1.59%) ⬇️
python/cudf/cudf/core/single_column_frame.py 94.30% <0.00%> (-1.27%) ⬇️
...ython/custreamz/custreamz/tests/test_dataframes.py 98.38% <0.00%> (-1.01%) ⬇️
python/dask_cudf/dask_cudf/io/csv.py 96.34% <0.00%> (-1.00%) ⬇️
python/dask_cudf/dask_cudf/io/parquet.py 91.81% <0.00%> (-0.59%) ⬇️
python/cudf/cudf/core/multiindex.py 91.66% <0.00%> (-0.51%) ⬇️
... and 43 more


Contributor

@shwina shwina left a comment


I'm +1 to this approach

@davidwendt
Contributor

I like this approach too.
I wonder if you could use cuda-gdb instead of gdb, only because it may already be installed with the CUDA toolkit and therefore would not need to be added to the yamls.

Contributor

@vyasr vyasr left a comment


This is much better!

Minor nit: would it be simpler to write out the gdb command into a bash script (could even write to a tempfile within the Python script itself if we really wanted to be fancy, but I'm mostly just thinking of saving a bash script in the repo to be executed by the test)? It could accept a single argument, the Python command to execute, and then return an appropriate exit code. The subprocess lines aren't terribly readable.

@github-actions github-actions bot removed the conda label Jan 17, 2023
@wence-
Contributor Author

wence- commented Jan 17, 2023

I like this approach too. I wonder if you could use cuda-gdb instead of gdb, only because it may already be installed with the CUDA toolkit and therefore would not need to be added to the yamls.

Thanks, done!

@wence-
Contributor Author

wence- commented Jan 17, 2023

This is much better!

Minor nit: would it be simpler to write out the gdb command into a bash script (could even write to a tempfile within the Python script itself if we really wanted to be fancy, but I'm mostly just thinking of saving a bash script in the repo to be executed by the test)? It could accept a single argument, the Python command to execute, and then return an appropriate exit code. The subprocess lines aren't terribly readable.

gdb -x accepts commands from stdin (via /dev/stdin), so I've just passed them as input to the subprocess call instead of creating a temporary file.
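For illustration, the stdin trick amounts to something like this (a sketch assuming a standard gdb/cuda-gdb command line; the helper name is hypothetical, not the PR's exact code):

```python
def gdb_invocation(python_code, gdb="cuda-gdb"):
    # "-x /dev/stdin" tells gdb to read its command script from stdin,
    # so the breakpoint commands can go through subprocess's `input=`
    # keyword with no temporary file.
    argv = [gdb, "--batch", "-x", "/dev/stdin", "--args",
            "python", "-c", python_code]
    commands = "break cuInit\nrun\nquit\n"
    return argv, commands

# The caller would then do:
#   argv, commands = gdb_invocation("import cudf")
#   subprocess.run(argv, input=commands, text=True, capture_output=True)
```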

@wence-
Contributor Author

wence- commented Jan 17, 2023

Skipping test on arm64 for now, since cuda-gdb is non-functional.
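The skip condition can be sketched as follows (the helper name is illustrative; the real test presumably wires this into a pytest skip marker):

```python
import platform

def skip_gdb_test():
    # cuda-gdb is non-functional on arm64, so the cuInit test is
    # skipped there (e.g. via pytest.mark.skipif(skip_gdb_test(), ...)).
    return platform.machine() == "aarch64"
```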

@ajschmidt8 ajschmidt8 removed the request for review from a team January 18, 2023 22:09
@ajschmidt8
Member

Removing ops-codeowners from the required reviews since it doesn't seem there are any file changes that we're responsible for. Feel free to add us back if necessary.

@wence-
Contributor Author

wence- commented Jan 19, 2023

Removing ops-codeowners from the required reviews since it doesn't seem there are any file changes that we're responsible for. Feel free to add us back if necessary.

Originally I modified dependencies.yaml, but it turned out not to be necessary.

@wence-
Contributor Author

wence- commented Jan 19, 2023

/merge

@rapids-bot rapids-bot bot merged commit 7b71c3d into rapidsai:branch-23.02 Jan 20, 2023
@wence- wence- deleted the wence/feature/test-no-cuInit-via-gdb branch January 23, 2023 11:07
wence- added a commit to wence-/cudf that referenced this pull request Jan 30, 2023
Since rapidsai#12545, we use cuda-gdb for scripting (since that is already
installed in the nvidia/cuda:-devel docker images) to check that
RAPIDS_NO_INITIALIZE ensures that cuInit is not called on import.

Unfortunately, the official cuda-gdb-11-2 package for Debian-based
systems does not correctly advertise all its dependencies (we need to
manually install libtinfo5 and libncursesw5). Consequently cuda-gdb
does not work if the base image rapids builds against is for CUDA
11.2.

To work around this, just ask conda to install gdb for us, since we
don't actually need any cuda-gdb features.
wence- added a commit to wence-/cudf that referenced this pull request Jan 30, 2023
Since rapidsai#12545, we use cuda-gdb for scripting (since that is already
installed in the nvidia/cuda:-devel docker images) to check that
RAPIDS_NO_INITIALIZE ensures that cuInit is not called on import.

Unfortunately, the official cuda-gdb-11-2 package for Debian-based
systems does not correctly advertise all its dependencies (we need to
manually install libtinfo5 and libncursesw5). Consequently cuda-gdb
does not work if the base image rapids builds against is for CUDA
11.2.

To work around this, build the cuda-gdb command as a fixture that is
appropriately marked as skipped in the cases where cuda-gdb is either
not installed or not working due to broken dependencies.
rapids-bot bot pushed a commit that referenced this pull request Jan 31, 2023

Since #12545, we use cuda-gdb for scripting (since that is already installed in the nvidia/cuda:-devel docker images) to check that RAPIDS_NO_INITIALIZE ensures that cuInit is not called on import.

Unfortunately, the official cuda-gdb-11-2 package for Debian-based systems does not correctly advertise all its dependencies (we need to manually install libtinfo5 and libncursesw5). Consequently cuda-gdb does not work if the base image rapids builds against is for CUDA 11.2.

To work around this, build the cuda-gdb command as a fixture that is appropriately marked as skipped in the cases where it is either not installed, or else not working due to broken dependencies.
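The probing logic behind that fixture might look roughly like this (a hedged sketch; the function name and the `--version` probe are assumptions, not the merged code — in the test suite this would back a pytest fixture that calls pytest.skip() on failure):

```python
import shutil
import subprocess

def find_working_gdb():
    """Return the path to a working debugger, or None."""
    gdb = shutil.which("cuda-gdb") or shutil.which("gdb")
    if gdb is None:
        return None
    try:
        # cuda-gdb from the CUDA 11.2 Debian packages can be installed
        # yet broken (missing libtinfo5/libncursesw5), so probe that
        # the binary actually starts before using it.
        subprocess.run([gdb, "--version"], check=True, capture_output=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return gdb
```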

Authors:
  - Lawrence Mitchell (https://github.com/wence-)
  - Ashwin Srinath (https://github.com/shwina)

Approvers:
  - Ashwin Srinath (https://github.com/shwina)

URL: #12644