
[FEA] Import cudf on non-CUDA enabled machine #3661

Closed
mrocklin opened this issue Dec 21, 2019 · 16 comments
Labels: feature request (New feature or request), Python (Affects Python cuDF API)

Comments

@mrocklin (Collaborator)

I would like to be able to import cudf and refer to cudf methods and classes on a machine that does not have a GPU.

This is particularly useful when using cudf with Dask. While my local machine may not have an attached GPU, my Dask workers may. When using Dask with cudf I need to be able to refer to cudf functions so that I can place them into a task graph, but don't actually need to run them locally.

This fails today with an import error saying that we can't import libcuda.so.1. Is it feasible to make importing cudf robust to these errors?
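The guarded-import idea can be sketched in plain Python. This is a hypothetical `safe_import` helper, not anything in cudf itself: the import failure caused by the missing driver is caught and downgraded to a warning, leaving the name bound to None so the rest of the program can still build task graphs.

```python
import importlib
import warnings

def safe_import(module_name):
    # Try to import a module; if a low-level dependency such as
    # libcuda.so.1 is missing, warn and return None instead of raising.
    try:
        return importlib.import_module(module_name)
    except (ImportError, OSError) as exc:
        warnings.warn(f"{module_name} unavailable: {exc}")
        return None

# On a machine without libcuda.so.1 this binds cudf to None
# instead of failing the whole import.
cudf = safe_import("cudf")
```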

@mrocklin added the feature request, Needs Triage, and Python labels on Dec 21, 2019
@kkraus14 removed the Needs Triage label on Jan 16, 2020
@kkraus14 (Collaborator)

This will be extremely challenging as cuDF Python uses Cython which loads libcudf.so which eventually depends on libcuda.so and libcudart.so.

That being said, there's nothing preventing you from installing CUDA drivers and the CUDA toolkit on a machine without GPUs which would make all of this work. I imagine this is pretty cumbersome, but making everything work without loading libcuda.so or libcudart.so would be extremely challenging.

I'm going to close this as out of scope; I think it would be significantly easier to decouple the dask meta object type from the partition type than to avoid loading the CUDA libraries in cuDF.

@mrocklin (Collaborator, Author)

> That being said, there's nothing preventing you from installing CUDA drivers and the CUDA toolkit on a machine without GPUs which would make all of this work

I can't do this from userspace though, right? This requires system administrator privileges?

To be clear, my motivation here is to allow users on non-GPU devices (like a MacBook) to drive RAPIDS work on the cloud.

@kkraus14 (Collaborator)

> I can't do this from userspace though, right? This requires system administrator privileges?

The CUDA toolkit is userspace. A libcuda stub could be userspace, but someone would need to build and distribute it.

> To be clear, my motivation here is to allow users on non-GPU devices (like a MacBook) to drive RAPIDS work on the cloud.

I understand that, but from my perspective a much easier means to that end is to decouple the Dask meta objects to allow them to use Pandas with cudf partitions. You're going to have the same problems with CuPy, PyTorch, etc. when you go to try to use their GPU objects.

@mrocklin (Collaborator, Author)

I'm not sure that decoupling meta will solve this problem. For example if we want to call cudf.read_csv we're going to need to refer to that function somehow.

The meta problem doesn't bother me that much. We can lie to dask dataframe, give it a Pandas dataframe as a meta object even when the underlying data will be cudf dataframes, and I suspect that things will be OK most of the time.
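The "lie to dask dataframe" trick amounts to handing dask a zero-row pandas frame that describes the columns and dtypes of the real partitions. A minimal sketch, assuming pandas is available; the `make_meta` name here is illustrative, not dask's actual implementation:

```python
import pandas as pd

def make_meta(df):
    # Zero-row frame with the same columns and dtypes: enough for dask
    # to plan operations without touching the real (possibly GPU) data.
    return df.iloc[:0]

df = pd.DataFrame({"x": [1, 2], "y": ["a", "b"]})
meta = make_meta(df)  # empty, but carries the schema
```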

@mrocklin (Collaborator, Author)

I'm also not saying that we need to be able to import cudf. I'm just trying to figure out what the best way is to do this.

@kkraus14 (Collaborator)

> I'm also not saying that we need to be able to import cudf. I'm just trying to figure out what the best way is to do this.

Yeah, I understand the motivation and agree that it's the right user experience. Given that we're going to support a larger surface than the Pandas API in the future, I don't think we can rely purely on Pandas to do meta calculations for us, unfortunately.

From my perspective, the easiest path forward is to somehow break our dependency on the driver library (libcuda.so), which is MUCH harder to get installed on a system without GPUs, and then expect the user to install the CUDA toolkit, which is userspace (there are conda packages). For zero-sized cudf objects there shouldn't be any kernel launches, so we could get away without needing a GPU present for _meta objects, but _meta_nonempty is a big problem.
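Why _meta_nonempty is the hard case: dask fabricates a small non-empty sample from the meta frame so dtype-dependent code paths can be exercised, and fabricating that sample on a GPU-backed frame would itself launch kernels. A rough pandas sketch of the idea, with dummy values chosen per dtype (a simplification, not dask's actual logic):

```python
import numpy as np
import pandas as pd

def meta_nonempty(meta):
    # Fabricate a 2-row frame matching meta's schema with dummy values.
    # For a GPU-backed frame this fabrication would need device memory
    # and kernel launches, unlike the zero-row _meta case.
    data = {}
    for col, dtype in meta.dtypes.items():
        if np.issubdtype(dtype, np.number):
            data[col] = np.array([1, 2], dtype=dtype)
        else:
            data[col] = np.array(["foo", "bar"], dtype=object)
    return pd.DataFrame(data, columns=meta.columns)
```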

@kkraus14 reopened this on Jan 19, 2020
@jakirkham (Member)

Another way of solving this is to run some local code remotely ( dask/distributed#4003 ).

@shwina (Contributor) commented Apr 13, 2022

This should be fixable today since nothing links to libcuda.so AFAIK. We do use CUDA Python during import cudf, which dynamically loads libcuda.so, but it throws a RuntimeError when that fails, which we can catch and convert into a warning.
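The catch-and-warn approach can be sketched generically. `init_driver` and `load_driver` are hypothetical names standing in for the package's driver-initialization path, not actual cudf hooks:

```python
import warnings

def init_driver(load_driver):
    # Call the driver loader; downgrade a RuntimeError (e.g. libcuda.so
    # missing) to a warning so the rest of the package still imports.
    try:
        load_driver()
        return True
    except RuntimeError as exc:
        warnings.warn(f"CUDA driver unavailable: {exc}; continuing without a GPU")
        return False
```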

@kkraus14 (Collaborator)

> This should be fixable today since nothing links to libcuda.so AFAIK.

libarrow_cuda links to libcuda.so and we use that in the IPC interop code in libcudf.

@jakirkham (Member)

Do we load that while running import cudf? Or does that happen later (like when accessing a specific module)? If it does happen during import cudf, could we defer that?

@shwina (Contributor) commented Apr 13, 2022

I think it's only the .so yielded by cudf/_lib/gpuarrow.pyx that actually needs to link to arrow_cuda. AFAICT, libcudf itself doesn't link to arrow_cuda.

We could defer import of the gpuarrow module perhaps?
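Deferring the import can be done with a module-level `__getattr__` (PEP 562). The sketch below builds a standalone module object and maps the lazy name to the stdlib `json` module so it runs anywhere; in a real package `__init__`, the mapping would point `gpuarrow` at the Cython extension so it only loads on first use:

```python
import importlib
import types

def make_lazy_module(name, lazy):
    # Build a module whose listed attributes are imported on first
    # access -- the same PEP 562 __getattr__ trick a package
    # __init__.py can use to defer a GPU-linked submodule.
    mod = types.ModuleType(name)
    def __getattr__(attr):
        if attr in lazy:
            sub = importlib.import_module(lazy[attr])
            setattr(mod, attr, sub)  # cache for subsequent lookups
            return sub
        raise AttributeError(attr)
    mod.__getattr__ = __getattr__
    return mod

# Stand-in mapping: "gpuarrow" resolves to stdlib "json" for the demo.
pkg = make_lazy_module("fake_cudf", {"gpuarrow": "json"})
```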

@kkraus14 (Collaborator)

That sounds like a bug in our CMake then, because we're definitely using the arrow::cuda namespace in libcudf: https://github.com/rapidsai/cudf/blob/branch-22.06/cpp/src/comms/ipc/ipc.cpp#L4

@shwina (Contributor) commented Apr 13, 2022

Running ldd on libcudf.so, I see libarrow.so but not libarrow_cuda.so -- am I missing something? Do you see the same?

Edit: I wonder if it's getting linked statically for me somewhere.

@jakirkham (Member)

It may not get linked if nothing is used from it.

@vyasr (Contributor) commented Oct 17, 2022

This should be resolved after #11287. @shwina please reopen if there's additional work to be done here that I missed.
