
[FEA] Import cudf on non-CUDA enabled machine #3661

Closed
mrocklin opened this issue Dec 21, 2019 · 16 comments
Labels: feature request (New feature or request), Python (Affects Python cuDF API)

Comments

@mrocklin (Collaborator)

I would like to be able to import cudf and refer to cudf methods and classes on a machine that does not have a GPU.

This is particularly useful when using cudf with Dask. While my local machine may not have an attached GPU, my Dask workers may. When using Dask with cudf I need to be able to refer to cudf functions so that I can place them into a task graph, but don't actually need to run them locally.

This fails today with an import error saying that we can't import libcuda.so.1. Is it feasible to make importing cudf robust to these errors?
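The guarded-import idea can be sketched in plain Python. This is a hypothetical `safe_import` helper, not anything in cudf itself: the import failure caused by the missing driver is caught and downgraded to a warning, leaving the name bound to None so the rest of the program can still build task graphs.

```python
import importlib
import warnings

def safe_import(module_name):
    # Try to import a module; if a low-level dependency such as
    # libcuda.so.1 is missing, warn and return None instead of raising.
    try:
        return importlib.import_module(module_name)
    except (ImportError, OSError) as exc:
        warnings.warn(f"{module_name} unavailable: {exc}")
        return None

# On a machine without libcuda.so.1 this binds cudf to None
# instead of failing the whole import.
cudf = safe_import("cudf")
```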

@mrocklin added the feature request, Needs Triage, and Python labels on Dec 21, 2019
@kkraus14 removed the Needs Triage label on Jan 16, 2020
@kkraus14 (Collaborator)

This will be extremely challenging as cuDF Python uses Cython which loads libcudf.so which eventually depends on libcuda.so and libcudart.so.

That being said, there's nothing preventing you from installing CUDA drivers and the CUDA toolkit on a machine without GPUs which would make all of this work. I imagine this is pretty cumbersome, but making everything work without loading libcuda.so or libcudart.so would be extremely challenging.

I'm going to close this as out of scope; I think it would be significantly easier to decouple the dask meta object type from the partition type than to avoid loading the CUDA libraries in cuDF.

@mrocklin (Collaborator, Author)

> That being said, there's nothing preventing you from installing CUDA drivers and the CUDA toolkit on a machine without GPUs which would make all of this work

I can't do this from userspace though, right? This requires system administrator privileges?

To be clear, my motivation here is to allow users on non-GPU devices (like a MacBook) to drive RAPIDS work on the cloud.

@kkraus14 (Collaborator)

> I can't do this from userspace though, right? This requires system administrator privileges?

The CUDA toolkit is userspace. A libcuda stub could be userspace, but someone would need to build and distribute it.

> To be clear, my motivation here is to allow users on non-GPU devices (like a MacBook) to drive RAPIDS work on the cloud.

I understand that, but from my perspective a much easier means to that end is to decouple the Dask meta objects to allow them to use Pandas with cudf partitions. You're going to have the same problems with CuPy, PyTorch, etc. when you go to try to use their GPU objects.

@mrocklin (Collaborator, Author)

I'm not sure that decoupling meta will solve this problem. For example if we want to call cudf.read_csv we're going to need to refer to that function somehow.

The meta problem doesn't bother me that much. We can lie to dask dataframe, give it a Pandas dataframe as a meta object even when the underlying data will be cudf dataframes, and I suspect that things will be OK most of the time.
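The "lie to dask dataframe" trick amounts to handing dask a zero-row pandas frame that describes the columns and dtypes of the real partitions. A minimal sketch, assuming pandas is available; the `make_meta` name here is illustrative, not dask's actual implementation:

```python
import pandas as pd

def make_meta(df):
    # Zero-row frame with the same columns and dtypes: enough for dask
    # to plan operations without touching the real (possibly GPU) data.
    return df.iloc[:0]

df = pd.DataFrame({"x": [1, 2], "y": ["a", "b"]})
meta = make_meta(df)  # empty, but carries the schema
```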

@mrocklin (Collaborator, Author)

I'm also not saying that we need to be able to import cudf. I'm just trying to figure out what the best way is to do this.

@kkraus14 (Collaborator)

> I'm also not saying that we need to be able to import cudf. I'm just trying to figure out what the best way is to do this.

Yeah, I understand the motivation and agree that it's the right user experience. Given that we're going to support a larger surface than the Pandas API in the future, I don't think we can rely purely on Pandas to do meta calculations for us, unfortunately.

From my perspective, the easiest path forward is to somehow break our dependency on the driver library (libcuda.so), which is MUCH harder to get installed on a system without GPUs, and then expect the user to install the CUDA toolkit, which is userspace (there are conda packages). For zero-sized cudf objects there shouldn't be any kernel launches, so we could get away without needing a GPU present for _meta objects, but _meta_nonempty is a big problem.
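Why _meta_nonempty is the hard case: dask fabricates a small non-empty sample from the meta frame so dtype-dependent code paths can be exercised, and fabricating that sample on a GPU-backed frame would itself launch kernels. A rough pandas sketch of the idea, with dummy values chosen per dtype (a simplification, not dask's actual logic):

```python
import numpy as np
import pandas as pd

def meta_nonempty(meta):
    # Fabricate a 2-row frame matching meta's schema with dummy values.
    # For a GPU-backed frame this fabrication would need device memory
    # and kernel launches, unlike the zero-row _meta case.
    data = {}
    for col, dtype in meta.dtypes.items():
        if np.issubdtype(dtype, np.number):
            data[col] = np.array([1, 2], dtype=dtype)
        else:
            data[col] = np.array(["foo", "bar"], dtype=object)
    return pd.DataFrame(data, columns=meta.columns)
```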

@kkraus14 reopened this on Jan 19, 2020
@jakirkham (Member)

Another way of solving this is to run some local code remotely ( dask/distributed#4003 ).

@shwina (Contributor) commented Apr 13, 2022

This should be fixable today since nothing links to libcuda.so AFAIK. We do use CUDA Python during import cudf, which dynamically loads libcuda.so, but it throws a RuntimeError when that fails, which we can catch and convert into a warning.
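The catch-and-warn approach can be sketched generically. `init_driver` and `load_driver` are hypothetical names standing in for the package's driver-initialization path, not actual cudf hooks:

```python
import warnings

def init_driver(load_driver):
    # Call the driver loader; downgrade a RuntimeError (e.g. libcuda.so
    # missing) to a warning so the rest of the package still imports.
    try:
        load_driver()
        return True
    except RuntimeError as exc:
        warnings.warn(f"CUDA driver unavailable: {exc}; continuing without a GPU")
        return False
```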

@kkraus14 (Collaborator)

> This should be fixable today since nothing links to libcuda.so AFAIK.

libarrow_cuda links to libcuda.so and we use that in the IPC interop code in libcudf.

@jakirkham (Member)

Do we load that while running import cudf? Or does that happen later (like when accessing a specific module)? If it does happen during import cudf, could we defer that?

@shwina (Contributor) commented Apr 13, 2022

I think it's only the .so yielded by cudf/_lib/gpuarrow.pyx that actually needs to link to arrow_cuda. AFAICT, libcudf itself doesn't link to arrow_cuda.

We could defer import of the gpuarrow module perhaps?
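Deferring the import can be done with a module-level `__getattr__` (PEP 562). The sketch below builds a standalone module object and maps the lazy name to the stdlib `json` module so it runs anywhere; in a real package `__init__`, the mapping would point `gpuarrow` at the Cython extension so it only loads on first use:

```python
import importlib
import types

def make_lazy_module(name, lazy):
    # Build a module whose listed attributes are imported on first
    # access -- the same PEP 562 __getattr__ trick a package
    # __init__.py can use to defer a GPU-linked submodule.
    mod = types.ModuleType(name)
    def __getattr__(attr):
        if attr in lazy:
            sub = importlib.import_module(lazy[attr])
            setattr(mod, attr, sub)  # cache for subsequent lookups
            return sub
        raise AttributeError(attr)
    mod.__getattr__ = __getattr__
    return mod

# Stand-in mapping: "gpuarrow" resolves to stdlib "json" for the demo.
pkg = make_lazy_module("fake_cudf", {"gpuarrow": "json"})
```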

@kkraus14 (Collaborator)

That sounds like a bug in our CMake then, because we're definitely using the arrow::cuda namespace in libcudf: https://github.com/rapidsai/cudf/blob/branch-22.06/cpp/src/comms/ipc/ipc.cpp#L4

@shwina (Contributor) commented Apr 13, 2022

Running ldd on libcudf.so, I see libarrow.so but not libarrow_cuda.so -- am I missing something? Do you see the same?

Edit: I wonder if it's getting linked statically for me somewhere.

@jakirkham (Member)

It may not get linked if nothing is used from it.

@vyasr (Contributor) commented Oct 17, 2022

This should be resolved after #11287. @shwina please reopen if there's additional work to be done here that I missed.
