This repository has been archived by the owner on Jun 12, 2024. It is now read-only.
When attempting to run parallel batch recommendation on CUDA-enabled systems, it fails with a CUDA initialization error in the worker process:
Traceback (most recent call last):
  File "/home/MICHAELEKSTRAND/mambaforge/envs/lkimp/lib/python3.10/concurrent/futures/process.py", line 246, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/home/MICHAELEKSTRAND/mambaforge/envs/lkimp/lib/python3.10/concurrent/futures/process.py", line 205, in _process_chunk
    return [fn(*args) for args in chunk]
  File "/home/MICHAELEKSTRAND/mambaforge/envs/lkimp/lib/python3.10/concurrent/futures/process.py", line 205, in <listcomp>
    return [fn(*args) for args in chunk]
  File "/home/MICHAELEKSTRAND/mambaforge/envs/lkimp/lib/python3.10/site-packages/lenskit/util/parallel.py", line 130, in _mp_invoke_worker
    return __work_func(model, *args)
  File "/home/MICHAELEKSTRAND/mambaforge/envs/lkimp/lib/python3.10/site-packages/lenskit/batch/_recommend.py", line 19, in _recommend_user
    res = algo.recommend(user, n, candidates)
  File "/home/MICHAELEKSTRAND/LensKit/lenskit-implicit/lenskit_implicit/implicit.py", line 69, in recommend
    recs, scores = self.delegate.recommend(uid, matrix, N=i_n)
  File "/home/MICHAELEKSTRAND/mambaforge/envs/lkimp/lib/python3.10/site-packages/implicit/gpu/matrix_factorization_base.py", line 87, in recommend
    ids, scores = self.knn.topk(
  File "/home/MICHAELEKSTRAND/mambaforge/envs/lkimp/lib/python3.10/site-packages/implicit/gpu/matrix_factorization_base.py", line 122, in knn
    self._knn = implicit.gpu.KnnQuery()
  File "_cuda.pyx", line 47, in implicit.gpu._cuda.KnnQuery.__cinit__
RuntimeError: cublas error: CUBLAS_STATUS_NOT_INITIALIZED (/tmp/pip-req-build-b0ax806a/implicit/gpu/knn.cu:87)
@benfred Already doing that :) (although through a more indirect method — process pools are set up with a custom context that inherits from multiprocessing's SpawnContext).
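For anyone following along, here is a rough sketch of what that kind of pool setup can look like. It is not the actual LensKit code: `WorkerContext` and `make_pool` are hypothetical names made up for illustration, and only `SpawnContext` and `ProcessPoolExecutor(mp_context=...)` are real standard-library APIs.

```python
from concurrent.futures import ProcessPoolExecutor
from multiprocessing.context import SpawnContext


class WorkerContext(SpawnContext):
    """Hypothetical custom context (name is an assumption, not LensKit's).

    Subclassing SpawnContext keeps the "spawn" start method, so each worker
    is a fresh interpreter rather than a fork of the parent process.
    """


def make_pool(n_jobs: int) -> ProcessPoolExecutor:
    # Hypothetical helper: hand the custom context to the executor so every
    # worker process it launches is started through that context.
    return ProcessPoolExecutor(max_workers=n_jobs, mp_context=WorkerContext())
```

With the spawn start method, work functions have to be importable at module level and any code that submits work should sit under an `if __name__ == "__main__":` guard. As the traceback shows, spawning alone does not make the cuBLAS error go away here.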
I'm going to go ahead and cut an initial release without this, so I can get this released and have (rough) 0.14 parity before working on more substantial LensKit 0.15 changes.
Tagging @benfred in case he has any insight here.
Things to test