[REVIEW] Update Dockerfile for setup #27

Open
wants to merge 3 commits into
base: master

Nanthini10

The previous version of the file used an older version of RAPIDS and didn't build properly for me.

This change switches the base image from nvidia/cuda to the RAPIDS image, since most of these packages, including xgboost and LightGBM, come pre-installed there.

Collaborator

@RAMitchell RAMitchell left a comment

My only concern here is that we might want the latest dmlc xgb and not the rapids version.

@Nanthini10
Author

That's a good point. I can undo the xgboost install step. Would this require installing from source, or is the latest stable version (installed with pip) good enough?

@RAMitchell
Collaborator

I think it should be source.

@Nanthini10
Author

@RAMitchell I've updated the Dockerfile, but the xgboost install fails with the following error:

CMake Error at plugin/CMakeLists.txt:8 (message):
  Could not locate RMM library

The same command succeeds when I run it inside the container, but it fails during the Dockerfile build. Do you know what could be the reason for this?

@RAMitchell
Collaborator

I don't, I'm afraid. Maybe @hcho3 has an idea?

@hcho3
Collaborator

hcho3 commented Aug 12, 2021

RMM_ROOT=/opt/conda should be revised to point to the correct location of the RMM library in the Conda environment.

@Nanthini10
Author

@hcho3 Is it /rapids/rmm, or somewhere else?

However, the following worked when I ran it inside rapidsai/rapidsai-dev:21.06-cuda11.0-devel-ubuntu18.04-py3.8:

git config --global http.sslVerify false && \
    git clone --recursive https://github.com/dmlc/xgboost /opt/xgboost && \
    cd /opt/xgboost && \
    mkdir build && \
    cd build && \
    RMM_ROOT=/opt/conda cmake .. \
        -DUSE_CUDA=ON \
        -DUSE_NCCL=ON \
        -DPLUGIN_RMM=ON && \
    make -j4 && \
    cd ../python-package && \
    pip uninstall -y xgboost && \
    python setup.py install

@hcho3
Collaborator

hcho3 commented Aug 12, 2021

Can you run echo $CONDA_PREFIX inside the container? That's where RMM is installed.
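
A minimal sketch of that check, for readers following along. It assumes the RMM headers sit under `include/rmm` inside the environment prefix, which is the usual Conda layout but is not confirmed in this thread. The helper name `has_rmm` is hypothetical.

```shell
# has_rmm PREFIX: succeed if PREFIX looks like it contains RMM headers.
# Assumption: headers are installed under PREFIX/include/rmm.
has_rmm() {
    [ -n "${1:-}" ] && [ -d "$1/include/rmm" ]
}

# Report the active Conda prefix and whether RMM appears to be there.
# CONDA_PREFIX is only set once an environment has been activated.
prefix="${CONDA_PREFIX:-}"
if has_rmm "$prefix"; then
    echo "RMM headers found under $prefix"
else
    echo "no RMM headers under '${prefix}'"
fi
```

If `CONDA_PREFIX` is empty in the shell that runs the build, the environment is probably not activated there, and the prefix has to be passed explicitly.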

@Nanthini10
Author

Thanks, that helped! It prints /opt/conda/envs/rapids
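
For reference, the configure step from earlier in the thread could be adjusted along these lines. This is a sketch only, not the Dockerfile that was actually committed: the function names and `DEFAULT_RMM_ROOT` are hypothetical, and the fallback path is simply the prefix reported above.

```shell
# Assumption: the RAPIDS environment prefix reported in this thread.
DEFAULT_RMM_ROOT=/opt/conda/envs/rapids

# resolve_rmm_root: print the prefix to pass as RMM_ROOT.
# Uses CONDA_PREFIX when an environment is active; otherwise falls back to
# the default, since a Dockerfile RUN step may not have an activated env.
resolve_rmm_root() {
    printf '%s\n' "${CONDA_PREFIX:-$DEFAULT_RMM_ROOT}"
}

# configure_xgboost BUILD_DIR: run the same cmake configuration as the
# earlier command, with RMM_ROOT pointing at the resolved prefix.
configure_xgboost() {
    ( cd "$1" && RMM_ROOT="$(resolve_rmm_root)" cmake .. \
        -DUSE_CUDA=ON \
        -DUSE_NCCL=ON \
        -DPLUGIN_RMM=ON )
}
```

Usage would look like `configure_xgboost /opt/xgboost/build && make -C /opt/xgboost/build -j4`, with the rest of the install steps unchanged.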

3 participants