Update shuffle documentation for branch-21.06 and UCX 1.10.1 #2475

Merged: abellina merged 6 commits into NVIDIA:branch-21.06 from shuffle/update_docs_21.06 on May 26, 2021


abellina
Collaborator

This updates the documentation for the RAPIDS Shuffle Manager for the change to JUCX 1.11.0, and UCX 1.10.1+.

Closes #2286.

@abellina added the shuffle and documentation labels on May 21, 2021
@abellina
Collaborator Author

abellina commented May 21, 2021

@petro-rudenko, @yosefe fyi.

jlowe
jlowe previously approved these changes May 21, 2021
@jlowe
Member

jlowe commented May 21, 2021

build


3. Fetch and install the UCX package for your OS and CUDA version
[UCX 1.9.0](https://github.com/openucx/ucx/releases/tag/v1.9.0).
UCX versions 1.10.1 requires the user to install `libnuma1`. RDMA packages have extra
Collaborator

Should we provide instructions for installing libnuma1, or is it obvious enough that people need to do an apt-get or yum install? Will this requirement go away in a future version of UCX?

Collaborator Author

@petro-rudenko is right. For example, running sudo apt install ./ucx-v1.10.1-ubuntu18.04-mofed5.x-cuda11.2.deb reports:

The following NEW packages will be installed:
  libnuma1 ucx
0 upgraded, 2 newly installed, 0 to remove and 420 not upgraded.

If you use dpkg -i instead, libnuma1 is not installed automatically (although a later apt install -f resolves it). I'll remove this from the doc, since the package already declares the dependency.
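
The two install paths discussed in this comment can be sketched as below. This is a minimal sketch, assuming an Ubuntu host and the package file name from the example above; the function names and the UCX_DEB variable are hypothetical and not part of the project's documentation.

```shell
#!/bin/sh
# Assumption: the .deb from the example above sits in the current directory.
UCX_DEB="${UCX_DEB:-./ucx-v1.10.1-ubuntu18.04-mofed5.x-cuda11.2.deb}"

# Path 1: apt resolves the package's declared dependencies,
# so libnuma1 is pulled in alongside ucx.
install_ucx_with_apt() {
  sudo apt install -y "$UCX_DEB"
}

# Path 2: dpkg does not fetch dependencies. It may exit non-zero and leave
# ucx unconfigured when libnuma1 is missing, so a follow-up
# "apt install -f" is needed to fix the broken dependency.
install_ucx_with_dpkg() {
  sudo dpkg -i "$UCX_DEB" || true
  sudo apt install -f -y
}
```

Calling either function and then running dpkg -s libnuma1 should show whether the dependency ended up installed.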

Collaborator Author

Changed our sample Dockerfiles to use apt install [ucx-package].deb. I assume the same holds for CentOS, but we currently don't have a Dockerfile example for CentOS. I'll add one, though maybe I should split these into separate Dockerfiles rather than keeping them in the markdown.

Collaborator Author

OK, after trying a few things for CentOS, I'll file a follow-up issue for it and stick with Ubuntu for now (rdma-core is giving me trouble on CentOS).

@abellina
Collaborator Author

@jlowe I tweaked a few things and added a follow-up issue; this should be ready for another look.

@abellina
Collaborator Author

build

@abellina abellina merged commit 3b718f8 into NVIDIA:branch-21.06 May 26, 2021
@abellina abellina deleted the shuffle/update_docs_21.06 branch May 26, 2021 13:33
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
…2475)

* Update rapids-shuffle.md for UCX 1.10.1

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>

* Add message around JUCX 1.11.0 compatibility warning when using with UCX 1.10

* Update minimum requirement. JUCX 1.11.0 requires UCX 1.10+

* Small tweaks

* Remove bullet points

* libnuma1 pulled automatically from apt install