Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: split wheel into libcuml and cuml #6006

Closed
wants to merge 1 commit into from

Conversation

msarahan
Copy link
Contributor

@msarahan msarahan commented Aug 1, 2024

This is part of rapidsai/build-planning#33

This PR has symbol visibility considerations, as detailed in https://nvidia.slack.com/archives/C07E95NRXJM/p1722468281754129?thread_ts=1722462671.639599&cid=C07E95NRXJM

The end goal will be to have this libcuml wheel depend on the libraft wheel, but we don't strictly need that to have useful libcuml wheels.

@jameslamb
Copy link
Member

closing this as stale... I've linked to it from the issue tracking this work, so it can be used as a reference. But when this is picked up, I think it'll be easier to start from a new PR.

@jameslamb jameslamb closed this Dec 17, 2024
rapids-bot bot pushed a commit that referenced this pull request Jan 24, 2025
Replaces #6006, contributes to rapidsai/build-planning#33.

Proposes packaging `libcuml` as a wheel, which is then re-used by `cuml-cu{11,12}` wheels.

## Notes for Reviewers

### Benefits of these changes

* smaller wheels (see "Size Changes" below)
* faster compile times
  - *no more re-compiling RAFT, thanks to rapidsai/raft#2531
* less use of CI resources (only compiling once per CPU architecture / CUDA versions, instead of once per those + Python minor version)
* other benefits mentioned in rapidsai/build-planning#33

### Wheel contents

`libcuml`:

* `libcuml++.so` (shared library) and its headers
* `libcumlprims_mg.so` (shared library) and its headers
* other vendored dependencies (CCCL, `fmt`)

`cuml`:

* `cuml` Python / Cython code and compiled Cython extensions

### Dependency Flows

In short.... `libcuml` contains `libcuml.so` and `libcumlprims_mg.so` dynamic libraries and the headers to link against them.

* Anything that needs to link against cuML at build time pulls in `libcugraph` wheels as a build dependency.
* Anything that needs cuML's symbols at runtime pulls it in as a runtime dependency, and calls `libcuml.load_library()`.

For more details and some flowcharts, see rapidsai/build-planning#33 (comment)

### Size changes (CUDA 12, Python 3.12, x86_64)

| wheel                | num files (before) | num files (this PR) | size (before)  | size (this PR) |
|:---------------:|------------------:|-----------------:|--------------:|-------------:|
| `libcuml`           |   ---                       |   1766                   | ---                   | 289M                 |
| `cuml`               |   442                     |   441                    | 527M               | 9M                 |
|**TOTAL**          |   **442**              |   **2207**               | **527M**        | **298M**    |

*NOTES: size = compressed, "before" = 2025-01-22 nightlies*

<details><summary>how I calculated those (click me)</summary>

```shell
docker run \
    --rm \
    --network host \
    --env RAPIDS_NIGHTLY_DATE=2025-01-22 \
    --env CUML_NIGHTLY_SHA=01e19bba9821954b062a04fbf31d3522afa4b0b1 \
    --env CUML_PR="pull-request/6199" \
    --env CUML_PR_SHA="9d5100ec4589e20230a31817518427efa1e49c6d" \
    --env RAPIDS_PY_CUDA_SUFFIX=cu12 \
    --env WHEEL_DIR_BEFORE=/tmp/wheels-before \
    --env WHEEL_DIR_AFTER=/tmp/wheels-after \
    -it rapidsai/ci-wheel:cuda12.5.1-rockylinux8-py3.12 \
    bash

# --- nightly wheels --- #
mkdir -p ./wheels-before

export RAPIDS_BUILD_TYPE=branch
export RAPIDS_REF_NAME="branch-25.02"

# cuml
RAPIDS_PY_WHEEL_NAME="cuml_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cuml \
RAPIDS_SHA=${CUML_NIGHTLY_SHA} \
    rapids-download-wheels-from-s3 python ./wheels-before

# --- wheels from CI --- #
mkdir -p ./wheels-after

export RAPIDS_BUILD_TYPE="pull-request"

# libcuml
RAPIDS_PY_WHEEL_NAME="libcuml_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cuml \
RAPIDS_REF_NAME="${CUML_PR}" \
RAPIDS_SHA="${CUML_PR_SHA}" \
    rapids-download-wheels-from-s3 cpp ./wheels-after

# cuml
RAPIDS_PY_WHEEL_NAME="cuml_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cuml \
RAPIDS_REF_NAME="${CUML_PR}" \
RAPIDS_SHA="${CUML_PR_SHA}" \
    rapids-download-wheels-from-s3 python ./wheels-after

pip install pydistcheck
pydistcheck \
    --inspect \
    --select 'distro-too-large-compressed' \
    ./wheels-before/*.whl \
| grep -E '^checking|files: | compressed' \
> ./before.txt

# get more exact sizes
du -sh ./wheels-before/*

pydistcheck \
    --inspect \
    --select 'distro-too-large-compressed' \
    ./wheels-after/*.whl \
| grep -E '^checking|files: | compressed' \
> ./after.txt

# get more exact sizes
du -sh ./wheels-after/*
```

</details>

### How I tested this

These other PRs:

* rapidsai/devcontainers#442

Authors:
  - James Lamb (https://github.com/jameslamb)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Divye Gala (https://github.com/divyegala)

URL: #6199
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci CMake Cython / Python Cython or Python issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants