Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prefer wheel-provided libraries, use RTLD_LOCAL #13

Merged
merged 5 commits into from
Nov 25, 2024

Conversation

jameslamb
Copy link
Member

@jameslamb jameslamb commented Nov 14, 2024

Description

Contributes to rapidsai/build-planning#118

Modifies libucx.load_library() in the following ways:

  • prefer wheel-provided shared libraries to system installation
  • expose environment variable RAPIDS_LIBUCX_PREFER_SYSTEM_LIBRARY for switching that preference
  • load libraries with RTLD_LOCAL, to prevent adding symbols to the global namespace (dlopen docs)

Also updates all the pre-commit hook versions while I'm touching this repo. That was harmless, and I especially wanted to get up to the latest version of the RAPIDS copyright hook.

Notes for Reviewers

Version changes?

Proposing starting with 1.15.0.post2 because it's the oldest version supported at build and runtime by ucxx (code link).

And doing the following, in this order:

  1. merge this PR
  2. in a ucxx PR, test with these packages
  3. publish 1.15.0.post2 packages (via manually triggering CI run here)
  4. put up PRs to publish each of the other versions (https://pypi.org/project/libucx-cu12/#history)
    • 1.14.1.post2
    • 1.16.0.post2
    • 1.17.0.post1 (there was never a 1.17.0.post1)

@jameslamb jameslamb added DO NOT MERGE Hold off on merging; see PR for details breaking Introduces a breaking change improvement Improves an existing functionality labels Nov 14, 2024
@jameslamb jameslamb changed the title WIP: [DO NOT MERGE] prefer wheel-provided libraries, use RTLD_LOCAL prefer wheel-provided libraries, use RTLD_LOCAL Nov 15, 2024
@jameslamb jameslamb removed the DO NOT MERGE Hold off on merging; see PR for details label Nov 15, 2024
@jameslamb jameslamb marked this pull request as ready for review November 15, 2024 19:33
@jameslamb jameslamb requested a review from a team as a code owner November 15, 2024 19:33
@jameslamb jameslamb requested review from vyasr and pentschev November 15, 2024 19:34
VERSION Show resolved Hide resolved
Copy link
Member

@pentschev pentschev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.14.0.post2

Why is this needed? UCXX and UCX-Py both currently require at least UCX 1.15. Are there other use cases where this is important?

As to other changes it looks fine to me, I'll not pretend I understand most of it, but the overall idea seem reasonable to me.

@jameslamb
Copy link
Member Author

@pentschev for why I'm proposing rebuilding 1.14 even though RAPIDS doesn't support it... same as the last time we talked about that: #6 (comment)

It's not that much incremental effort, and it ensures all the wheels work the same way regardless of which version you get. I don't feel too strongly about that, just proposing it because it's what we did the last time we had an update like this.

@pentschev
Copy link
Member

for why I'm proposing rebuilding 1.14 even though RAPIDS doesn't support it... same as the last time we talked about that: #6 (comment)

It's not that much incremental effort, and it ensures all the wheels work the same way regardless of which version you get. I don't feel too strongly about that, just proposing it because it's what we did the last time we had an update like this.

I see, I would say I understand but I don't know if it's a great idea to keep on changing older versions. For instance, we may break things we now don't test anymore and that were previously working when we tested them. Similar to what we do with RAPIDS, I think it's probably ok to accept the past state of potentially broken libraries. IOW, I think we're fine doing it this once but I'm less convinced we should keep on doing that in the future, open to divergent opinions though.

@vyasr vyasr requested a review from trxcllnt November 18, 2024 19:07
@vyasr
Copy link
Contributor

vyasr commented Nov 18, 2024

I'm inclined to agree with Peter here. After this change I suggest that we stop pushing fixes for older versions than we support in RAPIDS until someone specifically asks for them.

@vyasr
Copy link
Contributor

vyasr commented Nov 18, 2024

@trxcllnt if everything looks fine to you here feel free to merge.

@vyasr vyasr merged commit 9c29f60 into rapidsai:main Nov 25, 2024
13 checks passed
@jameslamb jameslamb deleted the library-loading branch November 25, 2024 19:16
@jameslamb
Copy link
Member Author

jameslamb commented Nov 25, 2024

Thanks!

@vyasr I can handle repeating this for the other versions:

  • 1.14.1.post2
  • 1.16.0.post2
  • 1.17.0.post1 (there was never a 1.17.0.post1)

@jameslamb jameslamb mentioned this pull request Nov 26, 2024
@jameslamb
Copy link
Member Author

Tested with a ucxx PR and confirmed that its CI passes with these changes: rapidsai/ucxx#334 (comment)

So I think we're good to proceed.

This was referenced Nov 26, 2024
trxcllnt pushed a commit to rapidsai/devcontainers that referenced this pull request Nov 27, 2024
Contributes to rapidsai/build-planning#118

Should only be merged if / after we merge
rapidsai/ucx-wheels#13. That PR switches
`libucx` wheels' loading behavior, such that `libucx.load_library()`
(which `ucx-py` and `libucxx` call at import time) prefers the
`libuc{m,p,s}.so` provided by the `libucx` wheels.

rapidsai/ucx-wheels#13 introduces an environment
variable to control that... this proposes setting that, to continue
preferring system-installed UCX libraries in devcontainers.

---------

Co-authored-by: Vyas Ramasubramani <vyasr@nvidia.com>
rapids-bot bot pushed a commit to rapidsai/ucx-py that referenced this pull request Dec 16, 2024
#1099)

Contributes to rapidsai/build-planning#118

Caused by rapidsai/ucx-wheels#13

I originally came here to document the implications of rapidsai/ucx-wheels#13 in the docs, namely:

* if you have a `libucx-cu{11,12}` wheel installed, then by default `ucx-py` will use UCX libraries from that wheel
* environment variable `RAPIDS_LIBUCX_PREFER_SYTEM_LIBRARY=true` can be set to opt out of this and use a system installation instead

While doing that, I noticed some other opportunities for improvement in the installation docs:

* updating build-UCX-from-source instructions to UCX 1.15 ([the oldest version this project now supports](https://github.com/rapidsai/ucx-py/blob/9efacc6069226de8e207177a359189f8880203a8/dependencies.yaml#L159))
* clarifying and simplifying some language

## Notes for Reviewers

### How I tested this

Followed these instructions in a Docker container running on an x86_64 machine with 8 V100s.

```shell
docker run \
    --rm \
    --gpus 0 \
    -v $(pwd):/opt/work \
    -w /opt/work \
    -it rapidsai/ci-conda:latest \
    bash
```

Used `conda` to set up the build environment:

```shell
conda create -n ucx -c conda-forge \
    automake make libtool pkg-config \
    "python=3.12" "setuptools>=64.0" "cython>=3.0.0" \
    cuda-nvcc \
    cuda-cudart-dev \
    cuda-nvml-dev \
    cuda-nvtx-dev \
    cuda-version=12.5
```

Ran variations of this code snippet to test my install:

```shell
python -c "import ucp; print(ucp.get_ucx_version())"
```

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #1099
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Introduces a breaking change improvement Improves an existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants