Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use HDBSCAN package pin to 0.8.38 #5906

Merged
merged 10 commits into from
Aug 15, 2024

Conversation

divyegala
Copy link
Member

@divyegala divyegala commented May 24, 2024

HBSCAN has a breaking change in main branch, which cause wheel nightly tests to fail as cuML picks up HDBSCAN from main branch HEAD. As HDBSCAN also releases on pypi now, we can install release packages instead of installing HDBSCAN from git source.

This PR also updates HBDSCAN to v0.8.38.

@divyegala divyegala added bug Something isn't working non-breaking Non-breaking change labels May 24, 2024
@divyegala divyegala requested review from a team as code owners May 24, 2024 18:41
@github-actions github-actions bot added the Cython / Python Cython or Python issue label May 24, 2024
@github-actions github-actions bot added the conda conda issue label May 24, 2024
@dantegd dantegd added the 2 - In Progress Currenty a work in progress label May 31, 2024
@jameslamb
Copy link
Member

jameslamb commented Jul 16, 2024

This is failing because hdbscan pins to cython<3, while most of RAPIDS uses cython>=3.

  The conflict is caused by:
      The user requested cython>=3.0.0
      hdbscan 0.8.36 depends on cython<3 and >=0.27

(build link)

Hopefully we'll be able to resolve that by using the next hdbscan release... since @seberg relaxed that cython constraint for hdbscan in scikit-learn-contrib/hdbscan#644 🎉

@seberg
Copy link
Contributor

seberg commented Jul 17, 2024

Hopefully hdbscan will release soon, otherwise it seems like we may need to restore the pull from master (or commit hash) for both conda and pip for a while.

@trxcllnt
Copy link
Collaborator

trxcllnt commented Jul 22, 2024

hdbscan still has cython>=0.27,<3 in its top-level requirements.txt.

This is how cython>=0.27,<3 is ending up in install_requires and should be removed.

python/pyproject.toml Outdated Show resolved Hide resolved
@jakirkham
Copy link
Member

Also this is still targeting branch-24.06. Think this would need to be redirected to branch-24.10

@divyegala divyegala requested review from a team as code owners August 14, 2024 00:20
Copy link

copy-pr-bot bot commented Aug 14, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@divyegala divyegala changed the base branch from branch-24.06 to branch-24.10 August 14, 2024 00:20
@divyegala divyegala requested a review from a team as a code owner August 14, 2024 00:20
@divyegala divyegala requested a review from msarahan August 14, 2024 00:20
@divyegala divyegala removed request for a team August 14, 2024 00:21
@divyegala divyegala self-assigned this Aug 14, 2024
@divyegala
Copy link
Member Author

/ok to test

Copy link
Member

@jakirkham jakirkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Divye! 🙏

Had a suggestion below

Also can you please update this line?

- hdbscan<=0.8.30

dependencies.yaml Outdated Show resolved Hide resolved
@divyegala
Copy link
Member Author

/ok to test

@divyegala divyegala changed the title Use HDBSCAN pypi package for wheel builds and pin to 0.8.30 Use HDBSCAN package pin to 0.8.38 Aug 14, 2024
Copy link
Member

@jakirkham jakirkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Divye! 🙏

@jakirkham
Copy link
Member

jakirkham commented Aug 14, 2024

Are we using Cython in a PyTest plugin somewhere?

Seeing this error on CI

/opt/conda/envs/test/lib/python3.11/site-packages/_pytest/config/__init__.py:331: PluggyTeardownRaisedWarning: A plugin raised an exception during an old-style hookwrapper teardown.
Plugin: helpconfig, Hook: pytest_cmdline_parse
ModuleNotFoundError: No module named 'Cython'

Edit: Are we trying to install something from source (that uses Cython) at test time?

@divyegala divyegala requested a review from a team as a code owner August 14, 2024 18:36
@github-actions github-actions bot added the ci label Aug 14, 2024
@seberg
Copy link
Contributor

seberg commented Aug 15, 2024

Are we trying to install something from source (that uses Cython) at test time?

Yeah, the cython coverage needs Cython.

@jakirkham
Copy link
Member

jakirkham commented Aug 15, 2024

Thanks Sebastian! 🙏

That makes much more sense. So the Cython dependency just comes from the fact that cuML itself uses a Cython plugin for measuring test coverage. Looks like this has been around since code coverage began being measured ( #3338 )

Edit: Cython code coverage was introduced in PR: #3111

plugins = Cython.Coverage

Before, this dependency was not expressed in cuML's test requirement. However, as it was being implicitly handled via HDBSCAN, where HDBSCAN had an extraneous runtime dependency on Cython, things just quietly worked

With the recent HDBSCAN update, this extraneous Cython dependency was dropped ( scikit-learn-contrib/hdbscan#624 ). So we no longer get Cython pulled in transitively. Hence the error coming up now

To fix this, think we need to add cython to this list and run rapids-dependency-file-generator to propagate the change

cuml/dependencies.yaml

Lines 464 to 468 in 0fbd919

test_python:
common:
- output_types: [conda, requirements, pyproject]
packages:
- dask-ml

@github-actions github-actions bot removed the ci label Aug 15, 2024
@jakirkham
Copy link
Member

Looks like there was a network failure in one CI job:

[nltk_data] Downloading package treebank to /github/home/nltk_data...
[nltk_data] Error downloading 'treebank' from
[nltk_data]     <https://raw.githubusercontent.com/nltk/nltk_data/gh-
[nltk_data]     pages/packages/corpora/treebank.zip>:   <urlopen error
[nltk_data]     [Errno 104] Connection reset by peer>

Let's wait for the others to finish and we can retry just the failing job after

@jakirkham
Copy link
Member

Hooray all the others passed! 🥳

Restarting the one job that had a network failure

@jakirkham
Copy link
Member

Looks like all CI jobs passed! 🥳

@dantegd
Copy link
Member

dantegd commented Aug 15, 2024

/merge

@rapids-bot rapids-bot bot merged commit c7f53ef into rapidsai:branch-24.10 Aug 15, 2024
54 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 - In Progress Currenty a work in progress bug Something isn't working conda conda issue Cython / Python Cython or Python issue non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants