HDBSCAN not available #266
Comments
Interesting -- if you start Python and try:
...do you get no response (which is good!) or an error?
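A minimal sketch of that check, assuming the module being tested is `hdbscan` (the one imported in the reply that follows). The helper name `try_import` is hypothetical and not part of pixplot or hdbscan. It attempts the import and returns the exception text instead of crashing, so the result can be pasted into the thread:

```python
import importlib


def try_import(module_name):
    """Return (True, version_or_None) on success, (False, error_text) on failure."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # ImportError, or ValueError raised by a compiled extension
        return False, "{}: {}".format(type(exc).__name__, exc)
    return True, getattr(module, "__version__", None)


if __name__ == "__main__":
    # numpy is checked too because hdbscan's compiled modules are built against it.
    for name in ("numpy", "hdbscan"):
        ok, detail = try_import(name)
        print(name, "OK" if ok else "FAILED", detail)
```

A silent success means the package is usable; a failure line shows which exception blocks the import.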
Error indeed:
> python
Python 3.7.16 (default, Mar 22 2023, 16:00:53)
[GCC 12.2.1 20230201] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hdbscan
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/nicolas/.pyenv/versions/3.7.16/lib/python3.7/site-packages/hdbscan/__init__.py", line 1, in <module>
from .hdbscan_ import HDBSCAN, hdbscan
File "/home/nicolas/.pyenv/versions/3.7.16/lib/python3.7/site-packages/hdbscan/hdbscan_.py", line 21, in <module>
from ._hdbscan_linkage import (single_linkage,
File "hdbscan/_hdbscan_linkage.pyx", line 1, in init hdbscan._hdbscan_linkage
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
So it has something to do with numpy. I tried installing different versions of numpy and hdbscan matching pixplot's last release (2020), and during those tests I noticed this error:
> pip install hdbscan==0.8.29
Collecting hdbscan==0.8.29
Using cached hdbscan-0.8.29-cp37-cp37m-linux_x86_64.whl
Collecting numpy>=1.20
Using cached numpy-1.21.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
Requirement already satisfied: scikit-learn>=0.20 in /home/nicolas/.pyenv/versions/3.7.16/lib/python3.7/site-packages (from hdbscan==0.8.29) (0.24.2)
Requirement already satisfied: scipy>=1.0 in /home/nicolas/.pyenv/versions/3.7.16/lib/python3.7/site-packages (from hdbscan==0.8.29) (1.4.0)
Requirement already satisfied: cython>=0.27 in /home/nicolas/.pyenv/versions/3.7.16/lib/python3.7/site-packages (from hdbscan==0.8.29) (0.29.33)
Requirement already satisfied: joblib>=1.0 in /home/nicolas/.pyenv/versions/3.7.16/lib/python3.7/site-packages (from hdbscan==0.8.29) (1.2.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in /home/nicolas/.pyenv/versions/3.7.16/lib/python3.7/site-packages (from scikit-learn>=0.20->hdbscan==0.8.29) (3.1.0)
Installing collected packages: numpy, hdbscan
Attempting uninstall: numpy
Found existing installation: numpy 1.19.5
Uninstalling numpy-1.19.5:
Successfully uninstalled numpy-1.19.5
Attempting uninstall: hdbscan
Found existing installation: hdbscan 0.8.26
Uninstalling hdbscan-0.8.26:
Successfully uninstalled hdbscan-0.8.26
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.5.0 requires numpy~=1.19.2, but you have numpy 1.21.6 which is incompatible.
pixplot 0.0.113 requires numpy==1.19.5, but you have numpy 1.21.6 which is incompatible.
Successfully installed hdbscan-0.8.29 numpy-1.21.6
WARNING: You are using pip version 22.0.4; however, version 23.0.1 is available.
You should consider upgrading via the '/home/nicolas/.pyenv/versions/3.7.16/bin/python3.7 -m pip install --upgrade pip' command.
pixplot has worked (with "hdbscan not available") with config
I believe it has something to do with
However, it seems that hdbscan takes the label/category column into account in the clustering, which is particularly interesting in my case. I believe the
Am I missing something else without CUML?
Thank you!
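The pip output above shows three requirements on numpy that cannot all hold at once: the hdbscan 0.8.29 install collects `numpy>=1.20`, pixplot 0.0.113 requires `numpy==1.19.5`, and tensorflow 2.5.0 requires `numpy~=1.19.2`. A minimal sketch of that conflict in plain Python follows. The helper names are hypothetical, and the specifier handling is a simplified stand-in for pip's real resolver that covers only the three operators appearing in the log:

```python
def parse(version):
    """Turn '1.19.5' into (1, 19, 5) so versions compare as tuples."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, spec):
    """Check a version against one simplified specifier: '>=', '==' or '~='."""
    op, target = spec[:2], parse(spec[2:])
    v = parse(version)
    if op == ">=":
        return v >= target
    if op == "==":
        return v == target
    if op == "~=":
        # Compatible release: at least the target, with the same leading parts.
        return v >= target and v[:len(target) - 1] == target[:-1]
    raise ValueError("unsupported operator: " + op)


# Requirements on numpy as they appear in the pip log above.
REQUIREMENTS = {
    "hdbscan 0.8.29": ">=1.20",
    "pixplot 0.0.113": "==1.19.5",
    "tensorflow 2.5.0": "~=1.19.2",
}


def unmet(numpy_version):
    """List the packages whose numpy requirement this version fails."""
    return [pkg for pkg, spec in REQUIREMENTS.items()
            if not satisfies(numpy_version, spec)]


for candidate in ("1.19.5", "1.21.6"):
    print(candidate, "->", unmet(candidate) or "all satisfied")
```

With numpy 1.19.5 only the hdbscan 0.8.29 requirement is unmet; with 1.21.6 the pixplot and tensorflow requirements are, which matches the two conflicts pip reports.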
CUML is just a library that contains an accelerated implementation of UMAP; no worries there. You're correct that there are some real annoyances around numba and numpy. I'm not sure whether you're on Linux, but there are some notes at the very end of this wiki page that might help: https://github.com/YaleDHLab/pix-plot/wiki/Ubuntu-20-&-22-with-GPU
I am on Manjaro Linux, but I have an Intel GPU. I therefore tried installing intel-extension-for-tensorflow 1.1.0, but it upgrades everything and breaks the pixplot requirements. Again, GPU support and speed are not crucial to me. From what I understand, it is rather hdbscan that could improve my clustering. But maybe I am mistaken?
Hello, I have the following packages running on Python 3.7.16:
Yet pixplot gives me the following error when accessing my dataset and metadata CSV:
I understand neither the errors nor why HDBSCAN is not available.
Thanks for your help!