I am using the cuML implementation of HDBSCAN for clustering and would like to ensure reproducibility across multiple runs. Is there currently any support for setting a random seed (e.g., via a random_state parameter) in the HDBSCAN algorithm to make the results deterministic?
If not, is there any plan to introduce such a feature in future releases?
@MohabGhobashy Could you please explain your use case? How different are your results across runs? If you have one, please also provide a minimal reproducer.
In general, it's hard to provide exact reproducibility in highly parallel environments.
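One way to quantify "how different" two runs are is to compare their label arrays with a score that ignores how cluster IDs are numbered. The sketch below is only an illustration, not part of cuML: `adjusted_rand_index` is a hypothetical helper written with the standard library, and it assumes the labels from each run have already been copied to plain Python lists. The noise label `-1` is treated as just another group.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same points.

    Returns 1.0 when both labelings describe the same partition, even if
    the cluster IDs are numbered differently. Values near 0.0 indicate
    agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label arrays must have the same length")

    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        # Zero or one point: the partitions are trivially identical.
        return 1.0

    # Pairs of points grouped together in both labelings, in A, and in B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        # Degenerate case, e.g. every point in one cluster in both runs.
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Same partition, different cluster numbering: counts as identical.
run_1 = [0, 0, 1, 1, -1]
run_2 = [1, 1, 0, 0, -1]
print(adjusted_rand_index(run_1, run_2))
```

In a reproducer, `run_1` and `run_2` would be the labels from two separate fits on the same data with the same parameters. A score below 1.0 would show that the runs really do produce different partitions rather than just relabeled clusters.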