Radius Clustering is a Python package that implements clustering under a radius constraint, based on the Minimum Dominating Set (MDS) problem. The MDS problem is NP-hard, but it has been studied in the literature and shown to be linked to the problem of clustering under a radius constraint (see references for more details).
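To make the link between dominating sets and radius-constrained clustering concrete, here is a minimal sketch in pure Python. It is not the package's implementation: the greedy heuristic and the function names `greedy_dominating_set` and `assign_to_centers` are assumptions chosen for illustration. The idea is to connect two points whenever they lie within the radius of each other; any dominating set of that graph then gives a set of centers such that every point is within the radius of at least one center.

```python
import math
import random


def greedy_dominating_set(points, radius):
    """Return indices of points picked as centers by a simple greedy heuristic.

    Illustrative only: the package ships its own exact and approximate solvers.
    """
    n = len(points)
    # Vertex i is adjacent to j when the two points are at most `radius` apart.
    # Every vertex belongs to its own neighborhood (distance 0).
    neighbors = [
        {j for j in range(n) if math.dist(points[i], points[j]) <= radius}
        for i in range(n)
    ]
    uncovered = set(range(n))
    centers = []
    while uncovered:
        # Pick the vertex that dominates the most still-uncovered vertices.
        best = max(range(n), key=lambda i: len(neighbors[i] & uncovered))
        centers.append(best)
        uncovered -= neighbors[best]
    return centers


def assign_to_centers(points, centers):
    """Label each point with the position (in `centers`) of its nearest center."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(p, points[centers[k]]))
        for p in points
    ]


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(100)]
radius = 0.5

centers = greedy_dominating_set(points, radius)
labels = assign_to_centers(points, centers)
print(f"{len(centers)} clusters for {len(points)} points")
```

A greedy choice like this yields a valid dominating set but not necessarily a minimum one. Finding the smallest set is the NP-hard part, which is why the package offers both exact and approximate modes.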
- Implements both exact and approximate MDS-based clustering algorithms
- Compatible with scikit-learn's API for clustering algorithms
- Supports radius-constrained clustering
- Provides options for exact and approximate solutions
- Easy to use and integrate with existing Python data science workflows
- Includes comprehensive documentation and examples
- Full test coverage to ensure reliability and correctness
- Supports custom MDS solvers for flexibility in clustering approaches
- Provides a user-friendly interface for clustering tasks
Caution
Deprecation Notice: The threshold parameter of the RadiusClustering class is deprecated and is planned for complete removal in version 2.0.0. Please use the radius parameter instead to specify the clustering radius. The radius parameter is now the standard way to define the radius, in line with our objective of making parameter names more intuitive and user-friendly.
Note
NEW VERSIONS: The package is currently under active development for new features and improvements, including some refactoring and enhancements to the existing codebase. Backwards compatibility is not guaranteed, so please check the CHANGELOG for details on changes and updates.
- Version 1.4.0:
  - Add support for custom MDS solvers
  - Improve documentation and examples
  - Add more examples and tutorials
You can install Radius Clustering using pip:
pip install radius-clustering
Here's a basic example of how to use Radius Clustering:
import numpy as np
from radius_clustering import RadiusClustering
# Example usage
X = np.random.rand(100, 2) # Generate random data
# Create an instance of RadiusClustering
rad_clustering = RadiusClustering(manner="approx", radius=0.5)
# Fit the model to the data
rad_clustering.fit(X)
# Get cluster labels
labels = rad_clustering.labels_
print(labels)
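Once labels are available, it can be useful to confirm that each cluster really fits within the chosen radius. The sketch below is an assumption-laden illustration, not part of the package: `cluster_radii` is a hypothetical helper that takes only points and labels, tries every member of a cluster as its center, and reports the smallest radius that covers the whole cluster. It uses toy data in place of X and labels so that it runs without the package installed.

```python
import math


def cluster_radii(points, labels):
    """Map each label to the smallest radius achievable with a member as center.

    Hypothetical helper for illustration; it relies only on points and labels.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    radii = {}
    for label, members in clusters.items():
        # Try every member as center and keep the one whose farthest
        # cluster member is closest.
        radii[label] = min(
            max(math.dist(center, p) for p in members) for center in members
        )
    return radii


# Toy data standing in for X and the fitted labels.
points = [(0.0, 0.0), (0.3, 0.0), (0.0, 0.4), (2.0, 2.0), (2.2, 2.0)]
labels = [0, 0, 0, 1, 1]

radii = cluster_radii(points, labels)
radius = 0.5
# Every cluster should be coverable by a ball of the requested radius.
print(all(r <= radius for r in radii.values()))
```

With the fitted model from the example above, the same check could be run as `cluster_radii(X.tolist(), labels.tolist())`. The helper compares every pair of points within a cluster, so it is best suited to small or medium datasets.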
You can find the full documentation for Radius Clustering here.
To build the documentation locally, run the following commands, assuming all required dependencies are installed:
cd docs
make html
Then you can open the index.html file in the build directory to view the full documentation.
For more information please refer to the official documentation.
If you want insights on how the algorithm works, please refer to the presentation.
If you want to know more about the experiments conducted with the package, please refer to the experiments.
Contributions to Radius Clustering are welcome!
Please read the CONTRIBUTING.md file for details on how to contribute to the project. Please note that the project is released with a Code of Conduct, and we expect all contributors to adhere to it.
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
If you use Radius Clustering in your research, please cite the following paper and the software itself:
@inproceedings{haenn_clustering2024,
TITLE = {{Clustering Under Radius Constraints Using Minimum Dominating Sets}},
AUTHOR = {Haenn, Quentin and Chardin, Brice and Baron, Micka{\"e}l},
URL = {https://hal.science/hal-04533921},
BOOKTITLE = {{Lecture Notes in Artificial Intelligence}},
ADDRESS = {Poitiers, France},
PUBLISHER = {{Springer}},
YEAR = {2024},
MONTH = Jun,
KEYWORDS = {Constrained Clustering ; Radius Based Clustering ; Minimum Dominating Set ; Constrained Clustering Radius Based Clustering Minimum Dominating Set},
PDF = {https://hal.science/hal-04533921v1/file/clustering_under_radius_using_mds.pdf},
HAL_ID = {hal-04533921},
HAL_VERSION = {v1},
}
The two MDS algorithms implemented are forked and modified (or rewritten) from code by the following authors:
- Alejandra Casado for the minimum dominating set heuristic code [1]. We rewrote the code in C++ to meet the needs of Python interfacing.
- Hua Jiang for the minimum dominating set exact algorithm code [2]. The code has been adapted to meet the needs of Python interfacing.
The Radius Clustering work has been funded by:
- Quentin Haenn (core developer), LIAS, ISAE-ENSMA
- Brice Chardin, LIAS, ISAE-ENSMA
- Mickaël Baron, LIAS, ISAE-ENSMA