Petkomat/unsupervised_ranking

Ensemble- and Distance-based Feature Ranking and Selection for Unsupervised Learning

This repository contains the code for unsupervised feature ranking. We implement two approaches:

  • Ensemble-based feature rankings (computed from ensembles of predictive clustering trees),
  • Distance-based feature rankings (defined by the unsupervised Relief algorithm).

Both approaches follow the paradigm of predictive clustering, as implemented in Clus (written in Java).

Examples

The code is easy to use! For example, once we have our data stored in a numpy array x, we simply call

e_ranking_1 = EnsembleRanking()
scores_1 = e_ranking_1.fit(x)

or

e_ranking_2 = EnsembleRanking(score=["Genie3", "RForest"], ensemble="ExtraTrees", ensemble_size=[2, 3])
scores_2 = e_ranking_2.fit(x)

if we want to explicitly set the parameters. Similarly for URelief rankings:

relief_1 = URelief()
scores_1 = relief_1.fit(x)

relief_2 = URelief(iterations=[0.25, 0.9, 1.0], neighbours=[10, 15, 2000])
scores_2 = relief_2.fit(x)

For more examples, see src/example.py.
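
Turning scores into a ranking and a reduced feature set could be sketched as follows. This is a minimal sketch under stated assumptions: it supposes the scores are available as a 1-D array with one value per column of x, and that a higher score means a more relevant feature. The exact return type of fit is not described here, so the scores below are made-up placeholders rather than output of EnsembleRanking or URelief, and the helpers rank_features and select_top_k are hypothetical names introduced for illustration.

```python
import numpy as np


def rank_features(scores):
    """Return feature indices ordered from highest to lowest score."""
    scores = np.asarray(scores, dtype=float)
    # A stable sort on the negated scores keeps the original order among ties.
    return np.argsort(-scores, kind="stable")


def select_top_k(x, scores, k):
    """Keep the k highest-scoring columns of x (hypothetical helper)."""
    top = rank_features(scores)[:k]
    return x[:, top], top


# Placeholder data: 5 examples, 4 features.
x = np.arange(20, dtype=float).reshape(5, 4)
# Placeholder scores (assumed: one per feature, higher = more relevant).
scores = np.array([0.10, 0.70, 0.05, 0.40])

ranking = rank_features(scores)
x_reduced, kept = select_top_k(x, scores, k=2)
print(ranking)          # [1 3 0 2]
print(x_reduced.shape)  # (5, 2)
```

If fit returns several score vectors (for example, one per configured parameter value), the same helpers can be applied to each vector separately.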

Requirements

The code requires

  • Python 3.6 or higher, together with numpy and scipy,
  • Java 1.8 (because internally, Clus.jar is used).
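
A quick way to see whether these requirements are met might look like the sketch below. It is an illustration, not part of the repository: check_requirements is a hypothetical helper, and the Java check only confirms that some java executable is on the PATH, without verifying its version.

```python
import importlib.util
import shutil
import sys


def check_requirements():
    """Report which of the listed requirements appear to be satisfied."""
    return {
        # Python 3.6 or higher
        "python>=3.6": sys.version_info >= (3, 6),
        # find_spec returns None when a top-level package is not installed
        "numpy": importlib.util.find_spec("numpy") is not None,
        "scipy": importlib.util.find_spec("scipy") is not None,
        # Only checks that a `java` executable is on PATH, not its version
        "java on PATH": shutil.which("java") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print("{}: {}".format(name, "ok" if ok else "MISSING"))
```

Running java -version in a terminal afterwards shows which Java version is actually installed.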

License and citation

The code is under the CC BY-NC 4.0 licence. In brief, this means you can use the code for noncommercial purposes, provided you give us some greatly appreciated credit by citing

Petković, M, Kocev, D, Škrlj, B, Džeroski, S. Ensemble‐ and distance‐based feature ranking for unsupervised learning. Int J Intell Syst. 2021; 1–19. https://doi.org/10.1002/int.22390

with bibtex

@misc{petkovic2021,
      title={Ensemble- and Distance-Based Feature Ranking for Unsupervised Learning}, 
      author={Matej Petkovi\'{c} and Dragi Kocev and Bla\v{z} \v{S}krlj and Sa\v{s}o D\v{z}eroski},
      journal = {International Journal of Intelligent Systems},
      volume = {n/a},
      number = {n/a},
      pages = {1--19},
      keywords = {extra trees, feature ranking, relief, tree ensembles, unsupervised learning},
      doi = {https://doi.org/10.1002/int.22390},
      url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/int.22390},
      eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1002/int.22390},
}
