Topics include: K-means algorithm, initialization methods, normalization methods, internal validation and external validation.

JoshSample/Data-Clustering-Project

Data-Clustering-Project

This is the project repo for my Data Clustering class. Phase 1 is an implementation of the K-Means algorithm. Phase 2 covers normalization and initialization: attributes are normalized with min-max normalization, and K-Means starts from a random partition of the points into clusters rather than from randomly chosen initial centers. Phase 3 is internal validation, using the Calinski-Harabasz index and the Silhouette Coefficient to find the optimal number of clusters. Phase 4 is external validation, using the Rand statistic, the Jaccard coefficient, and the Fowlkes-Mallows index to find the best partition.
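As a rough illustration of Phases 2 and 4, here is a minimal Python sketch of min-max normalization, random-partition initialization, and the three pair-counting external indices. It is not the code from this repo: the function names (`min_max_normalize`, `random_partition_init`, `pair_counts`, `external_indices`) are hypothetical, Python is chosen only for illustration, and the handling of constant attributes and empty clusters is an assumption.

```python
import math
import random
from itertools import combinations


def min_max_normalize(points):
    """Rescale each attribute (column) to [0, 1].

    Assumption: a constant attribute is mapped to 0.0 to avoid dividing by zero.
    """
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(p, lows, highs)]
        for p in points
    ]


def random_partition_init(points, k, rng):
    """Assign every point to a random cluster and return (labels, centers)."""
    labels = [rng.randrange(k) for _ in points]
    # Assumption: force k distinct points into k distinct clusters so that
    # no cluster starts empty (requires len(points) >= k).
    for cluster, idx in enumerate(rng.sample(range(len(points)), k)):
        labels[idx] = cluster
    dims = len(points[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dims):
            sums[c][d] += p[d]
    # The initial centers are the means of the random clusters.
    centers = [[s / counts[c] for s in sums[c]] for c in range(k)]
    return labels, centers


def pair_counts(true_labels, pred_labels):
    """Count point pairs by agreement between the two partitions.

    a: same class, same cluster      b: same class, different cluster
    c: different class, same cluster d: different class, different cluster
    """
    a = b = c = d = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            a += 1
        elif same_true:
            b += 1
        elif same_pred:
            c += 1
        else:
            d += 1
    return a, b, c, d


def external_indices(true_labels, pred_labels):
    """Return (Rand statistic, Jaccard coefficient, Fowlkes-Mallows index)."""
    a, b, c, d = pair_counts(true_labels, pred_labels)
    rand = (a + d) / (a + b + c + d)
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    jaccard = a / (a + b + c) if (a + b + c) else 0.0
    fm = a / math.sqrt((a + b) * (a + c)) if (a + b) and (a + c) else 0.0
    return rand, jaccard, fm
```

For example, `external_indices([0, 0, 1, 1], [0, 0, 1, 2])` counts one pair grouped together in both partitions, one pair grouped together only in the true classes, and four pairs separated in both. All three indices are higher for better agreement with the true classes, so the best partition is the one that maximizes them.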
