Data-Clustering-Project

This is the project repo for my Data Clustering class. Phase 1 is an implementation of the K-Means algorithm. Phase 2 covers normalization and initialization: attributes are normalized with min-max normalization, and clusters are initialized with a random partition (random initial clusters) instead of random initial centers. Phase 3 is internal validation, using the Calinski-Harabasz index and the Silhouette Coefficient to find the optimal number of clusters. Phase 4 is external validation, using the Rand statistic, the Jaccard coefficient, and the Fowlkes-Mallows index to find the best partition.
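
As a rough illustration of phases 1, 2, and 4, here is a minimal Python sketch. It is not the code from this repo: the function names (`min_max_normalize`, `k_means`, `external_indices`), the empty-cluster handling, and the toy data are all assumptions made for the example. The internal validation indices of phase 3 are not shown.

```python
import random
from itertools import combinations
from math import sqrt


def min_max_normalize(points):
    """Rescale every attribute to [0, 1]; a constant attribute maps to 0."""
    cols = list(zip(*points))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(p, lo, hi)] for p in points]


def centroids_of(points, labels, k, rng):
    """Mean of each cluster. An empty cluster is reseeded with a random
    point (an assumption of this sketch, not a rule from the project)."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for j, v in enumerate(p):
            sums[c][j] += v
    return [[s / counts[c] for s in sums[c]] if counts[c]
            else list(rng.choice(points)) for c in range(k)]


def k_means(points, k, max_iter=100, seed=0):
    """K-Means with random-partition initialization: every point starts
    in a randomly chosen cluster, and the first centers are the means."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in points]
    for _ in range(max_iter):
        centers = centroids_of(points, labels, k, rng)
        # Assign each point to the nearest center (squared Euclidean distance).
        new_labels = [min(range(k), key=lambda c: sum(
            (a - b) ** 2 for a, b in zip(p, centers[c]))) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids_of(points, labels, k, rng)


def external_indices(truth, pred):
    """Rand statistic, Jaccard coefficient, and Fowlkes-Mallows index,
    computed by counting point pairs."""
    a = b = c = d = 0
    for i, j in combinations(range(len(truth)), 2):
        same_t = truth[i] == truth[j]
        same_p = pred[i] == pred[j]
        if same_t and same_p:
            a += 1  # together in both partitions
        elif same_t:
            b += 1  # together only in the true partition
        elif same_p:
            c += 1  # together only in the predicted partition
        else:
            d += 1  # separated in both partitions
    rand = (a + d) / (a + b + c + d)
    jaccard = a / (a + b + c) if a + b + c else 1.0
    fm = a / sqrt((a + b) * (a + c)) if (a + b) and (a + c) else 0.0
    return rand, jaccard, fm


# Toy usage: two well-separated groups on a single attribute.
data = min_max_normalize([[0.0], [1.0], [100.0], [101.0]])
labels, centers = k_means(data, k=2)
scores = external_indices([0, 0, 1, 1], labels)
```

The pair-counting loop is quadratic in the number of points, which is fine for small data sets; a contingency table would be the usual choice for larger ones.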

Folders and files (62 commits):
CSCI4372_Data_Clustering
Phase_1.pdf
Phase_2.pdf
Phase_3.pdf
Phase_4.pdf
README.md

JoshSample/Data-Clustering-Project
