Coursework completed as part of the Methods for Data Science course at Imperial College London.
Exploration of two high-dimensional datasets: the Fashion-MNIST dataset and a dataset of text documents with citations. The approaches considered are:
- Supervised classification (Multi-Layer Perceptron, Convolutional Neural Networks, K-Nearest Neighbours)
- Graph analytics (Community Detection)
- Unsupervised learning (K-Means Clustering)
- Dimensionality reduction (Principal Component Analysis, Non-Negative Matrix Factorisation, Latent Dirichlet Allocation)
- K-Means Clustering
- Graph Analytics (Community Detection and Centrality Measures)
- Comparison between Communities and Clusters
- K-Nearest Neighbours
- Multi-Layer Perceptron
- Convolutional Neural Networks
- Evaluation and Comparison
- Poster visualising and explaining the results
- Principal Component Analysis
- Non-Negative Matrix Factorisation
- Latent Dirichlet Allocation