This repository contains an implementation of principal component analysis (PCA) in Scala and Spark. The PCA was part of a simple face detector trained on the faces in the wild dataset. The implementation was the final project for a lecture on big-data analytics, where both the topic and the programming languages could be chosen freely. It had to run on the university cluster, which used the Cloudera distribution of Spark 1.6. This version was already outdated at the time of the project and did not provide any functions to load and decode images. Therefore, the images were converted to grayscale and stored as CSV files with a Python script, and then loaded in Spark as text files.

Check the file main.scala for the implementation.
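
For orientation, the following is a minimal sketch of the approach described above: reading the grayscale CSV files as text and computing principal components. It is not the code from main.scala. It assumes one image per line with comma-separated pixel values, and it uses MLlib's `RowMatrix.computePrincipalComponents` in place of this repository's own PCA. The object name `PcaSketch`, the path `faces/*.csv`, and the value of `k` are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Matrix, Vector, Vectors}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.rdd.RDD

object PcaSketch {
  // Parse one CSV line of grayscale pixel values into a dense vector.
  // Assumption: each line holds exactly one flattened image.
  def parseLine(line: String): Vector =
    Vectors.dense(line.split(",").map(_.trim.toDouble))

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("pca-sketch"))

    // Hypothetical input path; Spark 1.6 has no image reader,
    // so the images are loaded as plain text lines.
    val rows: RDD[Vector] = sc.textFile("faces/*.csv")
      .filter(_.trim.nonEmpty)
      .map(parseLine)

    val matrix = new RowMatrix(rows)

    // Number of principal components to keep (hypothetical value,
    // must not exceed the number of pixels per image).
    val k = 20
    val components: Matrix = matrix.computePrincipalComponents(k)

    // Project every image onto the first k components.
    val projected: RowMatrix = matrix.multiply(components)

    println(s"components: ${components.numRows} x ${components.numCols}")
    println(s"projected rows: ${projected.numRows()}")
    sc.stop()
  }
}
```

Note that `multiply` projects the raw rows without subtracting the mean, so a face detector built on these projections would likely center the data first.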
