This repository contains the code used to simulate active sample selection on various datasets, all of which are included in the Matlab/data folder.
The full report can be found in the Report directory under thesis.pdf or here.
This project builds on top of a recommender system for the purpose of active learning.
As an example use case, suppose we have a user-movie database in matrix form with many missing entries. From the current data we may already know that users who like Star Wars 1 are very likely to like Star Wars 2. If we are given the opportunity to ask any user for his opinion on a film, asking about Star Wars 1 or 2 may therefore not be very useful for improving predictions. Instead, it may be worth asking a user for his opinion on Pulp Fiction. Each such question corresponds to an empty row-column entry in the original dataset. The aim of this project is thus to determine mathematical criteria that are useful for locating the best row-column coordinate to query when we are allowed to grow the dataset.
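To make the query-selection idea concrete, here is a minimal sketch of the selection loop. It is written in plain Python for illustration and is not the repository's Matlab code. The names `missing_entries`, `observed_count_score`, and `select_query` are hypothetical, and the scoring rule (prefer the entry whose row and column have the fewest known ratings) is only a placeholder assumption, not one of the criteria studied in the report.

```python
# Toy user-movie matrix: rows are users, columns are movies.
# None marks a rating that has not been observed yet.
ratings = [
    [5, 4, None],
    [4, 5, None],
    [1, None, 5],
]


def missing_entries(matrix):
    """Return the (row, column) coordinates of all unobserved entries."""
    return [
        (i, j)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value is None
    ]


def observed_count_score(matrix, i, j):
    """Placeholder criterion (an assumption, not taken from the report):
    the fewer ratings already known in row i and column j, the higher the score."""
    row_known = sum(value is not None for value in matrix[i])
    col_known = sum(row[j] is not None for row in matrix)
    return -(row_known + col_known)


def select_query(matrix, score=observed_count_score):
    """Return the missing coordinate with the highest score, or None if the
    matrix is already complete. Ties go to the first candidate in row order."""
    candidates = missing_entries(matrix)
    if not candidates:
        return None
    return max(candidates, key=lambda ij: score(matrix, ij[0], ij[1]))


best = select_query(ratings)
print("Next entry to query (row, column):", best)
```

Any other criterion can be tried by passing a different function as `score`, as long as it takes the matrix and a row and column index and returns a number where larger means more worth querying.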
References for non-original code can be found in the report's references.