Lesel is a library for large-scale semi-supervised machine learning on top of Hadoop. At the moment it contains implementations of:
- Expectation-Maximization for mixtures of axis-aligned Gaussians
- Stochastic gradient descent for the semi-supervised support vector machine (S3VM)
- Label propagation with a k-nearest-neighbor affinity matrix
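
To give a feel for the last item, here is a minimal in-memory sketch of label propagation over a k-nearest-neighbor graph. It is not Lesel's API and does not use Hadoop: the class name `LabelPropagationSketch`, its method names, the symmetric 0/1 affinity matrix, and the rule of clamping labeled points after every iteration are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only: not part of Lesel, runs in memory without Hadoop.
class LabelPropagationSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /** Symmetric 0/1 affinity matrix linking each point to its k nearest neighbors. */
    static double[][] knnAffinity(double[][] points, int k) {
        int n = points.length;
        double[][] w = new double[n][n];
        for (int i = 0; i < n; i++) {
            final int row = i;
            Integer[] order = new Integer[n];
            for (int j = 0; j < n; j++) {
                order[j] = j;
            }
            // Sort all point indices by distance to the current point.
            Arrays.sort(order, (x, y) -> Double.compare(
                    squaredDistance(points[row], points[x]),
                    squaredDistance(points[row], points[y])));
            int added = 0;
            for (int idx = 0; idx < n && added < k; idx++) {
                int j = order[idx];
                if (j == row) {
                    continue; // a point is not its own neighbor
                }
                w[row][j] = 1.0;
                w[j][row] = 1.0; // symmetrize the graph
                added++;
            }
        }
        return w;
    }

    /**
     * labels[i] is a class index in [0, numClasses) for labeled points and -1 for
     * unlabeled ones. Returns the predicted class index for every point. A point
     * that no label mass ever reaches falls back to class 0.
     */
    static int[] propagate(double[][] points, int[] labels, int numClasses,
                           int k, int iterations) {
        int n = points.length;
        double[][] w = knnAffinity(points, k);
        double[][] f = new double[n][numClasses];
        for (int i = 0; i < n; i++) {
            if (labels[i] >= 0) {
                f[i][labels[i]] = 1.0;
            }
        }
        for (int it = 0; it < iterations; it++) {
            double[][] next = new double[n][numClasses];
            for (int i = 0; i < n; i++) {
                if (labels[i] >= 0) {
                    next[i][labels[i]] = 1.0; // clamp known labels
                    continue;
                }
                double degree = 0.0;
                for (int j = 0; j < n; j++) {
                    if (w[i][j] == 0.0) {
                        continue;
                    }
                    degree += w[i][j];
                    for (int c = 0; c < numClasses; c++) {
                        next[i][c] += w[i][j] * f[j][c];
                    }
                }
                if (degree > 0.0) {
                    // Replace the score by the average over graph neighbors.
                    for (int c = 0; c < numClasses; c++) {
                        next[i][c] /= degree;
                    }
                }
            }
            f = next;
        }
        int[] predicted = new int[n];
        for (int i = 0; i < n; i++) {
            int best = 0;
            for (int c = 1; c < numClasses; c++) {
                if (f[i][c] > f[i][best]) {
                    best = c;
                }
            }
            predicted[i] = best;
        }
        return predicted;
    }

    public static void main(String[] args) {
        // Two well-separated 1-D clusters with one labeled point each.
        double[][] points = {{0.0}, {1.0}, {2.5}, {10.0}, {11.0}, {12.5}};
        int[] labels = {0, -1, -1, 1, -1, -1};
        System.out.println(Arrays.toString(propagate(points, labels, 2, 2, 20)));
    }
}
```

The sketch builds a dense n-by-n matrix, so it only suits small inputs. A distributed version would need to shard the neighbor search and the propagation step, which is beyond what this example shows.
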
See the talk "How to do Predictive Analytics with Limited Data" at Hadoop World 2013 for an introduction to the algorithms used in Lesel.
For questions or feedback, feel free to contact Ulrich Rueckert (urueckert at datameer.com).
The use and distribution terms for this software are covered by the Apache License 2.0 (http://www.apache.org/licenses/), which can be found in the file LICENSE at the root of this distribution. By using this software in any fashion, you are agreeing to be bound by the terms of this license. You must not remove this notice, or any other, from this software.