To scale product recommendation on marketplaces such as Amazon, we build a locality-sensitive hashing (LSH) pipeline that replaces the expensive similarity computations behind recommendation models with probabilistic algorithms. With high probability, these return recommendations close to those of the original algorithm. We showcase MinHash, which approximates Jaccard similarity, and SimHash, which approximates cosine similarity on embeddings.
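The two estimators named above can be illustrated with a minimal, standard-library-only sketch. This is not the notebook's code. The function names, the use of salted BLAKE2b digests as a family of hash functions, and the Gaussian random hyperplanes are all assumptions made for illustration. The MinHash part relies on the fact that two sets share the same minimum hash value with probability equal to their Jaccard similarity. The SimHash part relies on the fact that a random hyperplane separates two vectors with probability equal to their angle divided by pi.

```python
import hashlib
import math
import random


def minhash_signature(items, num_hashes=128):
    """Hypothetical helper: MinHash signature of a non-empty set of strings.

    Each salt value stands in for one independent hash function; the
    signature keeps the minimum hash value seen under each of them.
    """
    signature = []
    for i in range(num_hashes):
        salt = i.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "little",
            )
            for item in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree, an estimate of Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def make_hyperplanes(num_bits, dim, seed=0):
    """Hypothetical helper: num_bits random Gaussian hyperplanes of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def simhash_bits(vector, hyperplanes):
    """One bit per hyperplane: True if the vector lies on its non-negative side."""
    return [sum(w * x for w, x in zip(plane, vector)) >= 0 for plane in hyperplanes]


def estimate_cosine(bits_a, bits_b):
    """Convert the fraction of differing bits into an angle, then take its cosine."""
    differing = sum(a != b for a, b in zip(bits_a, bits_b))
    return math.cos(math.pi * differing / len(bits_a))


if __name__ == "__main__":
    # Made-up example baskets; their true Jaccard similarity is 3/5.
    basket_a = {"usb-c cable", "phone case", "screen protector", "charger"}
    basket_b = {"usb-c cable", "phone case", "charger", "earbuds"}
    print(estimate_jaccard(minhash_signature(basket_a), minhash_signature(basket_b)))

    # Made-up 4-dimensional "embeddings".
    planes = make_hyperplanes(num_bits=256, dim=4)
    u = [0.9, 0.1, 0.3, 0.5]
    v = [0.8, 0.2, 0.4, 0.4]
    print(estimate_cosine(simhash_bits(u, planes), simhash_bits(v, planes)))
```

Both estimates get tighter as `num_hashes` or `num_bits` grows, at the cost of longer signatures. A full LSH pipeline would additionally group signatures into bands or buckets so that only likely-similar items are compared, which this sketch leaves out.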
Open the `.ipynb` notebook in a Jupyter server and run it. The notebook is intended to be self-explanatory.
Completed as part of the Probabilistic Algorithms course at Rice University by a team of three: Hemanth Kumar Jayakumar, Sharath Giri, and Anitesh Reddy.