Predicting Fine-Grained Sentiments For Scraped Amazon Reviews Using SVM and FastText Models Trained On Stanford NLP Treebank
- 'amazon_review.py' contains the code to scrape Amazon reviews.
- contains the code to generate train ('sst_train.txt'), dev ('sst_dev.txt'), and test ('sst_test.txt') files and perform EDA.
- 'svm_train_and_predict.py' contains the code to train the SVM model, predict sentiments, and store the predictions as a CSV ('svm_predicted_sentiments.csv').
- 'train_fasttext_sentiment_analysis.py' contains the code to train the FastText model and store both the non-quantized ('sst.bin') and quantized ('sst_quantized.ftz') models.
- 'fasttext_predict_sentiment.py' contains the code to predict sentiments for the Amazon reviews using FastText and store the predictions as a CSV ('fastText_predicted_sentiments.csv').
- contains the code to visualize the results.
- 'customer_reviews.csv' contains the scraped reviews.
- 'svm_predicted_sentiments.csv' contains the sentiments predicted using SVM.
- 'fastText_predicted_sentiments.csv' contains the sentiments predicted using FastText.
Run the following command to execute 'amazon_review.py' and store the scraped reviews in 'customer_reviews.csv':
scrapy runspider amazon_review.py -o customer_reviews.csv
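Once the scrape finishes, it can help to confirm that the output file is not empty before moving on. The snippet below is a minimal sketch using only the standard library; the helper name `summarize_csv` is hypothetical and not part of this repository, and it makes no assumption about which columns the spider emits.

```python
import csv
import os


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Count data rows; quoted multi-line review text counts as one row.
        row_count = sum(1 for _ in reader)
        # fieldnames comes from the header row (None if the file is empty).
        columns = list(reader.fieldnames or [])
    return columns, row_count


if __name__ == "__main__":
    # 'customer_reviews.csv' is the file written by the scrapy command above.
    if os.path.exists("customer_reviews.csv"):
        cols, n = summarize_csv("customer_reviews.csv")
        print(f"{n} reviews scraped; columns: {cols}")
```

If the row count is zero, the spider most likely ran without collecting any reviews, and the later prediction steps would have nothing to work on.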
Run the following command to train the SVM model and predict sentiments:
python svm_train_and_predict.py
Running this script generates 'svm_predicted_sentiments.csv', which contains the predicted sentiments.
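For readers who want a feel for what an SVM text classifier of this kind involves, here is a minimal sketch using scikit-learn. It is not the contents of 'svm_train_and_predict.py'. It assumes each line of the 'sst_*.txt' files looks like `__label__<class> <sentence>`, which is a guess based on the FastText convention, and the function names and the TF-IDF plus `LinearSVC` pipeline are illustrative choices.

```python
import os


def parse_labelled_line(line, prefix="__label__"):
    """Split '<prefix><class> <sentence>' into (class, sentence).

    The line format is an assumption about the 'sst_*.txt' files.
    """
    parts = line.strip().split(None, 1)  # split on the first whitespace run
    label = parts[0]
    text = parts[1] if len(parts) > 1 else ""
    if not label.startswith(prefix):
        raise ValueError(f"unexpected label format: {label!r}")
    return int(label[len(prefix):]), text


def load_labelled(path):
    """Read a labelled file and return (texts, labels)."""
    with open(path, encoding="utf-8") as f:
        pairs = [parse_labelled_line(line) for line in f if line.strip()]
    return [text for _, text in pairs], [label for label, _ in pairs]


def train_svm(texts, labels):
    """Fit a TF-IDF + linear SVM pipeline (illustrative settings)."""
    # Imported here so the parsing helpers work without scikit-learn.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import Pipeline
    from sklearn.svm import LinearSVC

    model = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
        ("svm", LinearSVC()),
    ])
    model.fit(texts, labels)
    return model


if __name__ == "__main__":
    if os.path.exists("sst_train.txt") and os.path.exists("sst_dev.txt"):
        model = train_svm(*load_labelled("sst_train.txt"))
        dev_texts, dev_labels = load_labelled("sst_dev.txt")
        print("dev accuracy:", model.score(dev_texts, dev_labels))
```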
Run the following command to train the FastText model. Training takes around 3-5 minutes on a CPU.
python train_fasttext_sentiment_analysis.py
After training, the full model is saved as 'sst.bin' and the quantized model as 'sst_quantized.ftz'.
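The train-then-quantize flow can be sketched with the `fasttext` Python package as follows. This is an illustration under assumptions, not the actual training script: the hyperparameters (`lr`, `epoch`, `wordNgrams`, `cutoff`) are placeholder values, and `train_and_quantize` and `size_mb` are hypothetical helper names.

```python
import os


def size_mb(path):
    """Return the size of a file in mebibytes."""
    return os.path.getsize(path) / (1024 * 1024)


def train_and_quantize(train_path="sst_train.txt",
                       bin_path="sst.bin",
                       ftz_path="sst_quantized.ftz"):
    """Train a supervised model, save it, then quantize and save again."""
    # Imported here so size_mb() can be used without fasttext installed.
    import fasttext

    # Illustrative hyperparameters; the repository's values may differ.
    model = fasttext.train_supervised(
        input=train_path, lr=0.5, epoch=25, wordNgrams=2
    )
    model.save_model(bin_path)

    # quantize() compresses the model in place; retraining with a
    # vocabulary cutoff follows the example in the fastText documentation.
    model.quantize(input=train_path, qnorm=True, retrain=True, cutoff=100000)
    model.save_model(ftz_path)
    return model


if __name__ == "__main__":
    if os.path.exists("sst_train.txt"):
        train_and_quantize()
        print(f"sst.bin: {size_mb('sst.bin'):.1f} MiB")
        print(f"sst_quantized.ftz: {size_mb('sst_quantized.ftz'):.1f} MiB")
```

Comparing the two file sizes shows how much space quantization saves.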
Run the following command to predict sentiments for the scraped reviews with the quantized FastText model:
python fasttext_predict_sentiment.py
This script writes 'fastText_predicted_sentiments.csv', which contains the predicted sentiments.
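As a rough sketch of the prediction step, the snippet below loads the quantized model and maps FastText's `__label__N` strings back to integer classes. It is not the contents of 'fasttext_predict_sentiment.py'; the helper names and the sample review are hypothetical. It relies only on the model having a `predict(text, k=1)` method that returns `(labels, probabilities)`, which is how the `fasttext` package behaves for a single string.

```python
import os


def label_to_class(label, prefix="__label__"):
    """Convert a FastText label such as '__label__4' to the integer 4."""
    if not label.startswith(prefix):
        raise ValueError(f"unexpected label format: {label!r}")
    return int(label[len(prefix):])


def predict_reviews(model, reviews):
    """Return a list of (predicted_class, probability), one per review."""
    results = []
    for review in reviews:
        # Collapse whitespace so no newline reaches predict(), which
        # processes one line of text at a time.
        text = " ".join(review.split())
        labels, probs = model.predict(text, k=1)
        results.append((label_to_class(labels[0]), float(probs[0])))
    return results


if __name__ == "__main__":
    if os.path.exists("sst_quantized.ftz"):
        import fasttext  # only needed when a real model file is available

        model = fasttext.load_model("sst_quantized.ftz")
        # Hypothetical review text, for illustration only.
        print(predict_reviews(model, ["Works well,\nbattery life is short."]))
```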
Thus, SVM outperforms FastText on the test dataset of reviews scraped from Amazon.
For any queries or feedback, please contact:
Lakshay Mehra: mehralakshay2@gmail.com