Naive-Bayes-and-BERT-models-for-Classification-of-Textual-Data

This project involved building a Naive Bayes model from scratch and fine-tuning a BERT model with the PyTorch pre-trained BERT library to classify IMDB movie reviews as positive or negative. The text data was preprocessed differently for the two models: the Naive Bayes pipeline used lemmatization, while the BERT pipeline used the model's own tokenization. The BERT model outperformed the Naive Bayes model on every metric, making it the more accurate and reliable model for this task.

Pre-training on an external corpus, as BERT does, benefits movie review prediction by letting the model learn general language representations that can then be fine-tuned on the specific task. Deep learning methods tend to perform better than simpler machine learning methods (such as Naive Bayes) on complex tasks that require feature extraction, while simpler methods can be effective on easier tasks and demand far fewer computational resources. The choice between deep learning and traditional machine learning therefore depends on the specific task and the available resources.
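
For reference, here is a minimal sketch of what a from-scratch Naive Bayes text classifier can look like: a multinomial model over bag-of-words counts with Laplace smoothing. The class name, smoothing constant, and simple regex tokenizer are illustrative assumptions, not the exact code in this repository (which additionally lemmatizes tokens before counting).

```python
import math
import re
from collections import Counter, defaultdict

class MultinomialNaiveBayes:
    """Bag-of-words multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha          # smoothing constant
        self.log_priors = {}        # log P(class)
        self.log_likelihoods = {}   # class -> {word: log P(word | class)}
        self.vocab = set()

    @staticmethod
    def _tokenize(text):
        # The repository's pipeline lemmatizes tokens (e.g. with NLTK);
        # plain lowercased word tokens are used here to keep the sketch
        # dependency-free.
        return re.findall(r"[a-z']+", text.lower())

    def fit(self, texts, labels):
        class_counts = Counter(labels)
        word_counts = defaultdict(Counter)      # class -> word -> count
        for text, label in zip(texts, labels):
            word_counts[label].update(self._tokenize(text))
        self.vocab = {w for counts in word_counts.values() for w in counts}
        n_docs = len(labels)
        for label, n in class_counts.items():
            self.log_priors[label] = math.log(n / n_docs)
            total = sum(word_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            self.log_likelihoods[label] = {
                w: math.log((word_counts[label][w] + self.alpha) / denom)
                for w in self.vocab
            }

    def predict(self, text):
        # Score each class by log prior plus summed log likelihoods of the
        # in-vocabulary tokens; unseen words are ignored.
        tokens = [t for t in self._tokenize(text) if t in self.vocab]
        scores = {
            label: prior + sum(self.log_likelihoods[label][t] for t in tokens)
            for label, prior in self.log_priors.items()
        }
        return max(scores, key=scores.get)


nb = MultinomialNaiveBayes()
nb.fit(["a great, moving film", "dull and badly acted"], ["pos", "neg"])
print(nb.predict("a moving performance"))   # -> "pos"
```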

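The BERT side can be sketched with the Hugging Face transformers API (the successor to the pytorch-pretrained-bert package referenced above). The checkpoint name, learning rate, epoch count, and toy batch below are assumptions for illustration, not the repository's actual training script.

```python
import torch
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

reviews = ["A moving, beautifully shot film.", "Two hours I will never get back."]
labels = torch.tensor([1, 0])   # 1 = positive, 0 = negative

# Tokenization replaces the lemmatization step used for the Bayes model:
# WordPiece sub-word tokens, padding, and attention masks.
batch = tokenizer(reviews, padding=True, truncation=True, max_length=256,
                  return_tensors="pt")

optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):              # a few passes over the (toy) batch
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Inference: the argmax over the two logits gives the predicted sentiment.
model.eval()
with torch.no_grad():
    logits = model(**batch).logits
print(logits.argmax(dim=-1))    # predicted classes for the two reviews
```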