Naive-Bayes-and-BERT-models-for-Classification-of-Textual-Data

This project involved building a Naive Bayes model from scratch and fine-tuning a BERT model with the PyTorch pre-trained BERT library to classify IMDB movie reviews as positive or negative. The text data was preprocessed differently for the two models: the Naive Bayes pipeline used lemmatization, while the BERT pipeline used the model's own tokenization. The BERT model outperformed the Naive Bayes model on every metric, making it the more accurate and reliable model for this task.

Pre-training on an external corpus, as BERT does, benefits movie review prediction by letting the model learn general language representations that can then be fine-tuned on the specific task. Deep learning methods tend to perform better than simpler machine learning methods (such as Naive Bayes) on complex tasks that require feature extraction, while simpler methods can be effective on easier tasks and demand far fewer computational resources. The choice between deep learning and traditional machine learning therefore depends on the specific task and the available resources.
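
For reference, here is a minimal sketch of what a from-scratch Naive Bayes text classifier can look like: a multinomial model over bag-of-words counts with Laplace smoothing. The class name, smoothing constant, and simple regex tokenizer are illustrative assumptions, not the exact code in this repository (which additionally lemmatizes tokens before counting).

```python
import math
import re
from collections import Counter, defaultdict

class MultinomialNaiveBayes:
    """Bag-of-words multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha          # smoothing constant
        self.log_priors = {}        # log P(class)
        self.log_likelihoods = {}   # class -> {word: log P(word | class)}
        self.vocab = set()

    @staticmethod
    def _tokenize(text):
        # The repository's pipeline lemmatizes tokens (e.g. with NLTK);
        # plain lowercased word tokens are used here to keep the sketch
        # dependency-free.
        return re.findall(r"[a-z']+", text.lower())

    def fit(self, texts, labels):
        class_counts = Counter(labels)
        word_counts = defaultdict(Counter)      # class -> word -> count
        for text, label in zip(texts, labels):
            word_counts[label].update(self._tokenize(text))
        self.vocab = {w for counts in word_counts.values() for w in counts}
        n_docs = len(labels)
        for label, n in class_counts.items():
            self.log_priors[label] = math.log(n / n_docs)
            total = sum(word_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            self.log_likelihoods[label] = {
                w: math.log((word_counts[label][w] + self.alpha) / denom)
                for w in self.vocab
            }

    def predict(self, text):
        # Score each class by log prior plus summed log likelihoods of the
        # in-vocabulary tokens; unseen words are ignored.
        tokens = [t for t in self._tokenize(text) if t in self.vocab]
        scores = {
            label: prior + sum(self.log_likelihoods[label][t] for t in tokens)
            for label, prior in self.log_priors.items()
        }
        return max(scores, key=scores.get)


nb = MultinomialNaiveBayes()
nb.fit(["a great, moving film", "dull and badly acted"], ["pos", "neg"])
print(nb.predict("a moving performance"))   # -> "pos"
```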

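The BERT side can be sketched with the Hugging Face transformers API (the successor to the pytorch-pretrained-bert package referenced above). The checkpoint name, learning rate, epoch count, and toy batch below are assumptions for illustration, not the repository's actual training script.

```python
import torch
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

reviews = ["A moving, beautifully shot film.", "Two hours I will never get back."]
labels = torch.tensor([1, 0])   # 1 = positive, 0 = negative

# Tokenization replaces the lemmatization step used for the Bayes model:
# WordPiece sub-word tokens, padding, and attention masks.
batch = tokenizer(reviews, padding=True, truncation=True, max_length=256,
                  return_tensors="pt")

optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):              # a few passes over the (toy) batch
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Inference: the argmax over the two logits gives the predicted sentiment.
model.eval()
with torch.no_grad():
    logits = model(**batch).logits
print(logits.argmax(dim=-1))    # predicted classes for the two reviews
```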