GitHub - crawles/gpdb_sentiment_analysis_twitter_model: Build a sentiment classifier using PL/Python on PostgreSQL, Greenplum Database, or Apache HAWQ

Sentiment Classifier in PL/Python

The build-sentiment-classifier.ipynb Jupyter Notebook builds a Twitter sentiment classifier using PL/Python on PostgreSQL, Greenplum Database, or Apache HAWQ, and exports it as a serialized model, twitter_sentiment_model.pkl. The classifier follows the approach of Go et al. and is trained on the Sentiment140 data, which can be downloaded from the Sentiment140 website.
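The export step amounts to serializing a fitted model so it can be reloaded elsewhere. The following is a minimal sketch of that round trip using Python's standard pickle module. The toy word-weight dictionary and the predict helper are hypothetical stand-ins for illustration only; they are not the model or code from the notebook.

```python
import pickle

def predict(model, text):
    """Score a text with a toy bag-of-words model (hypothetical stand-in)."""
    tokens = text.lower().split()
    score = model["bias"] + sum(model["weights"].get(t, 0.0) for t in tokens)
    return "positive" if score >= 0 else "negative"

# Hypothetical toy parameters, NOT the weights of twitter_sentiment_model.pkl.
model = {
    "weights": {"love": 1.5, "great": 1.0, "hate": -1.5, "awful": -1.0},
    "bias": 0.0,
}

# Serialize to bytes; the same bytes could be written to a .pkl file.
payload = pickle.dumps(model)

# Deserialize and use the restored model for prediction.
restored = pickle.loads(payload)
print(predict(restored, "I love this"))
print(predict(restored, "awful service"))
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.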

The classifier achieves 80% accuracy on a test set of several hundred annotated tweets. The training set consists of 1.6 million tweets labeled automatically by assuming that any tweet containing a positive emoticon, such as :), is positive, and any tweet containing a negative emoticon, such as :(, is negative. This technique is called distant supervision: the emoticons serve as noisy labels.
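The labeling rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code used to build Sentiment140: the emoticon lists are assumed, label_tweet is a hypothetical helper, and the choices to skip tweets with mixed or no emoticons and to strip emoticons from the text are assumptions made so that a classifier would have to learn from the remaining words.

```python
# Assumed emoticon lists; the lists used to collect Sentiment140 may differ.
POSITIVE_EMOTICONS = (":)", ":-)", ":D", "=)")
NEGATIVE_EMOTICONS = (":(", ":-(")

def label_tweet(text):
    """Return (cleaned_text, label), or None when the emoticon signal is absent or mixed."""
    has_pos = any(e in text for e in POSITIVE_EMOTICONS)
    has_neg = any(e in text for e in NEGATIVE_EMOTICONS)
    if has_pos == has_neg:
        # Neither kind or both kinds present: no usable noisy label.
        return None
    cleaned = text
    for emoticon in POSITIVE_EMOTICONS + NEGATIVE_EMOTICONS:
        cleaned = cleaned.replace(emoticon, "")
    # Collapse whitespace left behind by the removed emoticons.
    cleaned = " ".join(cleaned.split())
    return cleaned, ("positive" if has_pos else "negative")

tweets = [
    "Loving the new release :)",
    "My flight got cancelled :-(",
    "Just landed in Boston",
]
labeled = [result for result in map(label_tweet, tweets) if result is not None]
print(labeled)
```

Because the labels come from a heuristic rather than human annotation, some are wrong (sarcasm, for example), which is why they are called noisy.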

Additional Resources

Author

Chris Rawles

