tweet_learn

This project uses machine learning methods to predict the following about tweets (posts on Twitter):

Are tweets informative or non-informative?
Will a tweet be re-tweeted by another user?

Installation

1. Clone the repository

git clone https://github.com/bgold09/tweet_learn.git
cd tweet_learn

2. Install required Python packages

Use your preferred method (pip, apt-get, etc.) to install the Python packages required by tweet_learn.

3. Install Stanford Named Entity Recognizer (NER)

Download and unpack the Stanford Named Entity Recognizer:

wget http://nlp.stanford.edu/software/stanford-ner-2014-01-04.zip
unzip stanford-ner-2014-01-04.zip

Start a local NER Java server from the unpacked directory. Run it in a separate terminal window, because starting the process in the background causes the server to function improperly:

java -mx1000m -cp stanford-ner.jar edu.stanford.nlp.ie.NERServer -loadClassifier classifiers/ner-eng-ie.crf-3-all2008-distsim.ser.gz -port 8080 -outputFormat inlineXML
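Once the server is running, a quick way to confirm it responds is to send it a line of text over a socket. The following is a minimal sketch, not code from this repository. It assumes Python 3, that the server listens on localhost at the port given above, and that it reads one newline-terminated line and closes the connection after replying. The names `query_ner_server` and `parse_inline_xml` are hypothetical helpers introduced for illustration.

```python
import re
import socket

# Matches one inline XML entity such as <PERSON>Ada Lovelace</PERSON>.
ENTITY_PATTERN = re.compile(r"<([A-Z]+)>(.+?)</\1>")


def query_ner_server(text, host="localhost", port=8080, timeout=10.0):
    """Send one line of text to the NER server and return its raw reply.

    Assumption: the server reads a single newline-terminated line and
    closes the connection once it has written its answer.
    """
    # Collapse whitespace so the whole tweet is sent as a single line.
    line = " ".join(text.split()) + "\n"
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(line.encode("utf-8"))
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # an empty read means the server closed the socket
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8").strip()


def parse_inline_xml(tagged):
    """Return (entity_type, entity_text) pairs found in inlineXML output."""
    return [(m.group(1), m.group(2)) for m in ENTITY_PATTERN.finditer(tagged)]
```

With the server up, `parse_inline_xml(query_ner_server("some tweet text"))` would return the tagged entities. `parse_inline_xml` can also be tried on its own against a hand-written tagged string, without a running server.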

4. Create a MySQL database and required tables

mysql -u <username> -p -e 'CREATE DATABASE twitter;'
mysql -u <username> -p twitter < data/users_backup.sql

From a Python session:

>>> import tweet_learn as tl
>>> tl.store_initial_data("train_test_set")
>>> tl.add_centrality_feature("train_test_set")

5. Extract the data and targets

From a Python session:

>>> ml = tl.extract_transform_data("train_test_set", 0, 1001)

6. Run tests

See confusion.py, score.py and roc.py for methods of evaluating the quality of your models.
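To illustrate the kind of evaluation these scripts are concerned with, here is a minimal, dependency-free sketch of a binary confusion matrix and the scores derived from it. It is not the code in confusion.py or score.py. The function names and the convention that label 1 means "positive" (for example, an informative tweet) are assumptions made for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for binary labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def summary_scores(counts):
    """Derive accuracy, precision and recall from confusion counts.

    A ratio with a zero denominator is reported as 0.0.
    """
    tp, fp, tn, fn = (counts[k] for k in ("tp", "fp", "tn", "fn"))
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels purely for demonstration; not results from this project.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
scores = summary_scores(counts)
```

For these made-up labels the counts are 2 true positives, 1 false positive, 2 true negatives and 1 false negative, so accuracy, precision and recall all come out to 2/3.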

License

Licensed under the MIT license.
