Deep Learning for Big Crisis Data

This repository will host Python implementation of a number of deep neural networks classifiers for the classification of crisis-related data on Twitter.

Requirementes:

 python 2.7
 numpy, scikit-learn
 keras, tensorflow or theano backend

Dataset and Pre-process A sample of tweet data (data/sample.csv) is a .csv format with three columns

 First, we need to pre-process tweets data: remove urls, special characters, lowercasing…

 - python data_helpers/preprocess.py data/sample.csv
 
 Split pre-processed data (data/sample_prccd.csv) into train, test and dev part.

 - python data_helpers/split_data.py data/sample_prccd.csv

Training a neural net model

To train a classifier we create a folder containing links to train, test and dev part (data/nn_data)

Folder embeddings/ includes word vector file, we provide our pre-trained crisis word vectors, we also can use Google word embedding here

Folder dnn_scrips/ contains all neural nets models: CNN, RNN_LSTM, MLP…
- bash run_cnn.sh to train a model with different parameters.
See the results and training process in .log file
```
 Note that: if you do not have a GPU, you can run on CPU by change the theano flag to THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32
```

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
data		data
data_helpers		data_helpers
dnn_scripts		dnn_scripts
embeddings		embeddings
README.md		README.md
run_cnn.sh		run_cnn.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Deep Learning for Big Crisis Data

About

Releases

Packages

Languages

nikolaradulov/deep-learning-for-big-crisis-data

Folders and files

Latest commit

History

Repository files navigation

Deep Learning for Big Crisis Data

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages