Text -> word embedding (word2vec) -> convolution -> max pooling -> sentence feature + extra feature layer -> softmax
- word embedding: trained with word2vec on ~50 million (5kw) Weibo messages, filtered by a minimum length of 10 words (gensim sketch below)
- conv-net: multiple filter sizes; each filter yields one feature through max-pooling (model sketch after this list)
- keyphrase extraction: for each filter size, take the phrase most often selected by max-pooling
- feature combination: 300 sentence-level features concatenated with sentiment and word-entity features
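The components above follow the standard convolution + max-over-time-pooling text-CNN pattern. As a hedged illustration only (the repository's actual model code is not reproduced here), a minimal PyTorch sketch of the forward pass; all layer names and sizes are assumptions, chosen so that 100 feature maps x 3 filter widths give the 300 sentence-level features mentioned above:

```python
# Sketch of the described architecture (not the repository's code).
# Assumed sizes: 300-d word2vec embeddings, filter widths 3/4/5,
# 100 feature maps per width, plus an extra sentiment/entity feature vector.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RumorCNN(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, filter_widths=(3, 4, 5),
                 n_maps=100, n_extra=20, n_classes=2, dropout=0.5):
        super().__init__()
        # index 0 is the NULL/padding word
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_maps, w) for w in filter_widths])
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(n_maps * len(filter_widths) + n_extra, n_classes)

    def forward(self, word_ids, extra_feats):
        # word_ids: (batch, max_length), extra_feats: (batch, n_extra)
        x = self.emb(word_ids).transpose(1, 2)        # (batch, emb_dim, len)
        # each filter contributes one feature via max-over-time pooling
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        sent_feat = torch.cat(pooled, dim=1)          # 300 sentence-level features
        feats = torch.cat([sent_feat, extra_feats], dim=1)
        return self.fc(self.dropout(feats))           # softmax applied in the loss
```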
preprocessing: CNNPreprocess.java; extra features: WeiboFeature/WeiboFeatureExtrator.java
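For reference, the pre-trained word2vec file consumed by the preprocessing script below could be produced as in the following gensim sketch. This is an assumed setup, not the tool actually used for the Weibo embeddings; corpus path, dimensionality, and the length filter are illustrative:

```python
# Sketch: train word2vec on segmented Weibo text and save a vector file.
# Paths and parameters are assumptions.
from gensim.models import Word2Vec

def load_corpus(path, min_len=10):
    # one segmented message per line; drop messages shorter than min_len words
    with open(path, encoding="utf-8") as f:
        for line in f:
            words = line.split()
            if len(words) >= min_len:
                yield words

sentences = list(load_corpus("weibo_segmented.txt"))
model = Word2Vec(sentences, vector_size=300, window=5, min_count=5,
                 sg=1, workers=8)
model.wv.save_word2vec_format("weibo_word2vec.bin", binary=True)
```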
- process_data_rumor.py
- input: word2vec file (pre-trained on a large-scale dataset), pkfile, nfold
- data_folder: Weibo messages, already word-segmented
- extra_fea: features selected by information gain (IG), stored as (mid, feature) pairs
- words not in word2vec are initialized from uniform(-0.25, 0.25)
- vocabulary: index 0 is reserved for NULL (also used to pad null words); other words start from 1
- output pkfile: sentences, word2vec, random_vectors, word->id, vocab, id->word (construction sketched below)
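A minimal sketch of the vocabulary and embedding-matrix construction described above: index 0 reserved for NULL/padding, pre-trained vectors where available, and uniform(-0.25, 0.25) vectors for out-of-vocabulary words. Function and variable names are illustrative, not the actual ones in process_data_rumor.py:

```python
# Sketch of the vocabulary / embedding-matrix construction (names illustrative).
import numpy as np

def build_embeddings(vocab, w2v, dim=300):
    """vocab: iterable of words; w2v: dict mapping word -> pre-trained vector."""
    word_to_id = {"<NULL>": 0}               # 0 reserved for NULL / padding
    vectors = [np.zeros(dim)]                # padding row
    for i, word in enumerate(sorted(vocab), start=1):
        word_to_id[word] = i
        if word in w2v:
            vectors.append(w2v[word])
        else:
            # words missing from word2vec get uniform(-0.25, 0.25) vectors
            vectors.append(np.random.uniform(-0.25, 0.25, dim))
    id_to_word = {i: w for w, i in word_to_id.items()}
    return np.asarray(vectors, dtype="float32"), word_to_id, id_to_word
```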
- the first pass must compute the maximum sentence length (max_length) and pass it to the CNN; sentences are padded to this length (see the sketch below)
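Padding to max_length can be as simple as the following (illustrative only; the padding index 0 matches the NULL word above):

```python
# Sketch: pad word-id sequences with the NULL index (0) up to max_length.
def pad_sentences(id_seqs):
    max_length = max(len(s) for s in id_seqs)
    return [s + [0] * (max_length - len(s)) for s in id_seqs], max_length
```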
- mini-batch training; each epoch uses only 90% of the data for training
- mini-batches are shuffled
- weight initialization
- Adadelta for weight updates
- dropout on the hidden layer (training-loop sketch below)
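Putting the training notes together, a hedged PyTorch sketch of one possible training loop: per-epoch shuffling, a 90/10 split, mini-batches, Adadelta updates, and dropout (already inside the model sketch above). Batch size, epoch count, optimizer settings, and the exact split strategy are assumptions:

```python
# Sketch of the training loop described above (hyperparameters are assumptions).
# word_ids: LongTensor (N, max_length); extra_feats: FloatTensor (N, n_extra);
# labels: LongTensor (N,).
import torch
import torch.nn.functional as F

def train(model, word_ids, extra_feats, labels, epochs=25, batch_size=50):
    opt = torch.optim.Adadelta(model.parameters(), rho=0.95, eps=1e-6)
    n = word_ids.size(0)
    n_train = int(0.9 * n)                      # 90% used for training each epoch
    for epoch in range(epochs):
        perm = torch.randperm(n)                # shuffle before batching
        train_idx, dev_idx = perm[:n_train], perm[n_train:]
        model.train()                           # enables dropout
        for start in range(0, n_train, batch_size):
            batch = train_idx[start:start + batch_size]
            logits = model(word_ids[batch], extra_feats[batch])
            loss = F.cross_entropy(logits, labels[batch])
            opt.zero_grad()
            loss.backward()
            opt.step()
        model.eval()                            # dropout off for evaluation
        with torch.no_grad():
            dev_pred = model(word_ids[dev_idx], extra_feats[dev_idx]).argmax(dim=1)
            dev_acc = (dev_pred == labels[dev_idx]).float().mean().item()
        print(f"epoch {epoch}: dev acc {dev_acc:.3f}")
```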