- A modification of "Siamese Recurrent Architectures for Learning Sentence Similarity"
- Simplified version
- Char-CNN & MLP for Siamese Networks
Data
- In the data, each line holds two questions separated by a tab character ('\t')
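As a minimal sketch of reading this format, the snippet below splits each line on the first tab into a question pair. The function name `parse_question_pairs` and the sample sentences are illustrative assumptions, not part of this repository; whether the real data also carries a label column is not stated here, so none is assumed.

```python
from typing import Iterable, List, Tuple


def parse_question_pairs(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Split each non-empty line into a (question_a, question_b) pair."""
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only; any further tabs stay in the second question.
        first, sep, second = line.partition("\t")
        if not sep:
            raise ValueError(f"expected a tab-separated pair, got: {line!r}")
        pairs.append((first.strip(), second.strip()))
    return pairs


# Hypothetical sample lines, used only to show the expected shape of the data.
sample = [
    "오늘 날씨 어때?\t오늘 날씨 알려줘\n",
    "\n",
    "How are you?\tHow do you do?\n",
]
pairs = parse_question_pairs(sample)
```

Raising on a line without a tab is a design choice for the sketch: silently dropping malformed lines would hide data problems before training.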
Preprocessing
- Character level: phoneme (jamo) or syllable (eumjeol)
- Digits and special characters are included
- For eumjeol (syllable) input, use the 2350 frequent syllables
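The two character levels can be sketched as follows, assuming standard Unicode arithmetic for decomposing precomposed Hangul syllables (U+AC00 to U+D7A3) into jamo. The names `to_units` and `encode`, the padding and unknown ids, and the vocabulary argument are hypothetical; the actual list of 2350 syllables is not reproduced here, so the vocabulary is left as a parameter.

```python
from typing import Dict, List

# Compatibility jamo in Unicode decomposition order.
CHOSEONG = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")           # 19 initial consonants
JUNGSEONG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")        # 21 vowels
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals, "" = none

HANGUL_BASE, HANGUL_LAST = 0xAC00, 0xD7A3


def to_units(text: str, eumjeol: bool = False) -> List[str]:
    """Return syllables as-is (eumjeol=True) or decompose them into jamo."""
    units: List[str] = []
    for ch in text:
        code = ord(ch)
        if eumjeol or not (HANGUL_BASE <= code <= HANGUL_LAST):
            units.append(ch)  # digits, specials, and whole syllables pass through
            continue
        index = code - HANGUL_BASE
        units.append(CHOSEONG[index // 588])          # 588 = 21 vowels * 28 finals
        units.append(JUNGSEONG[(index % 588) // 28])
        tail = JONGSEONG[index % 28]
        if tail:
            units.append(tail)
    return units


def encode(units: List[str], vocab: Dict[str, int], max_len: int,
           pad_id: int = 0, unk_id: int = 1) -> List[int]:
    """Map units to ids, truncate to max_len, and right-pad with pad_id."""
    ids = [vocab.get(u, unk_id) for u in units[:max_len]]
    return ids + [pad_id] * (max_len - len(ids))


jamo_units = to_units("한글 1!")
syllable_units = to_units("한글 1!", eumjeol=True)
```

In syllable mode, any syllable outside the chosen vocabulary would fall back to `unk_id` in this sketch; how the repository handles such characters is not specified here.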
Configuration
main.py
: Main run file
--epochs
: Number of training epochs
--batch
: Batch size
--lr
: Learning rate
--strmaxlen
: Maximum string length
--charsize
: Vocabulary size
filter_num
: Number of filters per CNN layer
--emb
: Embedding dimension
--eumjeol
: Use eumjeol (syllable-level) input if specified
threshold
: Threshold for deciding whether two questions are similar
--model
: Model selection (CNN, MLP)
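The options above could be declared with `argparse` roughly as in the sketch below. This is not the actual contents of 'main.py': every default value is a placeholder, the types are guesses from the descriptions, and `filter_num` and `threshold` are assumed to be `--` flags like the others.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options; defaults are placeholders."""
    parser = argparse.ArgumentParser(description="Siamese Char-CNN / MLP (sketch)")
    parser.add_argument("--epochs", type=int, default=10, help="number of training epochs")
    parser.add_argument("--batch", type=int, default=64, help="batch size")
    parser.add_argument("--lr", type=float, default=0.001, help="learning rate")
    parser.add_argument("--strmaxlen", type=int, default=100, help="maximum string length")
    parser.add_argument("--charsize", type=int, default=2500, help="vocabulary size")
    parser.add_argument("--filter_num", type=int, default=64, help="filters per CNN layer")
    parser.add_argument("--emb", type=int, default=16, help="embedding dimension")
    parser.add_argument("--eumjeol", action="store_true",
                        help="use syllable-level input instead of phoneme-level")
    parser.add_argument("--threshold", type=float, default=0.5,
                        help="similarity threshold for the similar / not similar decision")
    parser.add_argument("--model", choices=["CNN", "MLP"], default="CNN",
                        help="model selection")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--model", "MLP", "--eumjeol", "--lr", "0.01"])
```

Using `store_true` for `--eumjeol` matches the "if specified" wording: the flag takes no value and defaults to phoneme-level input when omitted.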
- Set the FC layers and CNN layers in 'main.py'
- Run 'main.py' with the arguments you need
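To illustrate how the threshold option might be applied, the sketch below scores two encoded questions with the exponentiated negative Manhattan distance used in the Siamese recurrent architectures paper and compares the score with a threshold. Whether this repository uses that exact similarity is an assumption; the function names and the example vectors are hypothetical.

```python
import math
from typing import Sequence


def manhattan_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """exp(-L1 distance): 1.0 for identical vectors, approaching 0 as they diverge."""
    if len(a) != len(b):
        raise ValueError("both vectors must have the same length")
    return math.exp(-sum(abs(x - y) for x, y in zip(a, b)))


def is_similar(a: Sequence[float], b: Sequence[float], threshold: float = 0.5) -> bool:
    """Label a pair as similar when its score reaches the threshold (assumed rule)."""
    return manhattan_similarity(a, b) >= threshold


# Hypothetical encoder outputs for two question pairs.
close_pair = is_similar([0.0, 0.0], [0.1, 0.1], threshold=0.5)
far_pair = is_similar([0.0, 0.0], [2.0, 2.0], threshold=0.5)
```

Because the score is bounded in (0, 1], a single threshold in that range is enough to turn it into a binary decision.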