transfromer_NN_Block

We are doing this to implemented transformer as a neural network building block to overcome several task in NLP research, this rep follow the raw paper realization of Attention Is All You Need.

This rep achieved Several tasks:

The seq2seq text generation, we try to implemented transformer solve a conventional problem in NLP - Words segmentation(Chinese).
The NMT problem track on Chinese-English machine translation with WIT3 datasets.
The language model encoder architecture for Text-classfication.
The sentence entailement task experiment with stanford SNLI datasets(Natural language Inference).
Updated reading comprehension task.

INSTALL ENV:

Please run pip install -r requirements.txt first.

First- the encoder-decoder architectures.

train

-The aim is train a sequence labeling model with Transformer. We follow the conventional sentence tokenize method - /B/E/S/M (represent the word begin/end/single word/in the middle respectively).

We used some labeled chinese Ducuments to train my model. The raw data presented in the ./transformer_jieba/dataset dir. Or you may want use the ./transformer_jieba/prepro.py to preprocess the raw data.
Just use the python train.py to train the model.

eval

Run python eval.py, We achieved the BLEU score nearly 80.

Second - zh-en NMT

the train and test data was from Web Inventory of Transcribed and Translated Talks-WIT3, we train a model for English-Chinese translation model(data source).
test Result:

Third - the transformer feature extraction block

you may find the code in ./transformer_text_Classfication, codes about preprocessing and training as well as evaluation locate in this path. And the wrappers usage are similar to encoder-decoder architecture.
The chinese corpus was downloaded from THUCTC(THU Chinese Text Classification), and we show better macro avg f1-score with over 0.05.
Our model is very raw and shallow(only 8 multi-head attention projection and final linear projection) and without pre-trained embedding, you can explore performance with our code.

result of chinese sentences classfication(char-level)

tagging = {'时尚':0, '教育':1, '时政':2, '体育':3, '游戏':4, '家居':5, '科技':6, '房产':7, '财经':8, '娱乐':9}

              precision    recall  f1-score   support

           0       0.91      0.95      0.93      1000
           1       0.96      0.77      0.85      1000
           2       0.92      0.93      0.92      1000
           3       0.95      0.93      0.94      1000
           4       0.86      0.91      0.88      1000
           5       0.83      0.47      0.60      1000
           6       0.86      0.85      0.86      1000
           7       0.64      0.87      0.74      1000
           8       0.79      0.91      0.85      1000
           9       0.88      0.91      0.89      1000

    accuracy                           0.85     10000
   macro avg       0.86      0.85      0.85     10000
weighted avg       0.86      0.85      0.85     10000

Done

We Also implemented a sentences entailment inference task with transformer

Data source standord SNLI

Download source data and unzip : wget https://nlp.stanford.edu/projects/snli/snli_1.0.zip && unzip snli_1.0.zip
preprocess data: python data_prepare.py && python prepro.py
train: run python train.py
eval: run python eval.py --task infersent

Experiment result:

train accuracy:
train loss:
eval result:

              precision    recall  f1-score   support

           0       0.82      0.76      0.79      3358
           1       0.77      0.80      0.79      3226
           2       0.70      0.73      0.72      3208

    accuracy                           0.76      9792
   macro avg       0.76      0.76      0.76      9792
weighted avg       0.76      0.76      0.76      9792

Ref

https://github.com/Kyubyong/transformer
Attention Is All You Need.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Files

README.md

README.md

transfromer_NN_Block

INSTALL ENV:

First- the encoder-decoder architectures.

train

eval

Second - zh-en NMT

Third - the transformer feature extraction block

result of chinese sentences classfication(char-level)

We Also implemented a sentences entailment inference task with transformer

Ref

Collapse file tree

Files

README.md

Latest commit

History

README.md

File metadata and controls

transfromer_NN_Block

INSTALL ENV:

First- the encoder-decoder architectures.

train

eval

Second - zh-en NMT

Third - the transformer feature extraction block

result of chinese sentences classfication(char-level)

We Also implemented a sentences entailment inference task with transformer

Ref