This repository contains the code for our COLING 2018 paper:

Dynamic Multi-Level Multi-Task Learning for Sentence Simplification.

Data Preprocessing

Please follow the instructions from Zhang et al. 2017 to download the pre-processed dataset. To build the .bin files, follow the instructions from See et al. 2017, or here.

Evaluation Set-Up

Please follow the instructions from Zhang et al. 2017 to set up the evaluation system.
FKGL implementations can be found in this repo.
Modify the corresponding directories in evaluation_utils/sentence_simplification.py.
Please note that evaluation metrics are calculated at the corpus level.
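
To illustrate what "corpus level" means for a metric such as FKGL, here is a minimal sketch that accumulates word, sentence, and syllable counts over the whole output file and applies the standard Flesch-Kincaid formula once, instead of averaging per-sentence scores. The function names `count_syllables` and `corpus_fkgl` are hypothetical, and the vowel-group syllable counter is a rough heuristic assumed for illustration only. It is not the implementation linked above, so the scores it produces will not match that implementation exactly.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def corpus_fkgl(sentences):
    """Compute FKGL once from counts pooled over all sentences.

    `sentences` is a list of strings, one system output sentence each.
    """
    total_words = 0
    total_syllables = 0
    for sentence in sentences:
        words = re.findall(r"[A-Za-z]+", sentence)
        total_words += len(words)
        total_syllables += sum(count_syllables(w) for w in words)
    if not sentences or total_words == 0:
        raise ValueError("need at least one sentence containing a word")
    words_per_sentence = float(total_words) / len(sentences)
    syllables_per_word = float(total_syllables) / total_words
    # Standard Flesch-Kincaid Grade Level formula.
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    outputs = ["The cat sat.", "A dog ran."]
    print(corpus_fkgl(outputs))
```

Pooling the counts first means long sentences weigh more than short ones, which is why a corpus-level score generally differs from the mean of sentence-level scores.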

Dependencies

python 2.7
tensorflow 1.4

Usage

CUDA_VISIBLE_DEVICES="GPU_ID" python run.py \
    --mode "string" \
    --vocab_path "/path/to/vocab/file" \
    --train_data_dirs "/path/to/training/data_1,/path/to/training/data_2,/path/to/training/data_3" \
    --val_data_dir "/path/to/validation/data_1" \
    --decode_data_dir "/path/to/decode/data_1" \
    --eval_source_dir "/path/to/validation/data_1.source" \
    --eval_target_dir "/path/to/validation/data_1.target" \
    --max_enc_steps "int" --max_dec_steps "int" --batch_size "int" --steps_per_eval "int" \
    --log_root "/path/to/log/root/" --exp_name "string" [--autoMR] \
    --lr "float" --beam_size "int" --soft_sharing_coef "float" --mixing_ratios "mr_1,mr_2" \
    --decode_ckpt_file "/path/to/ckpt" --decode_output_file "/path/to/file"

Pretrained models can be found here.
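
Several of the flags above take comma-separated values (`--train_data_dirs`, `--mixing_ratios`). As a rough sketch of how such values could be split and turned into an alternating mini-batch schedule, the snippet below assumes, purely for illustration, one integer ratio per task. The helper names `parse_csv_flag` and `build_schedule` and the task names are hypothetical and are not taken from run.py, whose actual handling of these flags may differ (the command above lists three training directories but only two mixing ratios).

```python
import itertools


def parse_csv_flag(value, cast=str):
    """Split a comma-separated flag value and cast each non-empty item."""
    return [cast(item.strip()) for item in value.split(",") if item.strip()]


def build_schedule(task_names, ratios):
    """Return one round of task names, each repeated `ratio` times.

    Assumption: a ratio of 3:1 means three mini-batches of the first
    task followed by one mini-batch of the second.
    """
    if len(task_names) != len(ratios):
        raise ValueError("need exactly one ratio per task")
    schedule = []
    for name, ratio in zip(task_names, ratios):
        schedule.extend([name] * ratio)
    return schedule


if __name__ == "__main__":
    tasks = parse_csv_flag("simplification, entailment")  # hypothetical names
    ratios = parse_csv_flag("3,1", cast=int)
    one_round = build_schedule(tasks, ratios)
    # Cycle through the round to pick the task for each training step.
    for step, task in zip(range(8), itertools.cycle(one_round)):
        print(step, task)
```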

Citation

@inproceedings{guo2018dynamic,
    title = {Dynamic Multi-Level Multi-Task Learning for Sentence Simplification},
    author = {Han Guo and Ramakanth Pasunuru and Mohit Bansal},
    booktitle = {Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018)},
    year = {2018}
}
