Partially Shuffling the Training Data to Improve Language Models

This repository contains the code for the Partial Shuffle method and a modified version of the DOC language model that uses it.

To run the DOC + Partial Shuffle models, use the same commands as for the original DOC model.

The Partial Shuffle method itself is implemented in partial_shuffle.py. To use it in your own language model, import partial_shuffle.py and call it on the training data before each epoch, as in line 196 of main.py. No other modifications are required.
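As a rough illustration of the "call it before each epoch" pattern, here is a minimal sketch in plain Python. It is not the code in partial_shuffle.py: it assumes the training data is already split into batch rows (lists of token ids) and that a partial shuffle rotates each row by an independently drawn random offset, which keeps most of the within-row token order intact. The names `partial_shuffle_sketch`, `train_rows`, and `train_one_epoch` are hypothetical.

```python
import random


def partial_shuffle_sketch(rows, rng=random):
    """Return a copy of `rows` with each row rotated by a random offset.

    Assumption: `rows` is a list of batch rows, each a list of token ids.
    This is an illustrative sketch, not the repository's implementation.
    """
    shuffled = []
    for row in rows:
        if not row:
            # Nothing to rotate in an empty row.
            shuffled.append([])
            continue
        # Pick a split point, then move the tokens before it to the end.
        shift = rng.randrange(len(row))
        shuffled.append(row[shift:] + row[:shift])
    return shuffled


# Hypothetical training loop: re-shuffle the original rows before each epoch.
train_rows = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
for epoch in range(3):
    epoch_rows = partial_shuffle_sketch(train_rows)
    # train_one_epoch(model, epoch_rows)  # hypothetical training step
```

The sketch returns new rows rather than modifying its input, so each epoch starts from the original ordering. In a PyTorch model the same idea would operate on the batched tensor instead of Python lists.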

Reference

If you find this code useful, please cite the following paper:

@article{press2019partially,
  title={Partially Shuffling the Training Data to Improve Language Models},
  author={Press, Ofir},
  journal={arXiv preprint arXiv:1903.04167},
  year={2019}
}
