Releases · retarfi/language-pretraining
v2.2.1
- Able to select the SentencePiece algorithm (see the sketch after this list)
- Able to use multiprocessing in create_datasets.py
- Move ELECTRA model file into models directory
- Add DeBERTaV3 (alpha) implementation
- This implementation backpropagates through the generator and discriminator at the same time
- In my experiments, models trained with this implementation perform worse than those trained with my DeBERTaV2 implementation
- The implementation therefore needs improvement, but I don't have time to put effort into it.
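The repository's own command-line options for these features are documented in README.md; the following is only a minimal sketch of the underlying library calls, with placeholder paths and values, showing how a SentencePiece algorithm is chosen and how dataset preprocessing can be parallelized.

```python
# Minimal sketch of the underlying library calls; paths and values are placeholders,
# not the actual arguments of the repository's scripts.
import sentencepiece as spm
from datasets import load_dataset

# SentencePiece selects the segmentation algorithm via model_type
# ("unigram" or "bpe", among others).
spm.SentencePieceTrainer.train(
    input="corpus.txt",        # placeholder corpus path
    model_prefix="spm_ja",     # placeholder output prefix
    vocab_size=32000,
    model_type="unigram",      # or "bpe"
)

# Hugging Face datasets parallelizes preprocessing with num_proc,
# which is the kind of multiprocessing create_datasets.py can now use.
dataset = load_dataset("text", data_files="corpus.txt", split="train")
dataset = dataset.map(
    lambda batch: {"n_chars": [len(t) for t in batch["text"]]},
    batched=True,
    num_proc=4,
)
```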
v2.2.0
The main changes are as follows:
- jptranstokenizer is now used for tokenization (see the sketch after this list)
- It enables other word tokenizers such as Juman++, Sudachi, and spaCy LUW.
- Move from requirements.txt to pyproject.toml
- This is unstable, especially the PyTorch part, and should be changed according to your own environment.
- If you get an error in run_pretraining.py, it may be due to pydantic. Updating pydantic to the latest version may solve the problem, even though the declared version constraints do not match.
- Add Pre-mask option (see the sketch after these notes)
- To use this option, please specify --mask_style and use the --is_dataset_masked option in run_pretraining.py.
- Add DeBERTa and DeBERTaV2
- Change license from Apache 2.0 to MIT
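As a rough illustration of the tokenizer change: jptranstokenizer exposes Japanese word-level tokenizers behind a transformers-compatible interface. The snippet below is a minimal sketch; the model name is only an example, and the options for selecting word tokenizers such as Juman++ or Sudachi are described in the jptranstokenizer documentation.

```python
# Minimal sketch, assuming jptranstokenizer is installed (pip install jptranstokenizer).
# The model name is only an example; see the jptranstokenizer documentation for how
# to configure word tokenizers such as Juman++, Sudachi, or spaCy LUW.
from jptranstokenizer import JapaneseTransformerTokenizer

tokenizer = JapaneseTransformerTokenizer.from_pretrained("izumi-lab/bert-small-japanese")
print(tokenizer.tokenize("これはトークナイザのテストです。"))
```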
There are more detailed changes.
Please read README.md.
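For background on the Pre-mask option above: pre-masking means the masked-token corruption is applied once when the dataset is built, rather than on the fly during training. The snippet below is only a conceptual sketch using the transformers masking collator, not this repository's implementation; the tokenizer name and masking probability are placeholders.

```python
# Conceptual sketch of pre-masking (not this repository's implementation):
# apply MLM masking once, at dataset-creation time, and store the result,
# instead of masking dynamically inside the training loop.
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder tokenizer
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

encoding = tokenizer("A sentence to be pre-masked once and stored.", return_tensors="pt")
masked = collator([{"input_ids": encoding["input_ids"][0]}])
print(masked["input_ids"])  # ids with some positions randomly replaced by [MASK]
print(masked["labels"])     # original ids at masked positions, -100 elsewhere
```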
v2.1.0
- Add RoBERTa and DeBERTa architectures (not confirmed, only added) (8fce71d9d14f974f445f10b632d4c57dd984ee5a)
- Update dataset construction
- Reduce memory waste during pre-training (358aa61087c6712fa0c85afc35892a4d2f862a9e and 7fa51594249fe206e8aa059c94bf7eee130825f9)
- More efficient line-by-line processing (3bb47ea8f446e0e33a5e82143bd2f7393867e192)
v2.0.0
Apply Hugging Face's datasets library
https://github.com/retarfi/language-pretraining/tree/336c3699679dd59be788acc21f83188efa76b95b
New features:
- Apply datasets library
- You need to run create_datasets.py before running run_pretraining.py (see the sketch after these notes)
- Check README.md#Create Dataset for how to run create_datasets.py
- Log the losses of the ELECTRA discriminator and generator
- Additional pre-training from a checkpoint is available
- Check README.md#Additional Pre-training for detailed settings
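As a rough illustration of the two-step workflow: datasets built by create_datasets.py can be serialized once and later reloaded memory-mapped by the training script, so the whole corpus does not have to fit in RAM. The snippet below sketches only the underlying datasets-library pattern; paths are placeholders, and the actual arguments of create_datasets.py and run_pretraining.py are described in README.md.

```python
# Sketch of the datasets-library pattern behind the two-step workflow;
# paths are placeholders, not the scripts' actual arguments.
from datasets import load_dataset, load_from_disk

# Step 1 (create_datasets.py side): build and serialize the dataset once.
raw = load_dataset("text", data_files="corpus.txt", split="train")
raw.save_to_disk("dataset_dir")

# Step 2 (run_pretraining.py side): reload it memory-mapped for training,
# including when resuming additional pre-training from a checkpoint.
dataset = load_from_disk("dataset_dir")
print(dataset)
```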