
Training Scripts masked LM #263

Open
friesel opened this issue Aug 10, 2021 · 5 comments

Comments

@friesel

friesel commented Aug 10, 2021

Do you intend to publish the training scripts for the masked LM as well?

@diegolascasas
Contributor

Hi, can you specify which project you're directing your question to?

@friesel
Author

friesel commented Aug 11, 2021

Sorry, my question is directed at the Perceiver IO project team.

In the NLP world, pretrained models are often English-only or cover "all the world's languages". Many users, however, need inference in a specific non-English language and have one or two GPUs rather than TPU pods, so for them it is most efficient to pretrain only in the language they actually need inference in. For both pretraining and fine-tuning, it would therefore be great to have the scripts you used to pretrain the masked LM.

Thx

@fding
Collaborator

fding commented Aug 12, 2021

Hi, thanks for your interest in Perceiver IO. We do not plan on open-sourcing the training scripts for the masked LM, because the script is heavily tied to our internal infrastructure for training these models at scale. We have released an example training pipeline for ImageNet, as well as the exact configuration we used for language modeling from bytes (in the language modeling colab), which will hopefully be useful if you wish to train a new language model from scratch for other languages.

Do let us know if you have any further questions or if you encounter any issues trying to replicate our work!
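For anyone adapting the released pipeline to byte-level masked-LM pretraining, the data side reduces to BERT-style corruption of raw UTF-8 byte IDs. Below is a minimal, hypothetical NumPy sketch of batch preparation; the vocabulary layout, mask rate, and function name are assumptions for illustration, not the actual colab or training-script code.

```python
# Hypothetical sketch (not the actual Perceiver IO colab / training code):
# BERT-style masking over raw UTF-8 bytes for masked-LM pretraining
# in a single target language. Vocab layout and mask rate are assumptions.
import numpy as np

PAD_ID, MASK_ID = 256, 257      # 0-255 are raw bytes; two extra special IDs (assumed)
MASK_RATE = 0.15                # BERT-style masking fraction (assumed)

def make_mlm_batch(texts, seq_len=2048, seed=0):
    """Encode strings to UTF-8 byte IDs and mask a random subset of positions."""
    rng = np.random.default_rng(seed)
    inputs = np.full((len(texts), seq_len), PAD_ID, dtype=np.int32)
    targets = inputs.copy()
    loss_mask = np.zeros((len(texts), seq_len), dtype=bool)
    for i, text in enumerate(texts):
        ids = np.frombuffer(text.encode("utf-8")[:seq_len], dtype=np.uint8)
        targets[i, :len(ids)] = ids
        inputs[i, :len(ids)] = ids
        masked = rng.random(len(ids)) < MASK_RATE   # positions the loss is computed on
        inputs[i, :len(ids)][masked] = MASK_ID      # corrupt the masked positions
        loss_mask[i, :len(ids)] = masked
    return inputs, targets, loss_mask

# Example: build one batch from non-English text
inputs, targets, loss_mask = make_mlm_batch(["Dies ist ein Beispielsatz."])
```

The model's cross-entropy loss would then be computed only where loss_mask is True, with the original byte IDs in targets as labels.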

@friesel
Author

friesel commented Aug 16, 2021

Thanks for the guidance. I will get my head around the ImageNet pipeline and try to adapt it to the NLP case.

@codedecde

Hi @fding,
Would it be possible to share some of the TensorBoard logs for the byte-level LM pretraining, and/or specifics on the final MLM loss the models converge to (something similar to google-research/electra#3)? I am trying to replicate the byte-level experiments, so these logs would be really useful as a reference.
Thank you!

diegolascasas removed their assignment Oct 18, 2021