A modified version of the Wiki-BERT pipeline that uses additional data alongside the Wikipedia dump.
It also integrates OpusFilter and generates pretraining data for both ELECTRA and BERT.