
Support using external data, OpusFilter filtering and ELECTRA pretraining data

@jbrry released this 02 Aug 17:04
· 2 commits to external_data since this release

The Wiki-BERT pipeline has been modified to use additional external data alongside data from the Wikipedia dump.

This release also integrates OpusFilter for filtering and generates ELECTRA pretraining data in addition to BERT pretraining data.