XTREME is a benchmark for the evaluation of the cross-lingual generalization ability of pre-trained multilingual models that covers 40 typologically diverse languages and includes nine tasks.

blazejdolicki/xtreme

 
 

XTREME with many source languages

This is a fork of the original XTREME repository, used for my bachelor thesis. We adjust the code so that source languages other than English can be used for three downstream tasks: UD POS, Panx (NER) and XNLI. Additionally, we make another dataset, CLS+ (sentiment analysis), compatible with this benchmark.

Setup

Clone this repo and follow the installation instructions from the original repository.

UD POS, Panx (NER) and XNLI

To train and evaluate models, use the same commands as in the original repository (for example, >> bash scripts/train.sh xlm-roberta-large udpos). To select the training and testing languages, change the TRAIN_LANGS and PRED_LANGS variables in train_udpos.sh, train_panx.sh or train_xnli.sh, depending on which task you want to run. By default, we use all available languages for a given dataset.
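
As a rough illustration of the language selection described above, the sketch below sets the two variables and checks how many languages each contains before a run. The comma-separated format, the language codes, and the count_langs helper are assumptions made for this example and are not taken from the scripts. Check train_udpos.sh, train_panx.sh or train_xnli.sh for the format they actually expect.

```shell
#!/bin/sh
# Hypothetical values: the comma-separated format and the language codes are
# assumptions for illustration, not copied from the training scripts.
TRAIN_LANGS="de,nl"        # source languages to fine-tune on
PRED_LANGS="en,de,nl,pl"   # languages to predict and evaluate on

# Hypothetical helper: count the entries in a comma-separated list.
count_langs() {
  old_ifs=$IFS
  IFS=','
  # Unquoted on purpose: split the list on commas into positional parameters.
  set -- $1
  IFS=$old_ifs
  echo "$#"
}

echo "Training languages:   $TRAIN_LANGS ($(count_langs "$TRAIN_LANGS"))"
echo "Prediction languages: $PRED_LANGS ($(count_langs "$PRED_LANGS"))"
```

A check like this can help catch an empty or mistyped list before starting a long fine-tuning run.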

CLS+

To run experiments on CLS+ with the XLM-R (Large) model, execute:

>> bash scripts/train_cls.sh xlm-roberta-large
