GitHub - ixa-ehu/ner-evaluation-corpus-europarl: Manually annotated test set from Europarl for Named Entity Recognition

Evaluation Corpus for Named Entity Recognition using Europarl

This repository contains a gold-standard test set created from the Europarl corpus. The test set consists of 799 sentences, manually annotated with four entity types following the CoNLL 2002 and 2003 guidelines, for each of four languages: English, German, Italian and Spanish.

To obtain the final 799 annotated sentences for each language, the first 2000 sentences of the Europarl corpus were used for each of the languages. The word alignments were obtained via Giza++, which was trained on the rest of the Europarl corpus (i.e., Europarl minus the first 2000 sentences).

If you use this corpus for your research, please cite the following publication:

Rodrigo Agerri, Yiling Chung, Itziar Aldabe, Nora Aranberri, Gorka Labaka and German Rigau (2018). Building Named Entity Recognition Taggers via Parallel Corpora. In Proceedings of the 11th Language Resources and Evaluation Conference (LREC 2018), 7-12 May, 2018, Miyazaki, Japan.

You should also consider citing the original Europarl publication:

Philipp Koehn (2005). Europarl: A Parallel Corpus for Statistical Machine Translation. In MT Summit 2005.

This evaluation corpus was manually annotated by Nora Aranberri.

License

We follow the original Europarl terms of use, which state: "We are not aware of any copyright restrictions of the material." For more details, please visit http://www.statmt.org/europarl/

Files in this repository:

README.md
de-europarl.test.conll02
en-europarl.test.conll02
es-europarl.test.conll02
it-europarl.test.conll02
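
The test files can be loaded with a few lines of Python. The following is a minimal sketch, not part of the repository: it assumes the files use the usual CoNLL 2002 column layout (one token per line, the entity tag in the last whitespace-separated column, and a blank line between sentences) and that tags look like `B-PER`, `I-ORG` or `O`. The function names `read_conll` and `entity_type_counts` are hypothetical helpers introduced here for illustration, and the sample sentence in the demo is made up rather than taken from the corpus.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# A sentence is a list of (token, tag) pairs.
Sentence = List[Tuple[str, str]]


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group CoNLL-style lines into sentences.

    Assumes the token is the first column, the tag is the last column,
    and sentences are separated by blank lines.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split()
        current.append((fields[0], fields[-1]))
    if current:
        sentences.append(current)
    return sentences


def entity_type_counts(sentences: Iterable[Sentence]) -> Counter:
    """Count entity mentions per type.

    A mention starts at a B- tag, or at an I- tag whose type differs from
    the previous token's type, so both common BIO variants are handled.
    """
    counts: Counter = Counter()
    for sentence in sentences:
        prev_type = None
        for _, tag in sentence:
            if "-" not in tag:  # "O" or any unprefixed tag
                prev_type = None
                continue
            prefix, entity_type = tag.split("-", 1)
            if prefix == "B" or entity_type != prev_type:
                counts[entity_type] += 1
            prev_type = entity_type
    return counts


if __name__ == "__main__":
    # Made-up sample in the assumed layout, not taken from the corpus.
    sample = [
        "The O", "European B-ORG", "Parliament I-ORG",
        "met O", "in O", "Strasbourg B-LOC", "",
    ]
    parsed = read_conll(sample)
    print(len(parsed), dict(entity_type_counts(parsed)))
    # For a real file (the UTF-8 encoding is an assumption):
    # with open("en-europarl.test.conll02", encoding="utf-8") as f:
    #     sentences = read_conll(f)
```

If the layout assumption holds, `len(read_conll(f))` on each of the four files gives a quick check against the 799 sentences stated above.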
