A Diachronic Italian Corpus based on l’Unità

This repository contains data and code for a diachronic Italian corpus based on l’Unità. The corpus was built from the digital archive of the newspaper "L’Unità" and was automatically cleaned and annotated with PoS tags, lemmas, named entities and syntactic dependencies.

Details about the corpus are reported in the following paper:

Pierpaolo Basile, Annalina Caputo, Tommaso Caselli, Pierluigi Cassotti, Rossella Varvara. A Diachronic Italian Corpus based on "L’Unità". Proceedings of the 7th Italian Conference on Computational Linguistics (CLiC-it 2020), 2020. CEUR.org.

Please cite the paper if you use our corpus.

The corpus is available here.
