stephantul edited this page Feb 21, 2020 · 3 revisions

The pattern.it module contains a fast part-of-speech tagger for Italian (identifies nouns, adjectives, verbs, etc. in a sentence) and tools for Italian verb conjugation and noun singularization & pluralization.

It can be used by itself or with other pattern modules: web | db | en | search | vector | graph.


Documentation

The functions in this module take the same parameters and return the same values as their counterparts in pattern.en. Refer to the documentation there for more details.  

Gender

Italian nouns and adjectives inflect according to gender. The gender() function predicts the gender (MALE or FEMALE, optionally combined with PLURAL) of a given noun with about 92% accuracy:

>>> from pattern.it import gender, MALE, FEMALE, PLURAL
>>> print(gender('gatti'))

(MALE, PLURAL)

Article

The article() function returns the article (INDEFINITE or DEFINITE) inflected by gender (e.g., il gatto → i gatti).

>>> from pattern.it import article, DEFINITE, MALE, PLURAL
>>> print(article('gatti', DEFINITE, gender=(MALE, PLURAL)))

i
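
To make "inflected by gender" concrete, here is a minimal pure-Python sketch of how a definite article could be chosen from gender and number. It is not how pattern.it implements article(); the helper `toy_definite_article()` and the `MALE`/`FEMALE` stand-in constants are hypothetical and cover only the textbook rules.

```python
# Toy illustration only: NOT pattern.it's implementation of article().
MALE, FEMALE = "m", "f"   # stand-in constants for this sketch
VOWELS = "aeiou"

def toy_definite_article(noun, gender=MALE, plural=False):
    """Pick an Italian definite article using textbook rules (hypothetical helper)."""
    word = noun.lower()
    vowel_initial = word[0] in VOWELS
    if gender == FEMALE:
        if plural:
            return "le"
        return "l'" if vowel_initial else "la"
    # Masculine nouns take lo/gli before z, gn, ps, x, y or s + consonant.
    special = word.startswith(("z", "gn", "ps", "x", "y")) or (
        word.startswith("s") and len(word) > 1 and word[1] not in VOWELS)
    if plural:
        return "gli" if (vowel_initial or special) else "i"
    if vowel_initial:
        return "l'"
    return "lo" if special else "il"

print(toy_definite_article("gatti", MALE, plural=True))   # i
print(toy_definite_article("studente", MALE))             # lo
print(toy_definite_article("case", FEMALE, plural=True))  # le
```

The real function also handles indefinite articles, which this sketch leaves out.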

Noun singularization & pluralization

For Italian nouns there is singularize() and pluralize(). The implementation is slightly less robust than the English version (accuracy 84% for singularization and 93% for pluralization).

>>> from pattern.it import singularize, pluralize
>>>  
>>> print(singularize('gatti'))
>>> print(pluralize('gatto'))

gatto
gatti 

Verb conjugation

For Italian verbs there is conjugate(), lemma(), lexeme() and tenses(). The lexicon for verb conjugation contains about 1,250 common Italian verbs, mined from Wiktionary. For unknown verbs it will fall back to a rule-based approach with an accuracy of about 86%. 
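
The lexicon-then-rules strategy described above can be sketched in a few lines. This is a toy under stated assumptions: `TOY_LEXICON`, `TOY_SUFFIX_RULES` and `toy_lemma()` are hypothetical names, and neither the entries nor the rules are taken from pattern.it.

```python
# Toy illustration only: neither the lexicon nor the rules come from pattern.it.
TOY_LEXICON = {"sono": "essere", "era": "essere", "fu": "essere"}

# (suffix, replacement) pairs for regular -are verbs; deliberately incomplete.
TOY_SUFFIX_RULES = [("ando", "are"), ("ato", "are"), ("ava", "are")]

def toy_lemma(verb):
    """Return the infinitive: lexicon first, then suffix rules, else the input."""
    verb = verb.lower()
    if verb in TOY_LEXICON:
        return TOY_LEXICON[verb]
    for suffix, replacement in TOY_SUFFIX_RULES:
        if verb.endswith(suffix):
            return verb[:-len(suffix)] + replacement
    return verb

print(toy_lemma("fu"))       # essere (found in the lexicon)
print(toy_lemma("parlato"))  # parlare (rule-based fallback)
```

Irregular verbs such as essere cannot be recovered by suffix rules, which is why a lexicon lookup comes first.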

Italian verbs have more tenses than English verbs. In particular, the plural differs for each person, and there are additional forms for the FUTURE tense, the IMPERATIVE, CONDITIONAL and SUBJUNCTIVE mood and the PERFECTIVE aspect:

>>> from pattern.it import conjugate
>>> from pattern.it import INFINITIVE, PRESENT, PAST, SG, SUBJUNCTIVE, PERFECTIVE
>>>  
>>> print(conjugate('sono', INFINITIVE))
>>> print(conjugate('sono', PRESENT, 1, SG, mood=SUBJUNCTIVE))
>>> print(conjugate('sono', PAST, 3, SG))
>>> print(conjugate('sono', PAST, 3, SG, aspect=PERFECTIVE))

essere
sia
era 
fu   

For PAST tense + PERFECTIVE aspect we can also use PRETERITE (passato remoto). For PAST tense + IMPERFECTIVE aspect we can also use IMPERFECT (imperfetto).

>>> from pattern.it import conjugate
>>> from pattern.it import IMPERFECT, PRETERITE
>>>  
>>> print(conjugate('sono', IMPERFECT, 3, SG))
>>> print(conjugate('sono', PRETERITE, 3, SG))

era
fu   

The conjugate() function takes the following optional parameters:

Tense Person Number Mood Aspect Alias Example
INFINITIVE None None None None "inf" essere
PRESENT 1 SG INDICATIVE IMPERFECTIVE "1sg" io __sono__
PRESENT 2 SG INDICATIVE IMPERFECTIVE "2sg" tu __sei__
PRESENT 3 SG INDICATIVE IMPERFECTIVE "3sg" lui __è__
PRESENT 1 PL INDICATIVE IMPERFECTIVE "1pl" noi __siamo__
PRESENT 2 PL INDICATIVE IMPERFECTIVE "2pl" voi __siete__
PRESENT 3 PL INDICATIVE IMPERFECTIVE "3pl" loro __sono__
PRESENT None None INDICATIVE PROGRESSIVE "part" essendo
 
PRESENT 2 SG IMPERATIVE IMPERFECTIVE "2sg!" sii
PRESENT 3 SG IMPERATIVE IMPERFECTIVE "3sg!" sia
PRESENT 1 PL IMPERATIVE IMPERFECTIVE "1pl!" siamo
PRESENT 2 PL IMPERATIVE IMPERFECTIVE "2pl!" siate
PRESENT 3 PL IMPERATIVE IMPERFECTIVE "3pl!" siano
 
PRESENT 1 SG SUBJUNCTIVE IMPERFECTIVE "1sg?" io __sia__
PRESENT 2 SG SUBJUNCTIVE IMPERFECTIVE "2sg?" tu __sia__
PRESENT 3 SG SUBJUNCTIVE IMPERFECTIVE "3sg?" lui __sia__
PRESENT 1 PL SUBJUNCTIVE IMPERFECTIVE "1pl?" noi __siamo__
PRESENT 2 PL SUBJUNCTIVE IMPERFECTIVE "2pl?" voi __siate__
PRESENT 3 PL SUBJUNCTIVE IMPERFECTIVE "3pl?" loro __siano__
 
PAST 1 SG INDICATIVE IMPERFECTIVE "1sgp" io __ero__
PAST 2 SG INDICATIVE IMPERFECTIVE "2sgp" tu __eri__
PAST 3 SG INDICATIVE IMPERFECTIVE "3sgp" lui __era__
PAST 1 PL INDICATIVE IMPERFECTIVE "1ppl" noi __eravamo__
PAST 2 PL INDICATIVE IMPERFECTIVE "2ppl" voi __eravate__
PAST 3 PL INDICATIVE IMPERFECTIVE "3ppl" loro __erano__
PAST None None INDICATIVE PROGRESSIVE "ppart" stato
 
PAST 1 SG INDICATIVE PERFECTIVE "1sgp+" io __fui__
PAST 2 SG INDICATIVE PERFECTIVE "2sgp+" tu __fosti__
PAST 3 SG INDICATIVE PERFECTIVE "3sgp+" lui __fu__
PAST 1 PL INDICATIVE PERFECTIVE "1ppl+" noi __fummo__
PAST 2 PL INDICATIVE PERFECTIVE "2ppl+" voi __foste__
PAST 3 PL INDICATIVE PERFECTIVE "3ppl+" loro __furono__
 
PAST 1 SG SUBJUNCTIVE IMPERFECTIVE "1sgp?" io __fossi__
PAST 2 SG SUBJUNCTIVE IMPERFECTIVE "2sgp?" tu __fossi__
PAST 3 SG SUBJUNCTIVE IMPERFECTIVE "3sgp?" lui __fosse__
PAST 1 PL SUBJUNCTIVE IMPERFECTIVE "1ppl?" noi __fossimo__
PAST 2 PL SUBJUNCTIVE IMPERFECTIVE "2ppl?" voi __foste__
PAST 3 PL SUBJUNCTIVE IMPERFECTIVE "3ppl?" loro __fossero__
 
FUTURE 1 SG INDICATIVE IMPERFECTIVE "1sgf" io __sarò__
FUTURE 2 SG INDICATIVE IMPERFECTIVE "2sgf" tu __sarai__
FUTURE 3 SG INDICATIVE IMPERFECTIVE "3sgf" lui __sarà__
FUTURE 1 PL INDICATIVE IMPERFECTIVE "1plf" noi __saremo__
FUTURE 2 PL INDICATIVE IMPERFECTIVE "2plf" voi __sarete__
FUTURE 3 PL INDICATIVE IMPERFECTIVE "3plf" loro __saranno__
 
CONDITIONAL 1 SG INDICATIVE IMPERFECTIVE "1sg->" io __sarei__
CONDITIONAL 2 SG INDICATIVE IMPERFECTIVE "2sg->" tu __saresti__
CONDITIONAL 3 SG INDICATIVE IMPERFECTIVE "3sg->" lui __sarebbe__
CONDITIONAL 1 PL INDICATIVE IMPERFECTIVE "1pl->" noi __saremmo__
CONDITIONAL 2 PL INDICATIVE IMPERFECTIVE "2pl->" voi __sareste__
CONDITIONAL 3 PL INDICATIVE IMPERFECTIVE "3pl->" loro __sarebbero__

Instead of the optional parameters, you can pass a single short alias, PARTICIPLE, or PAST+PARTICIPLE. With no parameters, the infinitive form of the verb is returned.
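
The aliases in the table follow a regular pattern, which the sketch below reconstructs. `toy_alias()` is a hypothetical helper written for this page, not part of pattern.it, and its lowercase string arguments are stand-ins for the module's constants.

```python
# Hypothetical helper: rebuilds the alias strings listed in the table above.
def toy_alias(tense, person, number, mood="indicative", aspect="imperfective"):
    """Build an alias such as "3sgp+" from person (1-3), number ("sg"/"pl"), etc."""
    base = "%d%s" % (person, number)  # e.g. "3sg" or "1pl"
    if tense == "past":
        # Past aliases put the "p" after "sg" but before "pl": "3sgp", "1ppl".
        base = "%dsgp" % person if number == "sg" else "%dppl" % person
    elif tense == "future":
        base += "f"
    elif tense == "conditional":
        base += "->"
    if mood == "imperative":
        base += "!"
    elif mood == "subjunctive":
        base += "?"
    if aspect == "perfective":
        base += "+"
    return base

print(toy_alias("past", 3, "sg", aspect="perfective"))   # 3sgp+
print(toy_alias("present", 1, "sg", mood="subjunctive"))  # 1sg?
```

With pattern installed, such a string would be passed in place of the separate parameters, for example conjugate('sono', '3sgp+'), which according to the table corresponds to fu.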

Attributive & predicative adjectives 

Italian adjectives inflect with suffixes -o → -i (masculine) and -a → -e (feminine), with some exceptions  (e.g., grande → i grandi felini). You can get the base form with the predicative() function. A statistical approach is used with an accuracy of 88%.

>>> from pattern.it import predicative
>>> print(predicative('grandi'))

grande  
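
The suffix pattern described above (-o → -i, -a → -e, with adjectives in -e sharing one plural) can be sketched in the opposite direction, from base form to inflected form. `toy_inflect_adjective()` is a hypothetical helper for illustration; pattern.it itself uses a statistical approach and handles exceptions that this sketch ignores.

```python
# Toy illustration only: regular suffix rules, no exceptions handled.
def toy_inflect_adjective(adjective, feminine=False, plural=False):
    """Inflect a base-form (masculine singular) adjective by gender and number."""
    if adjective.endswith("o"):
        stem = adjective[:-1]
        if feminine:
            return stem + ("e" if plural else "a")
        return stem + ("i" if plural else "o")
    if adjective.endswith("e"):
        # Adjectives in -e have one form per number, shared by both genders.
        return (adjective[:-1] + "i") if plural else adjective
    return adjective

print(toy_inflect_adjective("nero", plural=True))                 # neri
print(toy_inflect_adjective("nero", feminine=True, plural=True))  # nere
print(toy_inflect_adjective("grande", plural=True))               # grandi
```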

Parser

For parsing there is parse(), parsetree() and split(). The parse() function annotates words in the given string with their part-of-speech tags (e.g., NN for nouns and VB for verbs). The parsetree() function takes a string and returns a tree of nested objects (Text → Sentence → Chunk → Word). The split() function takes the output of parse() and returns a Text. See the pattern.en documentation for how to manipulate Text objects.

>>> from pattern.it import parse, split
>>>  
>>> s = parse('Il gatto nero faceva le fusa.')
>>> for sentence in split(s):
...     print(sentence)

Sentence('Il/DT/B-NP/O gatto/NN/I-NP/O nero/JJ/I-NP/O'
         'faceva/VB/B-VP/O'
         'le/DT/B-NP/O fusa/NN/I-NP/O ././O/O')

The parser's lexicon is mined from Wiktionary. The accuracy is around 92%.
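
The slash-separated annotations in the output above can also be read without pattern's Text objects. The sketch below assumes a tagged string with exactly four fields per token, copied from the example; `split_tokens()` is a hypothetical helper, and treating the fourth field as a prepositional-phrase tag is an assumption based on pattern.en.

```python
# Assumed input: the token annotations from the example output, joined by spaces.
TAGGED = ("Il/DT/B-NP/O gatto/NN/I-NP/O nero/JJ/I-NP/O "
          "faceva/VB/B-VP/O le/DT/B-NP/O fusa/NN/I-NP/O ././O/O")

def split_tokens(tagged):
    """Return (word, part-of-speech, chunk, fourth field) tuples."""
    tokens = []
    for item in tagged.split():
        # rsplit keeps a "/" inside the word itself intact.
        word, pos, chunk, last = item.rsplit("/", 3)
        tokens.append((word, pos, chunk, last))
    return tokens

nouns = [word for word, pos, _, _ in split_tokens(TAGGED) if pos.startswith("NN")]
print(nouns)  # ['gatto', 'fusa']
```

For anything beyond quick inspection, split() and parsetree() are the better route, since they expose sentences, chunks and words as objects.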

Sentiment analysis

There's no sentiment() function for Italian yet.