Language Specific Application
Infer sentiment attitudes from a text file, followed by a D3JS-based demo launch:
python3 -m arelight.run.infer \
--sampling-framework "arekit" \
--ner-model-name "ner_ontonotes_bert_mult" \
--ner-types "ORG|PERSON|LOC|GPE" \
--terms-per-context 50 \
--sentence-parser "nltk:russian" \
--text-b-type "nli_m" \
--tokens-per-context 128 \
--bert-framework "opennre" \
--batch-size 10 \
--stemmer "mystem" \
--pretrained-bert "DeepPavlov/rubert-base-cased" \
--bert-torch-checkpoint "ra4-rsr1_DeepPavlov-rubert-base-cased_cls.pth.tar" \
--backend "d3js_graphs" \
-o "output" \
--from-files "<PATH-TO-TEXT-FILE>"
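When the command needs to be launched from a script, the same flags can be assembled programmatically. The following is a minimal sketch, not part of ARElight itself: `build_infer_command` and `INFER_OPTIONS` are hypothetical names introduced here for illustration, and the flag values are copied from the command above. It only builds the argument list; running it is left to the caller.

```python
import shlex
import sys

# Flag values copied from the command above. The checkpoint name must point
# at a file that is actually available on your machine.
INFER_OPTIONS = {
    "--sampling-framework": "arekit",
    "--ner-model-name": "ner_ontonotes_bert_mult",
    "--ner-types": "ORG|PERSON|LOC|GPE",
    "--terms-per-context": 50,
    "--sentence-parser": "nltk:russian",
    "--text-b-type": "nli_m",
    "--tokens-per-context": 128,
    "--bert-framework": "opennre",
    "--batch-size": 10,
    "--stemmer": "mystem",
    "--pretrained-bert": "DeepPavlov/rubert-base-cased",
    "--bert-torch-checkpoint": "ra4-rsr1_DeepPavlov-rubert-base-cased_cls.pth.tar",
    "--backend": "d3js_graphs",
}


def build_infer_command(input_path, output="output",
                        options=INFER_OPTIONS, python=sys.executable):
    """Return the argv list for `python -m arelight.run.infer` (hypothetical helper)."""
    cmd = [python, "-m", "arelight.run.infer"]
    for flag, value in options.items():
        cmd += [flag, str(value)]
    cmd += ["-o", output, "--from-files", input_path]
    return cmd


if __name__ == "__main__":
    cmd = build_infer_command("<PATH-TO-TEXT-FILE>")
    # Print a shell-quoted form for inspection before running anything.
    print(shlex.join(cmd))
    # To actually launch inference: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with values such as `ORG|PERSON|LOC|GPE`, where `|` would otherwise be interpreted as a pipe.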
Sentiment Analysis Pipeline:
ARElight core is powered by the AREkit framework, which is responsible for raw text sampling.
To annotate objects in text, we use BERT-based models trained on OntoNotes5 (powered by DeepPavlov).
For relation annotation, we support OpenNRE BERT models.
The default inference uses a pretrained BERT with transfer learning based on the RuSentRel and RuAttitudes collections, which were sampled and translated into English via arekit-ss.
It is possible to use the google-trans API wrapper to launch inference on text in any language by transferring the knowledge towards the specific model. To do so, add the following translation flags to your command:
python3 -m arelight.run.infer \
... # LIST OF PARAMETERS FROM YOUR PAST SCRIPT
--translate-framework "googletrans" \
--translate-entity "en:ru" \
--translate-text "en:ru"
NOTE: Words in the text and entities are translated separately, so that entities can be in a different language from the rest of the text.
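The translation flags above can also be appended to an existing argument list from a script. This is a minimal sketch under stated assumptions: `add_translation_flags` and `parse_lang_pair` are hypothetical helpers, not ARElight functions, and the check assumes each value is a pair of language codes separated by a single colon, as in `en:ru`. Which side of the pair is the input language is not stated here, so the sketch does not interpret the two codes.

```python
def parse_lang_pair(value):
    """Split a value such as "en:ru" into its two codes (assumed format)."""
    left, sep, right = value.partition(":")
    if not sep or not left or not right or ":" in right:
        raise ValueError(f"expected '<code>:<code>', got {value!r}")
    return left, right


def add_translation_flags(base_cmd, entity_pair="en:ru", text_pair="en:ru",
                          framework="googletrans"):
    """Return a copy of base_cmd with the three translation flags appended."""
    # Entities and text take separate pairs, so validate each one on its own.
    for pair in (entity_pair, text_pair):
        parse_lang_pair(pair)
    return list(base_cmd) + [
        "--translate-framework", framework,
        "--translate-entity", entity_pair,
        "--translate-text", text_pair,
    ]


if __name__ == "__main__":
    # The base list stands in for the parameters from your earlier script.
    base = ["python3", "-m", "arelight.run.infer"]
    print(add_translation_flags(base))
```

Keeping the entity and text pairs as separate parameters mirrors the two separate flags, so a different pair can be passed for entities when they are in another language.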