A tool for evaluating sentences using an N-gram language model. The model can be trained using a text corpus. Alternatively, a pretrained model can be loaded. The evaluating script checks the sentences for out-of-vocabulary (OOV) words and highligh them with <UNK> mark. In addition, it highlights uncommon N-grams. N-grams which have log-probability below the predefined theshold are supposed to be uncommon. By default, the threshold is set to -infinity, which corresponds to a log-probability of an unseen N-gram.
https://drive.google.com/open?id=1vh1SMU1g5Wd7xsRH8UoHVlOk50K9vSqH
"Oavsett regleringen i gymnasieskolans styrdokument att beakta samtliga kommunikativa delfärdigheter, saknas det muntliga testet fortfarande i den finländska studentexamen."
Text to evaluate: Oavsett regleringen i gymnasieskolans styrdokument att beakta samtliga kommunikativa delfärdigheter, saknas det muntliga testet fortfarande i den finländska studentexamen.
(For human evaluator) Text tagged with part-of-speech tags: [[('Oavsett', 'ADJ'), ('regleringen', 'NOUN'), ('i', 'ADP'), ('gymnasieskolans', 'NOUN'), ('styrdokument', 'NOUN'), ('att', 'PART'), ('beakta', 'VERB'), ('samtliga', 'ADJ'), ('kommunikativa', 'ADJ'), ('delfärdigheter', 'NOUN'), (',', 'PUNCT'), ('saknas', 'VERB'), ('det', 'DET'), ('muntliga', 'ADJ'), ('testet', 'NOUN'), ('fortfarande', 'ADV'), ('i', 'ADP'), ('den', 'DET'), ('finländska', 'ADJ'), ('studentexamen', 'NOUN'), ('.', 'PUNCT')]]
Analyzed text with annotations: oavsett regleringen*1 i gymnasieskolans* styrdokument att*2 beakta samtliga kommunikativa*3 delfärdigheter* , saknas det muntliga testet*4 fortfarande*5 i den finländska studentexamen .
Unknown words:
- gymnasieskolans. Similar words: gymnasie, gymnasieskola. You can also try to replace the word with some noun.
- delfärdigheter. Similar words: färdigheter, färdig, färdighet. You can also try to replace the word with some noun.
Uncommon ngrams:
- oavsett regleringen. Similar ngrams: sett reglering. You used the noun regleringen. You can also try to use some other noun instead of regleringen.
- styrdokument att. Similar ngrams: dokumentärfilmen att, dokumenten att, dokumentera att, dokumentär att, dokument att. You used the particle att. Try to use another part of speech, for example adposition.
- samtliga kommunikativa. Similar ngrams: samt kommunikativa. You can also try to use some other adjective instead of kommunikativa.
- muntliga testet. Similar ngrams: egentliga testerna, offentliga protesterna, offentliga missiltestet, ordentliga test. You can also try to use some other noun instead of testet.
- testet fortfarande. Similar ngrams: protesterar fortfarande, testas fortfarande, testerna fortfarande. You can also try to use some other adverb instead of fortfarande.