Hi,
I love learning tidytext, but I was a bit surprised to see that the get_sentiments() function does not allow using the non-English translations included in the Nov 2017 NRC lexicon v0.92 .xlsx file used by tidytext (English words are in column A and are translated into dozens of languages in columns B to DA, while columns DB to DK list the polarity and sentiment scores for each word). It would be amazing to add an argument to define which language (column) to use from the NRC lexicon, e.g. lang="French".
Thanks,
Leonard
The NRC-Emotion-Lexicon.zip file that is currently downloaded via the function in the textdata package does include the .xlsx file you mention. Using these translations is within the permission we have from the lexicon creators, although of course translated sentiment lexicons can be less reliable.
@EmilHvitfeldt do you want to consider this in textdata?
Thank you for your answers; it is great to know that using the translations is within the permissions from the lexicon creators. I concur that a translated lexicon is less reliable than a natively created one. However, (i) for analyses comparing corpora across different languages, a single lexicon would be more reliable than a patchwork of different lexicons, and (ii) many languages spoken by millions of people still lack reliable native lexicons. Thanks
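To make the request concrete, here is a minimal sketch of what selecting a language column from the wide spreadsheet layout could look like. It is written in Python purely as a self-contained illustration and is not tidytext or textdata code. The function name `nrc_for_language`, the column names, and the two toy rows are all assumptions; the real file has many more columns and its exact headers are not shown in this thread.

```python
# Toy stand-in for the wide layout described in the issue: one English
# column, translation columns, then 0/1 sentiment columns.
# All rows and column names here are made up for illustration.
ROWS = [
    {"English": "abandon", "French": "abandonner", "Spanish": "abandonar",
     "negative": 1, "fear": 1, "joy": 0},
    {"English": "cheer", "French": "acclamation", "Spanish": "ánimo",
     "negative": 0, "fear": 0, "joy": 1},
]
SENTIMENT_COLUMNS = ["negative", "fear", "joy"]


def nrc_for_language(rows, lang="English", sentiment_columns=SENTIMENT_COLUMNS):
    """Return (word, sentiment) pairs, taking words from the `lang` column.

    Hypothetical helper: it reshapes the wide table into the long
    word/sentiment form, keeping only sentiments flagged with 1.
    """
    pairs = []
    for row in rows:
        if lang not in row:
            raise ValueError(f"no column for language {lang!r}")
        word = row[lang]
        if not word:
            continue  # skip rows with a missing translation
        for sentiment in sentiment_columns:
            if row[sentiment] == 1:
                pairs.append((word, sentiment))
    return pairs


french = nrc_for_language(ROWS, lang="French")
```

The default of `lang="English"` is chosen in this sketch so that existing behavior would be unchanged when no language is specified.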