Support additional languages #104

Closed
thisandagain opened this issue Feb 10, 2017 · 5 comments

Comments

@thisandagain
Owner

thisandagain commented Feb 10, 2017

From PR #93:

We'll need to establish a set of labeled sentences (see #103 as well as the original AFINN paper) in order to be able to determine the accuracy of each translation. While Google Translate is great, sentiment can be pretty subtle, so I wonder if something like Pootle or Transifex might help us source translations of the AFINN word list with greater confidence.

I've been thinking about ways to fund a CrowdFlower campaign to handle establishing labeled corpora for different languages. I don't think it would take much (maybe $60 - $120 USD) per language to get things started.

/cc @dkocich

@EmilStenstrom

Here's a fork with a translated version of AFINN: https://github.com/AlexGustafsson/sentiment-swedish

@kubawolanin

And here's a fork with a Polish version of AFINN: https://github.com/kubawolanin/sentiment-polish 😉
Thank you for your amazing work!

@thisandagain
Owner Author

Amazing! Thanks for sharing @EmilStenstrom and @kubawolanin. 😄

@Ekliptor

What exactly needs to be changed to add a language?
I'm thinking about adding German.

If it's just a matter of translating the build/AFINN.json file, wouldn't it be better to keep it all in one project instead of forking it? We could then add an API call so the user can set the language when loading the library.
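
To make the idea concrete, here is a rough sketch of what selecting a language at load time could look like. It is only an illustration: `wordLists`, `createAnalyzer`, and the tiny inline word lists are hypothetical and are not part of this library's actual API. A real version would presumably load one translated AFINN-style JSON file per language.

```javascript
// Hypothetical sketch: none of these names come from the library itself.
// Tiny inline word lists stand in for per-language AFINN-style JSON files.
const wordLists = {
  en: { good: 3, bad: -3 },
  de: { gut: 3, schlecht: -3 }
};

// Return a scoring function bound to the word list for the given language.
function createAnalyzer(language) {
  const list = wordLists[language];
  if (!list) {
    throw new Error('Unsupported language: ' + language);
  }
  return function score(text) {
    // Split on anything that is not a Unicode letter so umlauts etc. survive.
    const tokens = text.toLowerCase().split(/[^\p{L}]+/u).filter(Boolean);
    return tokens.reduce((sum, token) => {
      // Own-property check avoids matching inherited keys like "constructor".
      const known = Object.prototype.hasOwnProperty.call(list, token);
      return sum + (known ? list[token] : 0);
    }, 0);
  };
}

const scoreGerman = createAnalyzer('de');
console.log(scoreGerman('Das Essen war gut'));
```

Keeping every list behind one lookup would let a single package serve all languages, at the cost of shipping (or lazily loading) every word list.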

@thisandagain
Owner Author

thisandagain commented Jan 15, 2018

@Ekliptor Good question. To feel good about adding another language, I think we would need the translated build/AFINN.json file plus a labeled corpus (such as the labeled dataset we have of Amazon, IMDB, and Yelp reviews from UCI) that lets us evaluate the accuracy of those translations.
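
As a minimal sketch of that kind of evaluation: score each labeled sentence with the translated word list, call it positive when the score is above zero, and report the fraction that matches the labels. Everything here is made up for illustration, including the German word list, the four sentences, the 1/0 label encoding, and the function names; it is not the project's actual validation code.

```javascript
// Hypothetical sketch: word list, sentences, and function names are invented.
const wordList = { gut: 3, toll: 4, schlecht: -3, furchtbar: -3 };

// Assumed label encoding: 1 = positive, 0 = negative.
const labeled = [
  { text: 'Der Film war gut', label: 1 },
  { text: 'Das Essen war furchtbar', label: 0 },
  { text: 'Toll, wirklich toll', label: 1 },
  { text: 'Nicht gut', label: 0 } // negation: a plain word list gets this wrong
];

// Sum the valence of every known word in the text.
function score(text, list) {
  const tokens = text.toLowerCase().split(/[^\p{L}]+/u).filter(Boolean);
  return tokens.reduce((sum, token) => {
    const known = Object.prototype.hasOwnProperty.call(list, token);
    return sum + (known ? list[token] : 0);
  }, 0);
}

// Fraction of samples whose predicted class matches the label.
function accuracy(samples, list) {
  if (samples.length === 0) {
    throw new Error('Need at least one labeled sample');
  }
  let correct = 0;
  for (const { text, label } of samples) {
    const predicted = score(text, list) > 0 ? 1 : 0;
    if (predicted === label) {
      correct += 1;
    }
  }
  return correct / samples.length;
}

console.log(accuracy(labeled, wordList));
```

Comparing this number across candidate translations of the same word list would give a rough way to decide which one to ship.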
