Some characters in stopwords_tr do not appear as Turkish characters #15
Comments
Thanks for finding this! The “encoding bit” for a character (object) in R can only be one of “Unknown”, “UTF-8”, or “Latin-1”, but all of the stopwords values should be UTF-8, so we just need to correct the mis-encoded stop words. I will fix this if you send me the corrections for the words that need it. Thanks!
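As a rough illustration of the encoding mark described above, the sketch below uses base R's `Encoding()` and `enc2utf8()` to inspect and re-declare the mark on a string. The example strings are made up for illustration and are not read from the package data. Note that R itself reports the marks as `"unknown"`, `"UTF-8"`, and `"latin1"`.

```r
# Illustrative strings only; these are not read from stopwords_tr.
x <- "alt\u0131"         # ends with the Turkish dotless i (U+0131)
Encoding(x)              # "UTF-8": strings built from \u escapes are marked UTF-8

ascii <- "ama"
Encoding(ascii)          # "unknown": pure ASCII strings carry no mark

# Re-declaring the mark does not change the bytes, only how R reads them.
y <- x
Encoding(y) <- "latin1"
identical(charToRaw(x), charToRaw(y))   # TRUE: same bytes, different mark

# Converting the wrongly marked string to UTF-8 now yields different
# characters, which is how mis-encoded entries can end up in a word list.
z <- enc2utf8(y)
Encoding(z)              # "UTF-8"
```

The last step shows why the stored words have to be corrected rather than merely re-marked: once text has been converted under the wrong assumption, the result is valid UTF-8 that simply contains the wrong characters.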
@erkanozhan pls see #16, does that solve it?
Thank you very much for your answer. I'm making the corrections and will send them shortly. I have prepared an Excel table. Where can I send the file?
kbenoit@lse.ac.uk, but it would be much better to use this tool, https://www.tablesgenerator.com/markdown_tables, so you can paste the Markdown here. I just really need the wrong words and their corrections.
I've added fixes to #16.
Some characters in stopwords_tr are not Turkish characters. For example, "ý" appears where "ı" is expected.
I'm looking for a way to fix them.
stopwords_tr$word<-gsub("ý","ı",stopwords_tr$word)
The result did not change. I tried this, but it had no effect.
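A minimal sketch of one way to repair such words, assuming the wrong characters arose because Turkish text in ISO-8859-9 (Latin-5) was read as Latin-1. Those two encodings differ in exactly six positions, so each wrong character maps back to one Turkish letter. `fix_turkish` and the sample vector are hypothetical names used for illustration and are not part of the stopwords package. Writing the characters as `\u` escapes keeps the script independent of the encoding of the source file or console, which may be one reason a literal `gsub("ý", "ı", ...)` has no effect on some setups.

```r
# Hypothetical helper: map the six Latin-1 look-alikes back to Turkish letters.
fix_turkish <- function(x) {
  old <- "\u00fd\u00fe\u00f0\u00dd\u00de\u00d0"  # ý þ ð Ý Þ Ð
  new <- "\u0131\u015f\u011f\u0130\u015e\u011e"  # ı ş ğ İ Ş Ğ
  # chartr() replaces each character in `old` with the one at the same
  # position in `new`; enc2utf8() makes sure the input is UTF-8 first.
  chartr(old, new, enc2utf8(x))
}

# Made-up sample standing in for stopwords_tr$word.
words <- c("alt\u00fd", "\u00feey", "de\u00f0il")
fixed <- fix_turkish(words)
fixed
```

If `stopwords_tr` is a data frame with a `word` column, as in the attempt above, the same helper could be applied as `stopwords_tr$word <- fix_turkish(stopwords_tr$word)`. A blanket replacement like this is only reasonable because standard Turkish orthography does not use ý, þ, or ð.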
Another interesting thing: when you double-click stopwords_tr in RStudio to display it, the character appears as "ý". In the console, it looks like "y".
Is there a parameter to set encoding?
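To tell a display problem from a data problem, it can help to look at the code points rather than at the rendered glyphs. The sketch below uses base R's `utf8ToInt()` for that. The sample strings are made up, and the idea that the console substitutes a similar-looking letter for a character it cannot show is an assumption, not something confirmed in this thread.

```r
# Made-up strings; in practice inspect an element of stopwords_tr$word.
stored   <- "alt\u00fd"   # what a mis-encoded entry would contain
expected <- "alt\u0131"   # the intended Turkish spelling

utf8ToInt(stored)     # last value is 253 (U+00FD, y with acute)
utf8ToInt(expected)   # last value is 305 (U+0131, dotless i)

# Format the code points as U+XXXX for easier lookup.
sprintf("U+%04X", utf8ToInt(stored))
```

If the last code point is 253 regardless of how the viewer or console draws it, the stored data itself is wrong and needs to be corrected, not just re-marked with a different encoding.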
Thanks to everyone.