Can't get it work with Russian #7

Open
houshuang opened this issue Oct 28, 2013 · 3 comments

Comments

@houshuang

I can't get this to work, and I'm not sure whether it's a UTF-8 issue or something else.

require 'ffi/hunspell'
c = FFI::Hunspell.dict('ru_RU')
p c.stem("рассчитывал") # => []

The same word stems correctly from the command line using the hunspell binary:
textmining|master⚡ ⇒ echo рассчитывал | hunspell -d ru_RU -s
рассчитывал рассчитывать

@nkrot

nkrot commented Aug 4, 2016

Works for me (my locale is UTF-8)

require 'ffi/hunspell'
dict = FFI::Hunspell.dict('ru_RU')

dict.valid? "рассчитывал"
#=> true 

dict.encoding
#=> #<Encoding:UTF-8> 

dict.stem "рассчитывал"
#=> ["рассчитывать"]

@postmodern
Owner

@houshuang what does __ENCODING__ return in irb? What is the output of the locale command?
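For anyone gathering the same information, here is a minimal sketch that prints the relevant settings in one go. The method name `encoding_report` is hypothetical and not part of ffi-hunspell; it only uses core Ruby, so the dictionary's own encoding still has to be checked separately with `dict.encoding`.

```ruby
# Hypothetical diagnostic helper (not part of ffi-hunspell).
# Collects the encodings Ruby is using plus the locale-related
# environment variables that the `locale` command is derived from.
def encoding_report(env = ENV)
  {
    # Encoding of string literals in this source file or irb session.
    source_encoding: __ENCODING__.name,
    # Encoding Ruby assumes for data read from outside (set from the locale).
    default_external: Encoding.default_external.name,
    # Usually nil unless explicitly configured.
    default_internal: Encoding.default_internal&.name,
    # LANG and LC_* variables, if set.
    locale_vars: env.to_h.select { |key, _| key == "LANG" || key.start_with?("LC_") }
  }
end

encoding_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

Comparing this output with `dict.encoding` should show whether the strings passed to the dictionary are in a different encoding than the one it was built with.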

@Envek

Envek commented Jul 27, 2017

Yeah, this is an encoding problem:

On Ubuntu 17.04 (hunspell 1.4.1-2build1):

dict = FFI::Hunspell.dict('ru_RU')
dict.encoding
# => #<Encoding:KOI8-R (autoload)>

dict.suggest('ощибка')
# => []

dict.suggest('ощибка'.encode(dict.encoding)).map { |s| s.encode(__ENCODING__) }
# => ["ощипка", "ошибка"]
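The workaround above could be wrapped in a small helper so the conversion is not repeated at every call site. This is only a sketch: `call_in_dict_encoding` is a hypothetical name, not an ffi-hunspell method, and it assumes the dictionary object responds to `encoding` and returns strings tagged with that encoding, as the output above suggests.

```ruby
# Hypothetical helper (not part of ffi-hunspell).
# Converts `word` into the dictionary's encoding, calls the given
# dictionary method (e.g. :suggest or :stem), and converts each
# returned string back to UTF-8.
#
# Assumes `dict` responds to #encoding and that the called method
# returns an Array of Strings tagged with that encoding.
def call_in_dict_encoding(dict, method_name, word)
  encoded = word.encode(dict.encoding)
  dict.public_send(method_name, encoded).map { |s| s.encode(Encoding::UTF_8) }
end

# Usage with a real dictionary would look something like this
# (requires ffi-hunspell and the ru_RU dictionary to be installed):
#
#   require 'ffi/hunspell'
#   dict = FFI::Hunspell.dict('ru_RU')
#   call_in_dict_encoding(dict, :suggest, 'ощибка')
```

Note that `String#encode` raises `Encoding::UndefinedConversionError` if the word contains characters that the dictionary's encoding cannot represent, so callers may want to rescue that.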
