Can't get it to work with Russian #7

houshuang · 2013-10-28T18:51:02Z

Can't get this to work, not sure if it's a UTF8 issue or what.

require 'ffi/hunspell'
c = FFI::Hunspell.dict('ru_RU')
p c.stem("рассчитывал") # => []

The same word on the command line, using the hunspell binary:
textmining|master⚡ ⇒ echo рассчитывал | hunspell -d ru_RU -s
рассчитывал рассчитывать

nkrot · 2016-08-04T11:02:28Z

Works for me (my locale is UTF-8)

require 'ffi/hunspell'
dict = FFI::Hunspell.dict('ru_RU')

dict.valid? "рассчитывал"
#=> true 

dict.encoding
#=> #<Encoding:UTF-8> 

dict.stem "рассчитывал"
#=> ["рассчитывать"]

postmodern · 2016-12-04T06:41:55Z

@houshuang what does __ENCODING__ return in irb? What is the output of the locale command?
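For anyone hitting the same symptom, the values asked about above can be collected in one go. This is only a rough sketch: `encoding_report` is a hypothetical helper name, not part of ffi-hunspell, and it uses nothing beyond Ruby's core `Encoding` class and the environment.

```ruby
# Hypothetical helper: gathers the encodings Ruby is working with, so they
# can be compared against the encoding the dictionary reports.
def encoding_report
  {
    source:           __ENCODING__,               # encoding of this source file
    default_external: Encoding.default_external,  # default for IO and ARGV
    locale_charmap:   Encoding.locale_charmap,    # charmap derived from the locale
    lang:             ENV['LANG'],                # may be nil if unset
    lc_all:           ENV['LC_ALL']               # may be nil if unset
  }
end

encoding_report.each { |name, value| puts "#{name}: #{value.inspect}" }
```

If the dictionary's `encoding` differs from the `source` value printed here, that mismatch is a likely suspect.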

Envek · 2017-07-27T11:42:36Z

Yeah, this is an encoding problem:

On Ubuntu 17.04 (hunspell 1.4.1-2build1):

dict = FFI::Hunspell.dict('ru_RU')
dict.encoding
# => #<Encoding:KOI8-R (autoload)>

dict.suggest('ощибка')
# => []

dict.suggest('ощибка'.encode(dict.encoding)).map { |s| s.encode(__ENCODING__) }
# => ["ощипка", "ошибка"]
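The round trip above can be wrapped in a small helper so callers do not have to repeat the transcoding. This is a minimal sketch, not part of ffi-hunspell: `with_dict_encoding` is a hypothetical name, and it assumes (as the output above suggests) that the dictionary returns strings tagged with its own encoding.

```ruby
# Hypothetical helper (not an ffi-hunspell API): transcode the word into the
# dictionary's encoding, run the block, and transcode the results back.
# Assumes the block returns an array of strings tagged with dict.encoding.
def with_dict_encoding(dict, word, out_encoding = Encoding::UTF_8)
  encoded = word.encode(dict.encoding)
  results = yield(encoded)
  results.map { |s| s.encode(out_encoding) }
end

# Usage with a real dictionary (requires the ffi-hunspell gem and ru_RU files):
#   dict = FFI::Hunspell.dict('ru_RU')
#   with_dict_encoding(dict, 'ощибка') { |w| dict.suggest(w) }
#   with_dict_encoding(dict, 'рассчитывал') { |w| dict.stem(w) }
```

When the dictionary is already UTF-8, as in nkrot's setup, both transcoding steps leave the strings unchanged, so the helper should be harmless to use unconditionally.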
