Releases: Kensuke-Mitsuzawa/JapaneseTokenizers
Possible to call jumandic and unidic
The mecab wrapper class can now call:
- jumandic. An alternative dictionary for the mecab tokenizer. jumandic provides rich morphological information.
- unidic. It is continuously maintained by NINJAL. See more information here (Japanese only).
In addition, this version removes some arguments of the mecab wrapper class because they were no longer consistent.
Cleaned up type hint
Merge pull request #54 from Kensuke-Mitsuzawa/enhancement/#53 cleaned up type hint
Bug fix for Python3.7 / latest pyknp package
- Some packages could not be installed from setup.py because the pip.main function was removed. setup.py now calls the subprocess.check_call function instead.
- The pyknp package was updated, and the Jumanpp module was removed in its latest version. The JapaneseTokenizer package now calls the latest pyknp.
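The switch from pip.main to subprocess.check_call can be sketched roughly as below. This is only an illustration, not the project's actual setup.py: the helper names `build_pip_command` and `pip_install` are hypothetical, and running pip through `sys.executable -m pip` is an assumption about how the call might be made.

```python
import subprocess
import sys
from typing import Callable, List


def build_pip_command(package: str) -> List[str]:
    """Build the command line that installs `package` with the current interpreter's pip."""
    # Using sys.executable keeps the install inside the active Python environment.
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package: str,
                runner: Callable[[List[str]], int] = subprocess.check_call) -> int:
    """Install `package` in a child process instead of calling the removed pip.main.

    subprocess.check_call raises CalledProcessError if pip exits with a non-zero status.
    The `runner` parameter is a hypothetical hook so the call can be replaced in tests.
    """
    return runner(build_pip_command(package))
```

Calling `pip_install("pyknp")` would then run pip in a subprocess, which does not depend on pip's internal Python API.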
Issue in MacOS
Fixed the following issue, which appears to be specific to MacOS:
#47
1.3.6: Merge pull request #45 from Kensuke-Mitsuzawa/bug/#44
Fixed the issue in #44
improved for using jumanpp
Fixed this issue: #39
unified py2/py3 modules
Fixed this issue: #36
Bugs in filtering
Bug fix for specific case
In some cases mecab + neologd returns an additional 10th field. This caused a ValueError inside the mecab-wrapper module. See #28 for details.
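A minimal sketch of how an unexpected extra field can be tolerated is shown below. It is not the code used in the mecab-wrapper module: the function name `split_features`, the expected count of 9 fields, and the `*` padding value are assumptions made for illustration, and the naive comma split does not handle quoted fields that contain commas.

```python
from typing import List

# Assumed number of comma-separated feature fields in an IPADIC-style entry.
EXPECTED_FIELDS = 9


def split_features(feature_csv: str,
                   expected: int = EXPECTED_FIELDS,
                   pad: str = "*") -> List[str]:
    """Return exactly `expected` feature fields from one feature string.

    Unpacking a 10-field entry into 9 names raises ValueError, so extra
    trailing fields are dropped and missing fields are padded with `pad`.
    """
    fields = feature_csv.split(",")
    fields = fields[:expected]                   # drop any additional fields
    fields += [pad] * (expected - len(fields))   # pad entries that are too short
    return fields
```

With this normalization, the caller can unpack the result into a fixed number of variables regardless of how many fields the dictionary returned.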
Fast call of Jumanpp
- The Jumanpp interface works faster than in the previous version.
- This applies only to UNIX distributions. Windows is an exception.