Facing some issues with ignore_non_words #54

AnandDev8 · 2019-08-20T11:58:18Z

Hi,

I am currently working on a chatbot use case where I found symspellpy very useful, but I am facing some issues with the "ignore_non_words" parameter of lookup_compound.
I need a specific pattern, such as an account number like xx004453, to be ignored by the spell checker, and it kind of works.

My regex is meant to match tokens that start with 2 or 3 letters followed by digits or hyphens.

My issues are as follows:

je3453 -> jeff hi (the digits are being converted; this happens only with short numbers like xx123)
xx1234-5678-1234 -> xx1234 5678 1234 (all the '-' characters are removed)

How can I solve these issues?

mammothb · 2019-08-21T00:28:19Z

ignore_non_words is a boolean flag; the regex is passed via ignore_token, which is available for lookup and word_segmentation. In your case, could you do some preprocessing and filter out the words that match your regex pattern before feeding them into symspellpy?
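
The preprocessing suggested above could look roughly like the sketch below. It is only an illustration, not part of symspellpy: the names `ACCOUNT_RE` and `correct_preserving_tokens` are hypothetical, the regex is a guess at the pattern described in the question (2 or 3 letters followed by digits or hyphens), and `correct` stands in for whatever spell-correction call is used.

```python
import re
from typing import Callable, List

# Assumed pattern: 2 or 3 letters followed by one or more digits or hyphens.
ACCOUNT_RE = re.compile(r"[A-Za-z]{2,3}[0-9-]+")


def correct_preserving_tokens(
    text: str,
    correct: Callable[[str], str],
    protected: "re.Pattern[str]" = ACCOUNT_RE,
) -> str:
    """Spell-correct `text`, leaving tokens that match `protected` untouched.

    `correct` is any function that maps a phrase to its corrected form.
    With symspellpy it could plausibly be something like
        lambda s: sym_spell.lookup_compound(s, max_edit_distance=2)[0].term
    (an assumption; adapt it to your own setup).
    """
    out: List[str] = []
    pending: List[str] = []  # consecutive tokens waiting to be corrected

    def flush() -> None:
        # Correct the accumulated ordinary words as one phrase.
        if pending:
            out.append(correct(" ".join(pending)))
            pending.clear()

    for token in text.split():
        if protected.fullmatch(token):
            flush()
            out.append(token)  # keep account-like tokens verbatim
        else:
            pending.append(token)
    flush()
    return " ".join(out)
```

Because the protected tokens never reach the spell checker, their digits and hyphens stay intact. The text is split on whitespace only, so an account number with punctuation attached (for example a trailing comma) would not match and would need extra handling.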

pdahale95 mentioned this issue Dec 11, 2019

Feat: Add split_by_space and match_any_term_with_digits #66

Merged

mammothb closed this as completed Apr 8, 2020

mammothb linked a pull request Nov 30, 2021 that will close this issue

Feat: Add split_by_space and match_any_term_with_digits #66

Merged
