
Facing some issues with ignore_non_words #54

Closed
AnandDev8 opened this issue Aug 20, 2019 · 1 comment · Fixed by #66

Comments

@AnandDev8

Hi,

I am currently working on a chatbot use case where I found symspellpy very useful, but I am facing some issues with the "ignore_non_words" parameter of lookup_compound.
I need a specific pattern, such as an account number like xx004453, to be ignored by the spell checker, and it kind of works.

My regex matches tokens that start with 2 or 3 letters followed by digits or hyphens.

My issues are as follows:

  • je3453 -> jeff hi (the digits are being converted; this happens only with short numbers such as xx123)
  • xx1234-5678-1234 -> xx1234 5678 1234 (all the '-' characters are removed)

How can I solve these issues?

@mammothb
Owner

ignore_non_words is a boolean flag; a regex is used for ignore_token, which is available for lookup and word_segmentation. In your case, could you do some preprocessing and filter out the words that match your regex pattern before feeding them into symspellpy?
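
A minimal sketch of that preprocessing idea, in case it helps: split the input on whitespace, leave any token matching the account-number pattern untouched, and send only the remaining runs of text to the spell checker. The names `ACCOUNT_PATTERN` and `correct_except_protected` are hypothetical, the regex is only a guess at the pattern described above, and `correct` stands in for whatever correction function you use.

```python
import re
from typing import Callable, List, Pattern

# Assumed pattern (not from the issue): 2-3 letters, then digits or hyphens,
# e.g. "xx004453" or "xx1234-5678-1234".
ACCOUNT_PATTERN = re.compile(r"^[A-Za-z]{2,3}[\d-]+$")


def correct_except_protected(
    text: str,
    correct: Callable[[str], str],
    pattern: Pattern[str] = ACCOUNT_PATTERN,
) -> str:
    """Apply `correct` to runs of ordinary words, skipping protected tokens."""
    out: List[str] = []
    pending: List[str] = []  # ordinary words waiting to be corrected together

    def flush() -> None:
        # Correct the accumulated run as one phrase so context is preserved.
        if pending:
            out.append(correct(" ".join(pending)))
            pending.clear()

    for token in text.split():
        if pattern.match(token):
            flush()
            out.append(token)  # protected token is copied through unchanged
        else:
            pending.append(token)
    flush()
    return " ".join(out)
```

With symspellpy, `correct` could be something like `lambda s: sym_spell.lookup_compound(s, max_edit_distance=2)[0].term`, assuming `sym_spell` is a `SymSpell` instance with a dictionary already loaded. Note that this sketch does not match a token with punctuation attached (for example `xx1234,`), so the pattern may need adjusting for real chat input.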
