Mistake in simplemma.py #12

Closed
hivaze opened this issue Jul 3, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

hivaze commented Jul 3, 2022

This crash came like a knife in the back: if you pass an empty string to the lemmatization method or to the known-word check (`is_known`), you get the error below. If this is meant to be an exception, please at least add an assert with a corresponding message.

~/miniconda3/envs/text/lib/python3.8/site-packages/simplemma/simplemma.py in is_known(token, lang)
    362     lang = _update_lang_data(lang)
    363     for language in LANG_DATA:
--> 364         if _simple_search(token, language.dict) is not None:
    365             return True
    366     return False

~/miniconda3/envs/text/lib/python3.8/site-packages/simplemma/simplemma.py in _simple_search(token, datadict, initial)
    206     if candidate is None:
    207         # try upper or lowercase
--> 208         if token[0].isupper():
    209             candidate = datadict.get(token.lower())
    210         else:

IndexError: string index out of range
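
For readers following the traceback, here is a minimal sketch of the failure mechanism, independent of simplemma itself. The two helper names are hypothetical; only the `token[0].isupper()` pattern is taken from line 208 above. Indexing position 0 of an empty string raises, while slicing with `[:1]` returns an empty string instead.

```python
def first_char_is_upper_unsafe(token: str) -> bool:
    # Hypothetical helper mirroring the indexing pattern in the traceback.
    # token[0] raises IndexError when token is the empty string.
    return token[0].isupper()


def first_char_is_upper_safe(token: str) -> bool:
    # Slicing never raises: ""[:1] is "", and "".isupper() is False.
    return token[:1].isupper()


try:
    first_char_is_upper_unsafe("")
except IndexError as err:
    message = str(err)
    print(f"IndexError: {message}")

print(first_char_is_upper_safe(""))
print(first_char_is_upper_safe("Berlin"))
```

Whether an empty token should silently count as "not uppercase" or be rejected outright is a design choice; the slice only avoids the crash.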
adbar added the enhancement label Jul 4, 2022
adbar added a commit that referenced this issue Jul 4, 2022
adbar (Owner) commented Jul 4, 2022

Hi @hivaze, I would have expected users to control the input type, but it's not difficult to add a corresponding function. Thanks for your feedback!
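
One possible shape for such an input check, as a rough sketch only: the commit referenced in this thread is not shown here, so this is not simplemma's actual fix. `validate_token` and `is_known_sketch` are hypothetical names, and the set-based lookup is a stand-in for the library's dictionaries.

```python
def validate_token(token: object) -> str:
    """Hypothetical helper: accept only non-empty strings."""
    if not isinstance(token, str):
        raise TypeError(f"Expected a string token, got {type(token).__name__}")
    if not token:
        raise ValueError("Expected a non-empty token, got an empty string")
    return token


def is_known_sketch(token: object, known_words: set) -> bool:
    """Stand-in for a dictionary lookup; not simplemma's real implementation."""
    checked = validate_token(token)
    # Try the token as given, then its lowercase form.
    return checked in known_words or checked.lower() in known_words


print(is_known_sketch("Dogs", {"dogs"}))

try:
    is_known_sketch("", {"dogs"})
except ValueError as err:
    print(f"ValueError: {err}")
```

Raising a named exception with a message, rather than using a bare `assert`, keeps the check active even when Python runs with assertions disabled (`-O`).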

adbar closed this as completed Jul 4, 2022