Fix tokenising when using using more than just a-zA-Z #37
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Previously:
Händler
would be tokenized asndler
orändler
depending on python versionRather than the expected
händler
Solution: use
regexp
rather thanre
.This gives us the ability to use unicode character clasess such as
[[:upper:]]
and[[:lower:]]
Fixes #35
I'm usually a ruby developer not a python developer I don't know how to get the regex library working on 2.7 or how to compare the test strings in a unicode-aware way (they're different on my mac vs on travis, if one passes the other fails)
But it mostly works