- Add support for a tokenizer for splitting words into tokens
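A word tokenizer of this kind can be sketched with a single regular expression; this is an illustrative stand-in, not necessarily the library's actual tokenizer:

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens.

    A minimal regex-based sketch of word tokenization; the library's
    real tokenizer may handle more cases (hyphens, unicode, etc.).
    """
    return re.findall(r"[a-zA-Z']+", text.lower())

print(tokenize("Spell-checking isn't hard!"))
# ['spell', 'checking', "isn't", 'hard']
```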
- Add full python 2.7 support for foreign dictionaries
- Ensure all checks against the word frequency are lower case
- Slightly better performance on edit distance of 2
- Minor package fix for non-wheel deployments
- Ignore case for language identifiers
- Changed `words` function to `split_words` to differentiate it from the `word_frequency.words` function
- Added Portuguese dictionary: `pt`
- Add encoding argument to `gzip.open` and `open` for dictionary loading and exporting
- Use of `__slots__` for class objects
- Remove words based on threshold
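Threshold-based removal amounts to pruning every word whose count falls below a cutoff. A sketch of that idea, with illustrative names and sample counts rather than the library's actual API:

```python
def remove_by_threshold(word_freq, threshold):
    """Keep only words seen at least `threshold` times.

    Hypothetical helper: the function name and signature are
    illustrative, not the library's real interface.
    """
    return {word: count for word, count in word_freq.items() if count >= threshold}

freqs = {"the": 23135851162, "teh": 3, "spelling": 215175, "speling": 1}
print(remove_by_threshold(freqs, 100))  # drops the rare misspellings 'teh' and 'speling'
```

Pruning rare words like this shrinks the dictionary and filters out misspellings that slipped into the source corpus.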
- Add ability to iterate over words (keys) in the dictionary
- Add setting to reduce the edit distance check; see PR #17 (thanks @mrjamesriley)
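Lowering the edit distance matters because the candidate set grows explosively with each edit step. Since the project started from Peter Norvig's spelling corrector, the classic candidate generation can be sketched as follows (a sketch of the technique, not the library's exact code):

```python
import string

def edits1(word):
    """All strings one edit away: deletes, transposes, replaces, inserts."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def edits2(word):
    """All strings two edits away: one more edit applied to each edits1 result."""
    return {e2 for e1 in edits1(word) for e2 in edits1(e1)}

# The distance-2 set is orders of magnitude larger than the distance-1 set,
# which is why capping the distance at 1 speeds up checking so much.
print(len(edits1("word")), len(edits2("word")))
```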
- Added Export functionality:
    - json
    - gzip
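Exporting a word-frequency map as JSON, optionally gzip-compressed and with an explicit encoding, can be sketched with the standard library alone; the function name and parameters here are illustrative, not necessarily the library's real signature:

```python
import gzip
import json

def export_word_frequency(word_freq, filepath, gzipped=True, encoding="utf-8"):
    """Write a word-frequency map to disk as JSON, gzip-compressed by default.

    Hypothetical helper sketching json/gzip export with an encoding argument.
    """
    data = json.dumps(word_freq, sort_keys=True)
    if gzipped:
        with gzip.open(filepath, "wt", encoding=encoding) as fobj:
            fobj.write(data)
    else:
        with open(filepath, "w", encoding=encoding) as fobj:
            fobj.write(data)

export_word_frequency({"hello": 10, "world": 7}, "freq.json.gz")
```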
- Updated logic for loading dictionaries to be either language or local_dictionary
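The either/or loading logic (a bundled language dictionary versus a user-supplied local file) can be sketched as below; the resource path layout and names are assumptions for illustration, and note the lower-casing of the language identifier mentioned above:

```python
import gzip
import json

def load_dictionary(language=None, local_dictionary=None):
    """Load a word-frequency map from a bundled language file or a local path.

    Sketch of either/or loading; the `resources/` layout is hypothetical.
    """
    if local_dictionary:
        path = local_dictionary
    elif language:
        # case-insensitive language identifiers
        path = "resources/{}.json.gz".format(language.lower())
    else:
        raise ValueError("provide either a language or a local_dictionary")
    with gzip.open(path, "rt", encoding="utf-8") as fobj:
        return json.load(fobj)
```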
- Ability to easily remove words
- Ability to add a single word
- Improved (i.e. cleaned up) English dictionary
- Better handle punctuation and numbers as the word to check
- Add support for language dictionaries
    - English, Spanish, French, and German
- Remove support for python 2; if it works, great!
- Move word frequency to its own class
- Add basic tests
- Readme documentation
- Initial release using code from Peter Norvig
- Initial release to pypi