v1.5.0: Alpha support for Swedish and Hungarian

ines released this 27 Dec 21:20

✨ Major features and improvements

NEW: Alpha support for Swedish tokenization.
NEW: Alpha support for Hungarian tokenization.
Update language data for Spanish tokenization.
Speed up tokenization when no data is preloaded by caching the first 10,000 vocabulary items seen.
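
The caching idea in the last item can be pictured roughly as follows. This is a minimal sketch and not spaCy's actual implementation: `BoundedCache`, `compute` and `max_size` are hypothetical names, and it assumes that results are stored only until the limit is reached, after which unseen strings are recomputed on every call.

```python
class BoundedCache:
    """Memoize results for the first `max_size` distinct keys seen.

    Hypothetical illustration only; spaCy's internal cache may differ.
    """

    def __init__(self, compute, max_size=10000):
        self._compute = compute    # function that does the expensive work
        self._max_size = max_size  # stop adding entries once this many are stored
        self._cache = {}

    def __call__(self, key):
        # Fast path: the key was among the first `max_size` distinct keys.
        if key in self._cache:
            return self._cache[key]
        value = self._compute(key)
        # Only store the result while there is still room in the cache.
        if len(self._cache) < self._max_size:
            self._cache[key] = value
        return value

    def __len__(self):
        return len(self._cache)
```

Wrapping an expensive per-string analysis function, for example `cached = BoundedCache(analyze)`, keeps memory use bounded while still avoiding repeated work for the strings encountered earliest.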

🔴 Bug fixes

List the language_data package in setup.py.
Fix missing vec_path declaration that caused a failure when add_vectors was set.
Allow Vocab to load without serializer_freqs.

📖 Documentation and examples

NEW: spaCy Jupyter notebooks repo: ongoing collection of easy-to-run spaCy examples and tutorials.
Fix issue #657: Generalise dependency parsing annotation specs beyond English.
Fix various typos and inconsistencies.

👥 Contributors

Thanks to @oroszgy, @magnusburton, @jmizgajski, @aikramer2, @fnorf and @bhargavvader for the pull requests!
