Releases: KorAP/KorAP-Tokenizer
KorAP-Tokenizer-2.2.5
- released on Maven Central
- more OSSRH sync data added to the Maven POM
- minor code cleanups
- some API documentation added
KorAP-Tokenizer 2.2.3
- Updated dependencies
- Minimum Java version raised to 17
- Fixed group id in pom.xml
- Removed compile dependency on Maven Surefire
- Build artifacts in src/main/jflex are now ignored by git
- java.io's ByteArrayOutputStream used instead of 3rd-party class
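
For readers unfamiliar with the standard class mentioned in the last item, the following is a minimal sketch of buffering output in memory with `java.io.ByteArrayOutputStream`. The release note does not name the third-party class that was replaced or the place where it was used, so the class name `ByteBufferSketch` and the method `capture` are hypothetical and are not taken from the tokenizer's code base.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not code from KorAP-Tokenizer.
public class ByteBufferSketch {

    // Writes each token on its own line into an in-memory buffer
    // and returns the buffered bytes decoded as UTF-8.
    public static String capture(String... tokens) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(buffer, true, StandardCharsets.UTF_8)) {
            for (String token : tokens) {
                out.print(token);
                out.print('\n'); // fixed separator, independent of the platform
            }
        }
        return buffer.toString(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.print(capture("Peter", "'s"));
    }
}
```

Both the `PrintStream` constructor taking a `Charset` and `ByteArrayOutputStream.toString(Charset)` require Java 10 or later, which fits the minimum Java version of 17 stated above.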
KorAP-Tokenizer v2.2.2
2.2.2
- Bug fix: a single quotation mark at the beginning of a word is no longer interpreted as the beginning of an omission, but as a quotation mark token.
- Dependencies updated
2.2.1 (unreleased)
- "du." is no longer treated as an abbreviation.
KorAP-Tokenizer v2.2.0
Updates
- Apostrophe- and hyphen-marked contractions and clitics in English (I've, isn't, Peter's, …) and French (j'ai, d'un, l'art, sont-elles, …) are now separated again.
KorAP-Tokenizer v2.1.0
Changes in v2.1.0
- GitHub CI test workflow added
- Dependencies updated
- JVM option -Xss2m added to the Maven JVM config
Potentially breaking change
- --sentence-boundaries|-s now prints sentence boundaries only if --positions|-p is also present