Releases: KorAP/KorAP-Tokenizer

KorAP-Tokenizer-2.2.5

08 Sep 14:30

KorAP-Tokenizer 2.2.3

07 Sep 16:42
  • Updated dependencies
  • Minimum Java version raised to 17
  • Fixed group id in pom.xml
  • Removed compile dependency on Maven Surefire
  • Build artifacts in src/main/jflex are now ignored by git
  • java.io's ByteArrayOutputStream used instead of 3rd-party class
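
As a rough illustration of the last item, the standard-library class can buffer output in memory without any extra dependency. This is a minimal sketch, not the tokenizer's actual code: the class name BufferSketch and the helper capture are hypothetical and exist only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

public class BufferSketch {
    // Hypothetical helper (not part of KorAP-Tokenizer): writes each token
    // on its own line into an in-memory buffer and returns the text.
    static String capture(String... tokens) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // try-with-resources flushes and closes the stream before we read the buffer
        try (PrintStream out = new PrintStream(buffer, true, StandardCharsets.UTF_8)) {
            for (String token : tokens) {
                out.print(token);
                out.print('\n');
            }
        }
        return buffer.toString(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.print(capture("Hello", "world"));
    }
}
```

Both the PrintStream constructor taking a Charset and ByteArrayOutputStream.toString(Charset) require Java 10 or later, which the raised minimum of Java 17 covers.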

KorAP-Tokenizer v2.2.2

17 Jan 08:38

2.2.2

  • Bug fix: a single quotation mark at the beginning of a word
    is no longer interpreted as the beginning of an omission, but as a quotation mark token.
  • Dependencies updated

2.2.1 (unreleased)

  • "du." is no longer treated as an abbreviation.

KorAP-Tokenizer v2.2.0

29 Jul 07:43

Updates

  • Apostrophe- and hyphen-marked contractions and clitics in English (I've, isn't, Peter's, …) and French (j'ai, d'un, l'art, sont-elles, …) are now separated again.
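
To make the kind of separation concrete, here is a small regex-based sketch. It is not the tokenizer's implementation (which is generated from JFlex rules): the class CliticSketch, the clitic lists, and the exact split points (for example "is" + "n't", "j'" + "ai", "sont" + "-elles") are assumptions chosen for illustration, and only the straight apostrophe is handled.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CliticSketch {
    // Assumed, non-exhaustive English clitics at the end of a word
    private static final Pattern EN_CLITIC = Pattern.compile(
        "(.+?)(n't|'(?:s|ve|re|ll|d|m))", Pattern.CASE_INSENSITIVE);
    // Assumed French elided forms at the start of a word (j', d', l', qu', ...)
    private static final Pattern FR_ELISION = Pattern.compile(
        "([cdjlmnst]'|qu')(.+)", Pattern.CASE_INSENSITIVE);
    // Assumed French hyphenated subject pronouns at the end of a word
    private static final Pattern FR_INVERSION = Pattern.compile(
        "(.+?)(-(?:je|tu|il|elle|on|nous|vous|ils|elles))", Pattern.CASE_INSENSITIVE);

    // Splits one word into at most two parts; returns the word unchanged
    // if no pattern for the given language ("en" or "fr") matches.
    static List<String> split(String word, String lang) {
        if (lang.equals("en")) {
            Matcher m = EN_CLITIC.matcher(word);
            if (m.matches()) return List.of(m.group(1), m.group(2));
        } else if (lang.equals("fr")) {
            Matcher m = FR_ELISION.matcher(word);
            if (m.matches()) return List.of(m.group(1), m.group(2));
            m = FR_INVERSION.matcher(word);
            if (m.matches()) return List.of(m.group(1), m.group(2));
        }
        return List.of(word);
    }

    public static void main(String[] args) {
        for (String w : new String[] {"I've", "isn't", "Peter's"}) {
            System.out.println(w + " -> " + split(w, "en"));
        }
        for (String w : new String[] {"j'ai", "d'un", "l'art", "sont-elles"}) {
            System.out.println(w + " -> " + split(w, "fr"));
        }
    }
}
```

A real tokenizer has to decide these cases in context (for instance, an apostrophe may also be a quotation mark), which a word-level regex like this cannot do.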

KorAP-Tokenizer v2.1.0

29 Jun 09:53

Changes in v2.1.0

  • GitHub CI test workflow added
  • Dependencies updated
  • -Xss2m added to the Maven JVM config

Potentially breaking change

  • --sentence-boundaries|-s now prints sentence boundaries only if --positions|-p is also present