apache/opennlp-models

Welcome to Apache OpenNLP Models!

The Apache OpenNLP library provides binary models for processing natural language text. This repository is intended for the distribution of model files as Maven artifacts.

Useful Links

For additional information, visit the OpenNLP Home Page

You can use OpenNLP with any language; further demo models are provided here.

The models are fully compatible with the latest release and can be used for testing or getting started.

Please train your own models for all other use cases.

Documentation, including JavaDocs, code usage, and command-line interface examples, is available here.

You can also follow our mailing lists for news and updates.

Overview

We provide Tokenizer, Sentence Detector and Part-of-Speech Tagger models for the following languages:

  • Bulgarian
  • Croatian
  • Czech
  • Danish
  • Dutch
  • English
  • Estonian
  • Finnish
  • French
  • German
  • Italian
  • Latvian
  • Norwegian
  • Polish
  • Portuguese
  • Romanian
  • Russian
  • Serbian
  • Slovak
  • Slovenian
  • Spanish
  • Swedish
  • Ukrainian

These models are compatible with OpenNLP >= 1.0.0. Model details are available here.

In addition, we provide a Language Detector, which can detect 103 languages identified by their ISO 639-3 codes. It works best on longer texts that contain at least two sentences in the same language.

It is compatible with OpenNLP >= 1.8.3. Model details are available here.

Getting Started

You can import a model artifact directly via Maven, SBT or Gradle, for instance:

Maven

<dependency>
    <groupId>org.apache.opennlp</groupId>
    <artifactId>opennlp-models-langdetect</artifactId>
    <version>${opennlp.models.version}</version>
</dependency>

SBT

libraryDependencies += "org.apache.opennlp" % "opennlp-models-langdetect" % "${opennlp.models.version}"

Gradle

implementation group: "org.apache.opennlp", name: "opennlp-models-langdetect", version: "${opennlp.models.version}"

For more details, please check our documentation.
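
Once a model artifact is on the classpath, the model file inside the jar can be opened like any other classpath resource. The following is a minimal sketch that uses only the JDK. The class name `ModelResources` is hypothetical, and the resource name you pass in is an assumption: inspect the model jar to find the actual file name.

```java
import java.io.FileNotFoundException;
import java.io.InputStream;

public final class ModelResources {

    private ModelResources() {
    }

    /**
     * Opens a model file bundled in a jar on the classpath.
     * The caller is responsible for closing the returned stream.
     *
     * @param resourceName name of the model file inside the jar (assumption:
     *                     look it up in the jar, it is not defined here)
     */
    public static InputStream open(String resourceName) throws FileNotFoundException {
        // Prefer the context class loader, fall back to the system class loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        InputStream in = loader.getResourceAsStream(resourceName);
        if (in == null) {
            throw new FileNotFoundException("Model not found on classpath: " + resourceName);
        }
        return in;
    }
}
```

With the OpenNLP library itself on the classpath as well, the returned stream could be passed to the `LanguageDetectorModel(InputStream)` constructor inside a try-with-resources block.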

Adding a new Model

When you add a new model, make sure to list it in the expected-models.txt file located in opennlp-models-test.
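
A quick local check before opening a pull request could look like the sketch below. It assumes that expected-models.txt lists one model file name per line, which is an assumption about the file format and not something this README states. The class name `ExpectedModels` is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public final class ExpectedModels {

    private ExpectedModels() {
    }

    /** Returns true if modelName matches one of the trimmed, non-empty lines. */
    public static boolean isListed(List<String> lines, String modelName) {
        return lines.stream()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .anyMatch(line -> line.equals(modelName));
    }

    /** Reads the list file (assumed: one entry per line, UTF-8) and checks it. */
    public static boolean isListed(Path expectedModelsFile, String modelName) throws IOException {
        return isListed(Files.readAllLines(expectedModelsFile), modelName);
    }
}
```

Call `ExpectedModels.isListed(path, name)` with the path to your local copy of expected-models.txt and the file name of the model you are adding.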

Contributing

The Apache OpenNLP project is developed by volunteers and is always looking for new contributors to work on all parts of the project. Every contribution is welcome and needed to make it better. A contribution can be anything from a small documentation typo fix to a new component.

If you would like to get involved, please follow the instructions here.