The Apache OpenNLP library provides binary models for the processing of natural language text. This repository is intended for the distribution of model files as Maven artifacts.
For additional information, visit the OpenNLP Home Page
You can use OpenNLP with any language; further demo models are provided here.
The models are fully compatible with the latest release and can be used for testing or getting started.
Please train your own models for all other use cases.
Documentation, including JavaDocs, code usage and command-line interface examples, is available here.
You can also follow our mailing lists for news and updates.
We provide Tokenizer, Sentence Detector and Part-of-Speech Tagger models for the following languages:
- Bulgarian
- Croatian
- Czech
- Danish
- Dutch
- English
- Estonian
- Finnish
- French
- German
- Italian
- Latvian
- Norwegian
- Polish
- Portuguese
- Romanian
- Russian
- Serbian
- Slovak
- Slovenian
- Spanish
- Swedish
- Ukrainian
These models are compatible with OpenNLP >= 1.0.0. Model details are available here.
In addition, we provide a Language Detector, which is able to detect 103 languages identified by their ISO 639-3 codes. It works well with longer texts that contain at least two sentences in the same language.
It is compatible with OpenNLP >= 1.8.3. Model details are available here.
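Because the detector is described as working best on input with at least two sentences, it can help to check the input before running detection. The following is a minimal sketch of such a check, not part of OpenNLP: the class name `LangDetectInputCheck`, its methods and the `MIN_SENTENCES` threshold are hypothetical, and sentences are counted with the JDK's rule-based `BreakIterator`, which is only a heuristic.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical helper, not part of OpenNLP.
class LangDetectInputCheck {

    // Assumed threshold, taken from the "at least two sentences" guidance.
    static final int MIN_SENTENCES = 2;

    /** Counts sentences using the JDK's rule-based sentence BreakIterator. */
    static int countSentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ROOT);
        it.setText(text);
        int count = 0;
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            // Skip segments that contain only whitespace.
            if (!text.substring(start, end).trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    /** Returns true if the text looks long enough for language detection. */
    static boolean hasEnoughSentences(String text) {
        return text != null && countSentences(text) >= MIN_SENTENCES;
    }

    public static void main(String[] args) {
        String sample = "OpenNLP ships a language detector. It prefers longer input.";
        System.out.println(hasEnoughSentences(sample));
    }
}
```

Texts that fail the check can still be passed to the detector; the result is simply more likely to be unreliable.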
You can import a model artifact directly via Maven, SBT or Gradle, for instance:
Maven:
<dependency>
    <groupId>org.apache.opennlp</groupId>
    <artifactId>opennlp-models-langdetect</artifactId>
    <version>${opennlp.models.version}</version>
</dependency>
SBT:
libraryDependencies += "org.apache.opennlp" % "opennlp-models-langdetect" % "${opennlp.version}"
Gradle:
implementation group: "org.apache.opennlp", name: "opennlp-models-langdetect", version: "${opennlp.version}"
For more details, please check our documentation.
When adding a new model, make sure to list it in the expected-models.txt file located in opennlp-models-test.
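Before submitting a new model, a quick local check that it is listed can save a review round trip. The sketch below is not the project's own test. It assumes expected-models.txt holds one entry per line, which this README does not confirm, and the class name `ExpectedModelsCheck` is hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper; the one-entry-per-line file format is an assumption.
class ExpectedModelsCheck {

    /** Returns true if modelName appears on its own line, ignoring surrounding whitespace. */
    static boolean isListed(List<String> lines, String modelName) {
        return lines.stream()
                .map(String::trim)
                .anyMatch(line -> line.equals(modelName));
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to expected-models.txt, args[1]: the entry to look for.
        if (args.length != 2) {
            System.err.println("usage: ExpectedModelsCheck <expected-models.txt> <entry>");
            return;
        }
        List<String> lines = Files.readAllLines(Path.of(args[0]));
        System.out.println(isListed(lines, args[1]) ? "listed" : "missing");
    }
}
```

Pass the path to the file and the entry you expect to find as the two arguments.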
The Apache OpenNLP project is developed by volunteers and is always looking for new contributors to work on all parts of the project. Every contribution is welcome and needed to make it better. A contribution can be anything from a small documentation typo fix to a new component.
If you would like to get involved, please follow the instructions here.