This repository has been archived by the owner on Feb 10, 2021. It is now read-only.
applicaai/nlp-ws-doc

NLP WS (version 1.0.0)

Natural language processing webservice for the Polish language

  1. Overview
  2. Webservice
  3. Simple query
  4. Extended query
  5. Contact

Overview

The webservice enriches texts with linguistic information using natural language processing. For this purpose we built a pipeline with the following steps:

  1. Tokenization - splits the text into sentences and words
  2. Morphological analysis - lists all possible grammatical interpretations of a given word
  3. POS tagging - disambiguates the grammatical category of a given word

Our tool uses:

  • grammatical categories developed in the NKJP project. All grammatical categories can be found in the following book
  • a lexicon for morphological analysis, developed at Applica and based on PoliMorf

Webservice

Simple query

  • Url: https://nlp.applica.pl/ams-ws-nlp/rest/nlp/simple
  • Input [application/json]: {"message":{"body":"Tekst do przetworzenia."},"token":"applica_token"}
  • Output [plain/text]: tekst do przetworzyć .
  • Curl: curl -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '{"message":{"body":"Tekst do przetworzenia."},"token":"applica_token"}' "https://nlp.applica.pl/ams-ws-nlp/rest/nlp/simple"
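
The curl call above can also be issued from a script. The following is a minimal Python sketch using only the standard library; the helper names `build_request` and `send` are illustrative and not part of the service, and it assumes the response body is UTF-8 text. `applica_token` is the placeholder from the example and must be replaced with a real token.

```python
import json
import urllib.request

# Endpoint taken from the "Url" entry above.
SIMPLE_URL = "https://nlp.applica.pl/ams-ws-nlp/rest/nlp/simple"


def build_request(text, token, url=SIMPLE_URL):
    """Build (but do not send) a POST request equivalent to the curl example."""
    payload = {"message": {"body": text}, "token": token}
    return urllib.request.Request(
        url,
        # Encode explicitly as UTF-8 so Polish diacritics survive the trip.
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and return the response body as text (UTF-8 assumed)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `send(build_request("Tekst do przetworzenia.", "applica_token"))` would perform the actual request. Because `url` is a parameter, the same helper can target the extended endpoint as well.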

Extended query

  • Url: https://nlp.applica.pl/ams-ws-nlp/rest/nlp/extended
  • Input [application/json]: {"message":{"body":"Tekst do przetworzenia."},"token":"applica_token"}
  • Output [plain/text]: {"sentIdx":[1,1,1,1],"base":["tekst","do","przetworzyć","."],"cTag":["subst","prep","ger","interp"],"nps":[true,false,false,true],"orth":["Tekst","do","przetworzenia","."]}
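  • Output details: the JSON is a dictionary of parallel lists, with one entry per word in each list:
      • orth - list of words in their original form
      • base - list of base forms (lemmas), one for each word
      • cTag - list of grammatical categories, one for each word
      • nps - list of flags, one for each word, that record the spacing before the word in the original text; in the example above the flag is true for words with no preceding space (the first word and the final period)
      • sentIdx - list of sentence indexes, one for each word (numbered from 1 in the example above)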
  • Curl: curl -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '{"message":{"body":"Tekst do przetworzenia."},"token":"applica_token"}' "https://nlp.applica.pl/ams-ws-nlp/rest/nlp/extended"
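
Since the extended output is a set of parallel lists, it is often easier to work with once regrouped into one record per word. The sketch below does this in Python for the sample response shown above; `Token`, `parse_extended` and `detokenize` are illustrative names, not part of the service. It assumes that `nps` set to true means "no space before this word", which is inferred from the sample rather than stated by the service.

```python
import json
from typing import NamedTuple

# Sample response copied from the "Output" entry above.
SAMPLE = (
    '{"sentIdx":[1,1,1,1],"base":["tekst","do","przetworzyć","."],'
    '"cTag":["subst","prep","ger","interp"],"nps":[true,false,false,true],'
    '"orth":["Tekst","do","przetworzenia","."]}'
)


class Token(NamedTuple):
    orth: str              # word as it appeared in the input
    base: str              # lemma
    ctag: str              # grammatical category
    no_space_before: bool  # "nps" flag (interpretation assumed from the sample)
    sentence: int          # sentence index


def parse_extended(raw):
    """Turn the parallel lists of an extended response into Token records."""
    data = json.loads(raw)
    columns = [data[key] for key in ("orth", "base", "cTag", "nps", "sentIdx")]
    if len({len(column) for column in columns}) != 1:
        raise ValueError("response lists differ in length")
    return [Token(*row) for row in zip(*columns)]


def detokenize(tokens):
    """Rebuild the original text from orth forms and the nps flags."""
    parts = []
    for index, token in enumerate(tokens):
        if index > 0 and not token.no_space_before:
            parts.append(" ")
        parts.append(token.orth)
    return "".join(parts)
```

For the sample, `detokenize(parse_extended(SAMPLE))` gives back `Tekst do przetworzenia.`, and joining the `base` values with spaces gives `tekst do przetworzyć .`, the same string the simple query returns for this input.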

Contact

Email Applica if you need to:

  • troubleshoot a problem
  • obtain an applica_token
