Stanford CoreNLP Python Bindings

This package contains Python bindings for Stanford CoreNLP's protobuf specifications, as generated by protoc. These bindings can be used to parse binary data produced by, for example, the Stanford CoreNLP server.


Usage:

from corenlp_protobuf import Document, parseFromDelimitedString

# document.dat contains a serialized Document.
with open('document.dat', 'rb') as f:  # protobuf data is binary, so read bytes
  buf = f.read()
doc = Document()
parseFromDelimitedString(doc, buf)

# You can access the sentences from doc.sentence.
sentence = doc.sentence[0]

# You can access any property within a sentence.
print(sentence.text)

# Likewise for tokens
token = sentence.token[0]
print(token.lemma)

See test_read.py for more examples.
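
For readers curious what "delimited" means in parseFromDelimitedString, the sketch below illustrates the framing in plain Python. It assumes the data uses protobuf's standard length-delimited layout, where each message is preceded by its byte length encoded as a base-128 varint. The helper names read_varint, split_delimited, and write_delimited are hypothetical and are not part of this package; they only show the idea and do not replace the library function.

```python
def read_varint(buf, offset=0):
    """Decode a base-128 varint from buf; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = buf[offset]          # indexing bytes yields an int in Python 3
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:         # high bit clear: this was the last byte
            return result, offset
        shift += 7


def split_delimited(buf, offset=0):
    """Return (payload, next_offset) for one length-prefixed message."""
    size, start = read_varint(buf, offset)
    end = start + size
    if end > len(buf):
        raise ValueError("truncated message")
    return buf[start:end], end


def write_delimited(payload):
    """Prefix payload with its length as a varint (illustration only)."""
    size = len(payload)
    prefix = bytearray()
    while True:
        byte = size & 0x7F
        size >>= 7
        if size:
            prefix.append(byte | 0x80)  # more bytes follow
        else:
            prefix.append(byte)
            break
    return bytes(prefix) + payload


# Two framed payloads back to back, as they might appear in one file.
framed = write_delimited(b"a" * 300) + write_delimited(b"xyz")
first, pos = split_delimited(framed)
second, pos = split_delimited(framed, pos)
```

Under this assumption, a file holding several serialized messages can be walked by feeding the returned offset back in as the next starting position. In real use, the payload bytes would be passed to a message's ParseFromString method rather than handled by hand.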