go-corenlp is a Golang wrapper for Stanford CoreNLP.
Download and install it:
go get github.com/nongdenchet/go-corenlp
Make sure that you can run Stanford CoreNLP from the command line:
java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -h
A simple example of using go-corenlp:
package main

import (
	"fmt"
	"os"

	"github.com/nongdenchet/go-corenlp" // exposes "corenlp"
	"github.com/nongdenchet/go-corenlp/connector"
)

func main() {
	// sample text from https://stanfordnlp.github.io/CoreNLP/
	text := `President Xi Jinping of Chaina, on his first state visit to the United States, showed off his familiarity with American history and pop culture on Tuesday night.`

	// The LocalExec connector runs the Stanford CoreNLP process.
	c := connector.NewLocalExec(nil)
	c.JavaArgs = []string{"-Xmx4g"}     // set Java params
	c.ClassPath = os.Getenv("CORE_NLP") // set Java class path
	c.Annotators = []string{"tokenize", "ssplit", "pos", "lemma", "ner"}

	// Annotate text
	doc, err := corenlp.Annotate(c, text)
	if err != nil {
		panic(err)
	}

	// Output words and pos
	fmt.Println("----- Tokens -----")
	for _, sentence := range doc.Sentences {
		for _, token := range sentence.Tokens {
			fmt.Printf("%s(%s)%s\n", token.Word, token.Pos, token.After)
		}
	}

	// Output entity mentions
	fmt.Println("\n----- Entity Mentions -----")
	for _, sentence := range doc.Sentences {
		for _, mention := range sentence.EntityMentions {
			fmt.Printf("%s - %s\n", mention.Text, mention.Ner)
		}
	}
}
Output:
----- Tokens -----
President(NNP)
Xi(NN)
Jinping(NN)
of(IN)
Chaina(NNP)
,(,)
on(IN)
his(PRP$)
first(JJ)
state(NN)
visit(NN)
to(TO)
the(DT)
United(NNP)
States(NNPS)
,(,)
showed(VBD)
off(IN)
his(PRP$)
familiarity(NN)
with(IN)
American(JJ)
history(NN)
and(CC)
pop(NN)
culture(NN)
on(IN)
Tuesday(NNP)
night(NN)
.(.)
----- Entity Mentions -----
President - TITLE
Xi Jinping - PERSON
Chaina - LOCATION
first - ORDINAL
United States - COUNTRY
American - NATIONALITY
Tuesday - DATE
night - TIME
his - PERSON
his - PERSON
// Annotate text
doc, err := corenlp.Annotate(connector.NewLocalExec(nil), text)
if err != nil {
	panic(err)
}

// First sentence
sentence := doc.Sentences[0]

// RawParse contains the text-based result of the Parser annotator
fmt.Println(sentence.RawParse) // => (ROOT (S (NP (NP (NNP President)...

// Parse() returns the Parser annotator's result as a Go struct
parse, err := sentence.Parse()
if err != nil {
	panic(err)
}
fmt.Printf("%v\n", parse.Pos) // => ROOT

// Tokenizer, PosTagger
for _, token := range sentence.Tokens {
	fmt.Printf("%s(%s)%s", token.Word, token.Pos, token.After)
}

// Dependencies
for _, dep := range sentence.Dependencies {
	fmt.Printf("%s => (%s) => %s\n", dep.GovernorGloss, dep.Dep, dep.DependentGloss)
}
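To make the shape of a RawParse string more concrete, here is a minimal sketch that reads a bracketed parse such as `(ROOT (S (NP ...)))` into a tree using only the standard library. The `node` type and the `parseTree` and `leaves` helpers are hypothetical names made up for this illustration; they are not go-corenlp's types, and `sentence.Parse()` may represent the tree differently. The input string is a shortened, hand-written example.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical tree type for this sketch, not go-corenlp's parse type.
type node struct {
	Label    string // phrase or POS label, e.g. "NP" or "NNP"
	Word     string // set only on leaf nodes
	Children []*node
}

// tokenize splits a bracketed parse string into "(", ")" and symbol tokens.
func tokenize(s string) []string {
	s = strings.ReplaceAll(s, "(", " ( ")
	s = strings.ReplaceAll(s, ")", " ) ")
	return strings.Fields(s)
}

// parseNode consumes one "(LABEL ...)" group starting at toks[pos] and
// returns the node together with the index of the next unread token.
func parseNode(toks []string, pos int) (*node, int, error) {
	if pos >= len(toks) || toks[pos] != "(" {
		return nil, pos, fmt.Errorf("expected '(' at token %d", pos)
	}
	pos++
	if pos >= len(toks) || toks[pos] == "(" || toks[pos] == ")" {
		return nil, pos, fmt.Errorf("expected label at token %d", pos)
	}
	n := &node{Label: toks[pos]}
	pos++
	for pos < len(toks) && toks[pos] != ")" {
		if toks[pos] == "(" {
			child, next, err := parseNode(toks, pos)
			if err != nil {
				return nil, next, err
			}
			n.Children = append(n.Children, child)
			pos = next
		} else {
			n.Word = toks[pos] // a bare symbol inside a group is the leaf word
			pos++
		}
	}
	if pos >= len(toks) {
		return nil, pos, fmt.Errorf("missing ')'")
	}
	return n, pos + 1, nil
}

// parseTree parses a complete bracketed string into a tree.
func parseTree(s string) (*node, error) {
	toks := tokenize(s)
	n, next, err := parseNode(toks, 0)
	if err != nil {
		return nil, err
	}
	if next != len(toks) {
		return nil, fmt.Errorf("trailing tokens after tree")
	}
	return n, nil
}

// leaves collects the words at the leaves, left to right.
func leaves(n *node) []string {
	if len(n.Children) == 0 {
		if n.Word == "" {
			return nil
		}
		return []string{n.Word}
	}
	var out []string
	for _, c := range n.Children {
		out = append(out, leaves(c)...)
	}
	return out
}

func main() {
	tree, err := parseTree("(ROOT (S (NP (NNP President) (NNP Xi)) (VP (VBD spoke))))")
	if err != nil {
		panic(err)
	}
	fmt.Println(tree.Label, leaves(tree))
}
```

The naive tokenizer relies on Penn Treebank-style output escaping literal parentheses in the text (for example as -LRB- and -RRB-), so every "(" and ")" in the string is structural.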
go-corenlp supports a timeout by using context.Context.
ctx, cancel := context.WithTimeout(context.Background(), 120*time.Second)
defer cancel()
c := connector.NewLocalExec(ctx)
doc, err := corenlp.Annotate(c, text)
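As a rough illustration of what a deadline does, the sketch below uses only the standard library. `slowAnnotate` is a hypothetical stand-in for a long-running annotation call and is not part of go-corenlp; it only shows how an expired context surfaces as `context.DeadlineExceeded`, which callers can detect with `errors.Is`. How go-corenlp itself wraps or reports a timeout is an assumption not covered here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowAnnotate is a hypothetical stand-in for a long-running annotation call.
// It returns ctx.Err() as soon as the context is cancelled or its deadline passes.
func slowAnnotate(ctx context.Context, work time.Duration) (string, error) {
	select {
	case <-time.After(work):
		return "done", nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// timedOut reports whether a call needing workMs milliseconds
// exceeds a deadline of limitMs milliseconds.
func timedOut(limitMs, workMs int) bool {
	limit := time.Duration(limitMs) * time.Millisecond
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()

	_, err := slowAnnotate(ctx, time.Duration(workMs)*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("short deadline timed out:", timedOut(20, 2000))
	fmt.Println("long deadline timed out:", timedOut(2000, 10))
}
```

Calling `cancel` through `defer` releases the timer even when the work finishes before the deadline.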
To connect to a CoreNLP server, use the HTTPClient connector.
ctx, cancel := context.WithTimeout(context.Background(), 120*time.Second)
defer cancel()
c := connector.NewHTTPClient(ctx, "http://127.0.0.1:9000/")
c.Username = "username"
c.Password = "password"
doc, err := corenlp.Annotate(c, text)
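For orientation, the CoreNLP server's documented HTTP interface takes the text as the POST body and the pipeline settings as a JSON `properties` query parameter. The sketch below builds such a request with the standard library only and does not send it. `newAnnotateRequest` is a hypothetical helper written for this illustration; whether go-corenlp's HTTPClient builds its requests in exactly this way is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// newAnnotateRequest builds (but does not send) a POST request for a
// CoreNLP server. It is a hypothetical helper, not part of go-corenlp.
func newAnnotateRequest(base string, annotators []string, text, user, pass string) (*http.Request, error) {
	// Pipeline settings travel as a JSON object in the "properties" parameter.
	props, err := json.Marshal(map[string]string{
		"annotators":   strings.Join(annotators, ","),
		"outputFormat": "json",
	})
	if err != nil {
		return nil, err
	}

	u, err := url.Parse(base)
	if err != nil {
		return nil, err
	}
	q := u.Query()
	q.Set("properties", string(props))
	u.RawQuery = q.Encode()

	// The text to annotate is the request body.
	req, err := http.NewRequest(http.MethodPost, u.String(), strings.NewReader(text))
	if err != nil {
		return nil, err
	}
	if user != "" {
		req.SetBasicAuth(user, pass) // only when the server requires credentials
	}
	return req, nil
}

func main() {
	req, err := newAnnotateRequest(
		"http://127.0.0.1:9000/",
		[]string{"tokenize", "ssplit"},
		"Hello world.",
		"username", "password",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```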
The ParseOutput function parses an output file generated by Stanford CoreNLP.
For example, if you run the following command:
java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit -file input.txt --outputFormat json
the output file input.txt.json is generated, and you can parse it as follows:
rawjson, err := os.ReadFile("input.txt.json")
if err != nil {
	panic(err)
}
doc, err := corenlp.ParseOutput(rawjson)
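To give a feel for the file being parsed, the sketch below decodes a small JSON document with `encoding/json`. The `document`, `sentence`, and `token` types are minimal hand-written stand-ins covering only a few fields; they are not go-corenlp's types. The sample is hand-written to resemble `tokenize,ssplit` output and is not a captured CoreNLP result.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal illustrative types covering a few fields of CoreNLP's JSON output.
// They are assumptions for this sketch, not go-corenlp's own types.
type token struct {
	Index int    `json:"index"`
	Word  string `json:"word"`
	After string `json:"after"`
}

type sentence struct {
	Index  int     `json:"index"`
	Tokens []token `json:"tokens"`
}

type document struct {
	Sentences []sentence `json:"sentences"`
}

// sampleJSON is a hand-written sample shaped like tokenize,ssplit output.
const sampleJSON = `{
  "sentences": [
    {
      "index": 0,
      "tokens": [
        {"index": 1, "word": "Hello", "originalText": "Hello", "before": "", "after": " "},
        {"index": 2, "word": "world", "originalText": "world", "before": " ", "after": ""},
        {"index": 3, "word": ".", "originalText": ".", "before": "", "after": ""}
      ]
    }
  ]
}`

// wordsOf decodes raw JSON and returns every token word in document order.
// Fields that the types above do not declare are ignored by encoding/json.
func wordsOf(raw string) ([]string, error) {
	var doc document
	if err := json.Unmarshal([]byte(raw), &doc); err != nil {
		return nil, err
	}
	var out []string
	for _, s := range doc.Sentences {
		for _, t := range s.Tokens {
			out = append(out, t.Word)
		}
	}
	return out, nil
}

func main() {
	words, err := wordsOf(sampleJSON)
	if err != nil {
		panic(err)
	}
	fmt.Println(words)
}
```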
License: MIT