- lightblue is a multi-lingual CCG parser with DTS representations
- Copyright owner: Daisuke Bekki and Bekki Laboratory
On Linux, install Haskell Stack with:
$ wget -qO- https://get.haskellstack.org/ | sh
On macOS (with Homebrew):
$ brew install haskell-stack
See https://docs.haskellstack.org/en/stable/README/#how-to-install for details.
One of the following Japanese morphological analyzers must be installed before executing lightblue.
The following English morphological analyzers must be installed before executing lightblue.
Do the following in the directory under which you'd like to install lightblue.
$ git clone --depth=1 https://github.com/DaisukeBekki/lightblue.git
This operation will create, under the current directory, a new directory lightblue. Henceforth we will refer to the full path to this directory as <lightblue>.
You need to set the environment variable LIGHTBLUE to <lightblue>. For example, add the line
export LIGHTBLUE=<lightblue>
to .bashrc, .bash_profile, or whichever configuration file your shell uses. Then move to <lightblue> and do the following:
$ cd <lightblue>
$ stack build
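The environment-variable step can be scripted. The following is a minimal sketch, assuming a POSIX-compatible shell; the function name add_lightblue_env and its two arguments are hypothetical and not part of lightblue.

```shell
# Hypothetical helper (not part of lightblue): append
#   export LIGHTBLUE=<dir>
# to a shell configuration file unless that exact line is already present.
add_lightblue_env() {
    rc_file=$1      # e.g. "$HOME/.bashrc"
    install_dir=$2  # full path to the cloned lightblue directory
    line="export LIGHTBLUE=$install_dir"
    touch "$rc_file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}
```

For example, `add_lightblue_env "$HOME/.bashrc" "$PWD"` run from inside <lightblue> adds the line once; running it again leaves the file unchanged. Open a new shell afterwards so that the variable takes effect.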
Set the permission of the shell script lightblue to executable:
$ chmod 755 lightblue
To parse a Japanese sentence and get a parsing result in a text format, execute:
$ echo 太郎がパンを食べた。 | ./lightblue jp parse -s text
To parse an English sentence and get a parsing result, execute:
$ echo John loves Mary. | ./lightblue en parse -s text
To see a parsing result in HTML format, execute (substitute your own browser for firefox):
$ echo 太郎がパンを食べた。 | ./lightblue jp parse -s html > result.html; firefox result.html
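If you would rather not hard-code a browser, the sketch below picks a generic opener by platform. It assumes xdg-open (from xdg-utils) is available on Linux and open on macOS; the function name opener_for is hypothetical, and the lightblue call is skipped when ./lightblue is not in the current directory.

```shell
# Hypothetical helper: print the name of a generic "open this file" command
# for the given kernel name (as printed by `uname -s`).
opener_for() {
    case "$1" in
        Darwin) echo "open" ;;      # macOS
        Linux)  echo "xdg-open" ;;  # most Linux desktops (xdg-utils)
        *)      echo "" ;;          # unknown platform: open result.html by hand
    esac
}

# Parse one sentence to HTML, then open the result if an opener is known.
if [ -x ./lightblue ]; then
    echo 太郎がパンを食べた。 | ./lightblue jp parse -s html > result.html
    opener=$(opener_for "$(uname -s)")
    if [ -n "$opener" ]; then
        "$opener" result.html
    fi
fi
```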
If you have a text file (one sentence per line) <corpusfile>, then you can feed its path to lightblue by:
$ ./lightblue jp parse -s html -f <corpusfile>
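As a sketch of the corpus-file workflow: the snippet below writes a two-sentence corpus and feeds it to lightblue. The file names corpus.txt and corpus.html and the second sentence are illustrative choices, and redirecting stdout to a file assumes the HTML is written to stdout as in the single-sentence example.

```shell
# Write a small corpus: one sentence per line, UTF-8.
cat > corpus.txt <<'EOF'
太郎がパンを食べた。
花子が本を読んだ。
EOF

# Count the sentences that will be parsed.
n=$(wc -l < corpus.txt | tr -d ' ')
echo "corpus.txt contains $n sentences"

# Feed the file to lightblue (skipped if ./lightblue is not in this directory).
if [ -x ./lightblue ]; then
    ./lightblue jp parse -s html -f corpus.txt > corpus.html
fi
```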
To parse a JSeM file and execute the inferences therein, feed it to lightblue by:
$ ./lightblue jp jsem -f <jsemfile>
The syntax of the lightblue command is as follows:
./lightblue <lang> <lang's local options> <command> <command's local options> <global options>
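To make the argument order concrete, here is a small sketch that assembles a command line from its five parts. The helper lightblue_cmd is hypothetical; it only prints the command and does not run lightblue. The option values are taken from the tables in this document.

```shell
# Hypothetical helper: print a lightblue command line with its parts in the
# documented order. Any part may be an empty string.
lightblue_cmd() {
    lang=$1; lang_opts=$2; cmd=$3; cmd_opts=$4; global_opts=$5
    # Unquoted expansion drops empty parts and splits multi-word option strings.
    set -- ./lightblue $lang $lang_opts $cmd $cmd_opts $global_opts
    printf '%s\n' "$*"
}

lightblue_cmd jp "-m jumanpp" parse "-o postag" "-s text -b 16"
# prints: ./lightblue jp -m jumanpp parse -o postag -s text -b 16
lightblue_cmd en "" parse "" "-s html"
# prints: ./lightblue en parse -s html
```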
Lang | |
---|---|
jp | Japanese |
en | English (no local options) |
Local Options for jp | Default | Description |
---|---|---|
-m or --ma {juman\|jumanpp\|kwja} | kwja | Specify morphological analyzer |
--filter {knp\|kwja\|none} | none | Specify node filter |
Command | |
---|---|
parse | Parse sentences and return parsing results. |
jsem | Parse a JSeM file and execute inferences. |
numeration | Show the list of lexical items prepared for parsing the given sentence. |
version | Print the lightblue version. |
stat | Print the lightblue statistics. |
Each of the parse and jsem commands has a set of local options.
Local Options for parse | Default | Description |
---|---|---|
-o or --output {tree\|postag} | tree | Specify the output content. tree: output parse trees and their type check results. postag: output only lexical items (use lightblue as a part-of-speech tagger). |
Local Options for jsem | Default | Description |
---|---|---|
--jsemid <text> | all | Skip JSeM data whose JSeM ID is not equal to this value. |
--nsample <int> | -1 | Specify the number of JSeM data to process (a negative value means all data). |
The global options are common to all commands.
Global Options | Default | Description |
---|---|---|
-s or --style {text\|tex\|xml\|html} | text | Show results in the specified format. |
-p or --prover {Wani\|Null} | Wani | Choose a prover. Wani: use the Wani prover (Daido and Bekki 2020). Null: use the null prover (that always returns no diagrams). |
-f or --file <filepath> | | Read input texts from <filepath> (specify '-' to use stdin). |
-b or --beam <int> | 32 | Set the beam width to <int>. |
--nparse <int> | 1 | Search only N-best parse trees for each sentence (a negative value means all trees). |
--ntypecheck <int> | 1 | Search only N-best diagrams for each type checking of a logical form (a negative value means all diagrams). |
--nproof <int> | -1 | Search only N-best diagrams for each proof search (a negative value means all diagrams). |
--maxdepth <int> | 5 | Set the maximum search depth in proof search (default: 9) |
--maxtime <int> | 10000 | Set the maximum search time in proof search (default: 10000) |
--noTypeCheck | | If specified, show no type checking diagram for each sentence. |
--noInference | | If specified, execute no inference for each discourse. |
--time | | Show the execution time in stderr. |
--verbose | | Show type infer/check logs in stderr. |
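Because --time and --verbose report on stderr, it can be convenient to keep that output apart from the parsing result. The sketch below combines several global options from the table; the option values and the file names result.txt and time.log are illustrative, not recommendations. When ./lightblue is not in the current directory, it only prints the command it would run.

```shell
# Global options collected in one place (illustrative values).
GLOBAL_OPTS="-s text -b 16 --nparse 3 --noInference --time"

if [ -x ./lightblue ]; then
    # stdout (parsing result) and stderr (timing) go to separate files.
    echo 太郎がパンを食べた。 | ./lightblue jp parse $GLOBAL_OPTS > result.txt 2> time.log
else
    echo "would run: ./lightblue jp parse $GLOBAL_OPTS > result.txt 2> time.log"
fi
```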
Installing Haskell-mode for Emacs will help.
$ sudo apt install haskell-mode
The following command creates an HTML document at: <lightblue>/haddock/doc/html/lightblue/index.html
$ stack build --haddock
- Repo owner: Daisuke Bekki