Parsing
There are three techniques for parsing a single observation.
Load an IRTG via the file dialog. Then use the "Parse" option under "Tools" in the grammar window. This opens a window in which you can type the input that you want parsed, one field per algebra. If you do not want to specify the input for a given algebra, simply leave its field blank. The [codec page](Codec) specifies how the values of the different types of algebra are written down.
Once the input has been parsed, a [tree automaton window](TreeAutomatonWindow) opens for a tree automaton that contains all the parse trees.
Assume that you have your input objects given as a list of strings and an IRTG called "irtg". First convert the string inputs into the actual objects that the underlying algebras can understand. You can achieve this for each input string "inputString" by calling:
#!java
Object actualInputObject = irtg.parseString(interpretationName,inputString);
"interpretationName" is the name of the interpretation that contains the algebra used to read "inputString", according to the formats explained on the [codec page](Codec). The interpretation name must be known to the IRTG you are using. You can then put the objects into a map "representations" from each "interpretationName" to its "actualInputObject". Finally, you can parse this input as follows:
#!java
TreeAutomaton parseChart = irtg.parse(representations);
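The two calls above can be combined roughly as in the following sketch. So that it runs without the library, `Irtg` is a hypothetical stand-in interface that only mirrors the `parseString` and `parse` calls shown above, and `toyIrtg` is a made-up implementation that splits strings on whitespace; neither is the library's own class. With a real grammar you would pass your loaded IRTG object instead, and the interpretation name "english" is likewise only an example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class ParseSketch {

    /** Hypothetical stand-in mirroring the two calls named above; not the library's class. */
    interface Irtg<C> {
        Object parseString(String interpretationName, String inputString);
        C parse(Map<String, Object> representations);
    }

    /**
     * Converts each input string into an object of the algebra of its
     * interpretation, collects the objects in the "representations" map,
     * and parses them in a single call.
     */
    static <C> C parseInputs(Irtg<C> irtg, Map<String, String> inputs) {
        Map<String, Object> representations = new HashMap<>();
        for (Map.Entry<String, String> entry : inputs.entrySet()) {
            String interpretationName = entry.getKey();
            Object actualInputObject = irtg.parseString(interpretationName, entry.getValue());
            representations.put(interpretationName, actualInputObject);
        }
        return irtg.parse(representations);
    }

    /** Made-up implementation for illustration: strings become word lists. */
    static Irtg<String> toyIrtg() {
        return new Irtg<String>() {
            @Override
            public Object parseString(String interpretationName, String inputString) {
                return Arrays.asList(inputString.trim().split("\\s+"));
            }

            @Override
            public String parse(Map<String, Object> representations) {
                // A real IRTG would return a parse chart; the toy just echoes its input.
                return new TreeMap<>(representations).toString();
            }
        };
    }

    public static void main(String[] args) {
        Map<String, String> inputs = new HashMap<>();
        inputs.put("english", "john watches the woman");
        System.out.println(parseInputs(toyIrtg(), inputs));
    }
}
```

Keeping the conversion loop in one helper means every interpretation name is used both for decoding the string and as the map key, so the two cannot drift apart.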
Often you will want to parse a whole list of inputs rather than a single one. For this, you can parse a whole collection of data at once.
Load an IRTG via the file dialog. There is a "Bulk Parse" option under "Tools". If you choose this option, you are asked to select a file that is written according to the corpus codec. Once you have chosen a corpus, you are asked to select a file in which the parsing results will be stored. When bulk parsing is finished, the target file contains a corpus in which a parse is associated with each corpus entry. This is the highest-weight parse, or one of them if several parses share the highest weight.