\chapter{Formalism Conversion} \label{chp:conversion}
\begin{center}
\textit{
A preliminary version of this chapter appeared as \textcite{Kummerfeld-etal:2012:ACL}.
}
\end{center}
In the previous chapter, we explored error analysis of \ptb parser output, developing a new tool and applying it to understand the mistakes systems make.
However, not all parsers are designed to produce \ptb-style phrase structure parses; many have been developed for other grammar formalisms.
In our effort to understand the relative performance of different parsers, we turned to the task of automatic parse conversion to enable comparison of systems across formalisms.
In this chapter, we propose an improved, bottom-up method for converting one syntactic representation, \ccg, into another, \ptb.
In contrast with past work \parencite{Clark-Curran:2009}, which used simple rules based on category pairs, our approach follows the generalizations of \ccg, assigning richer rules to individual categories and defining general methods of combining them for each of the \ccg combinators.
Our conversion preserves more sentences under round-trip conversion (51.1\% \myvs 39.6\%) and is more robust.
In particular, unlike past methods, ours does not require ad-hoc rules over non-local features, and so could be integrated into a parser.
\section{Background}
There has been extensive work on converting parser output for evaluation: \myeg
\textcite{Lin:1998} and \textcite{Briscoe-Carroll-Graham-Copestake:2002} proposed
evaluating on underlying dependencies. There has also been work on
conversion to phrase structure, from dependencies \parencite{Xia:2001,Xia:2009} and
from lexicalized formalisms, \myeg \hpsg \parencite{Matsuzaki-Tsujii:2008} and \mytag
\parencite{Chiang:2000,Sarkar:2001}. Our focus is on \ccg to \ptb conversion
\parencite{Clark-Curran:2009}.
\subsection{Combinatory Categorial Grammar (\ccg)}
The lower half of Figure~\ref{fig:ccg-example} shows a \ccg derivation
\parencite{Steedman:2000} in which each word is assigned a {\em category}, and
{\em combinatory rules} are applied to adjacent categories until only one
remains. Categories can be atomic, \myeg the \cf{N} assigned to
\textit{magistrates}, or complex functions of the form {\em result / arg}, where
{\em result} and {\em arg} are categories and the slash indicates the argument's
directionality. Combinators define how adjacent categories can combine.
Figure~\ref{fig:ccg-example} uses {\em function application}, where a complex
category consumes an adjacent argument to form its result, \myeg \cf{S[dcl]\bs
NP} combines with the \cf{NP} to its left to form an \cf{S[dcl]}. More
powerful combinators allow categories to combine with greater flexibility.
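To make the category mechanics concrete, the following minimal sketch (ours, in Python; the names \texttt{Atom}, \texttt{Complex}, and \texttt{apply\_fn} are illustrative, not from any parser) encodes categories and the two application combinators:
\begin{verbatim}
# A minimal sketch of CCG categories and function application.
# All names here are illustrative; this is not code from any parser.
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Atom:
    label: str                    # e.g. "N", "NP", "S[dcl]"
    def __str__(self): return self.label

@dataclass(frozen=True)
class Complex:
    result: "Category"
    slash: str                    # "/" = argument to the right, "\" = left
    arg: "Category"
    def __str__(self): return f"({self.result}{self.slash}{self.arg})"

Category = Union[Atom, Complex]

def apply_fn(left: Category, right: Category) -> Category:
    """Forward application X/Y + Y => X; backward application Y + X\\Y => X."""
    if isinstance(left, Complex) and left.slash == "/" and left.arg == right:
        return left.result
    if isinstance(right, Complex) and right.slash == "\\" and right.arg == left:
        return right.result
    raise ValueError("categories do not combine by application")

# "labeled" from the example: ((S[dcl]\NP)/NP)/NP consumes an NP on its right.
np = Atom("NP")
labeled = Complex(Complex(Complex(Atom("S[dcl]"), "\\", np), "/", np), "/", np)
print(apply_fn(labeled, np))      # ((S[dcl]\NP)/NP)
\end{verbatim}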
\begin{figure}
\small
\begin{tikzpicture}[every text node part/.style={align=center}]
\node (w0) at (0, 0) {\strut Italian \\ \strut $\cf{N/N}$};
\node (w1) [right=2ex of w0.east] {\strut magistrates \\ \strut $\cf{N}$};
\node (w2) [right=2ex of w1.east] {\strut labeled \\ \strut $\cf{((S[dcl]\bs NP)/NP)/NP}$};
\node (w3) [right=2ex of w2.east] {\strut his \\ \strut $\cf{NP[nb]/N}$};
\node (w4) [right=2ex of w3.east] {\strut death \\ \strut $\cf{N}$};
\node (w5) [right=2ex of w4.east] {\strut a \\ \strut $\cf{NP[nb]/N}$};
\node (w6) [right=2ex of w5.east] {\strut suicide \\ \strut $\cf{N}$};
% PTB above
\node (nt03) [above=3ex of w3.north] {PRP\$};
\node (nt04) [above=3ex of w4.north] {NN};
\node (nt05) [above=3ex of w5.north] {DT};
\node (nt06) [above=3ex of w6.north] {NN};
\path (nt03.north) -- node[above=3ex] (nt13-14) {NP} (nt04.north);
\path (nt05.north) -- node[above=3ex] (nt15-16) {NP} (nt06.north);
\path (nt13-14.north) -- node[above=3ex] (nt23-26) {S} (nt15-16.north);
\node (nt20) at (w0 |- nt23-26) {JJ};
\node (nt21) at (w1 |- nt23-26) {NNS};
\node (nt22) at (w2 |- nt23-26) {VBD};
\path (nt20.north) -- node[above=3ex] (nt30-31) {NP} (nt21.north);
\path (nt22.north) -- node[above=3ex] (nt32-36) {VP} (nt23-26.north);
\path (nt30-31.north) -- node[above=3ex] (nt40-46) {S} (nt32-36.north);
\draw (w0.north) -- (nt20.south);
\draw (w1.north) -- (nt21.south);
\draw (w2.north) -- (nt22.south);
\draw (w3.north) -- (nt03.south);
\draw (w4.north) -- (nt04.south);
\draw (w5.north) -- (nt05.south);
\draw (w6.north) -- (nt06.south);
\draw (nt03.north) -- (nt13-14.south) -- (nt04.north);
\draw (nt05.north) -- (nt15-16.south) -- (nt06.north);
\draw (nt13-14.north) -- (nt23-26.south) -- (nt15-16.north);
\draw (nt20.north) -- (nt30-31.south) -- (nt21.north);
\draw (nt22.north) -- (nt32-36.south) -- (nt23-26.north);
\draw (nt30-31.north) -- (nt40-46.south) -- (nt32-36.north);
% CCG below
\node (c00L) at (w0.south west) {};
\node (c00R) at (w1.south east) {};
\draw [-{Straight Barb[length=2mm]}] (c00L) -- node[below=0ex] (c00) {\strut $\cf{N}$} (c00R);
\node (c01L) at (w3.south west) {};
\node (c01R) at (w4.south east) {};
\draw [-{Straight Barb[length=2mm]}] (c01L) -- node[below=0ex] (c01) {\strut $\cf{NP[nb]}$} (c01R);
\node (c02L) at (w5.south west) {};
\node (c02R) at (w6.south east) {};
\draw [-{Straight Barb[length=2mm]}] (c02L) -- node[below=0ex] (c02) {\strut $\cf{NP[nb]}$} (c02R);
\node (c10L) at (c00L |- c00.south) {};
\node (c10R) at (c00R |- c00.south) {};
\draw (c10L) -- node[below=0ex] (c10) {\strut $\cf{NP}$} (c10R);
\node (c11L) at (w2.west |- c01.south) {};
\node (c11R) at (w4.east |- c01.south) {};
\draw [-{Straight Barb[length=2mm]}] (c11L) -- node[below=0ex] (c11) {\strut $\cf{(S[dcl]\bs NP)/NP}$} (c11R);
\node (c20L) at (w2.west |- c11.south) {};
\node (c20R) at (w6.east |- c11.south) {};
\draw [-{Straight Barb[length=2mm]}] (c20L) -- node[below=0ex] (c20) {\strut $\cf{S[dcl]\bs NP}$} (c20R);
\node (c30L) at (w0.west |- c20.south) {};
\node (c30R) at (w6.east |- c20.south) {};
\draw [-{Straight Barb[reversed,length=2mm]}] (c30L) -- node[below=0ex] (c30) {\strut $\cf{S[dcl]}$} (c30R);
\end{tikzpicture}
\caption[An example of \ccg and \ptb parses with nodes covering crossing spans.]{ \label{fig:ccg-example}
An example of \ccg and \ptb parses with nodes covering crossing spans: \emph{his death a suicide} (\ptb) and \emph{labeled his death} (CCGbank).
}
\end{figure}
We cannot form a \ptb tree by simply relabeling the categories in a \ccg derivation for two reasons.
First, the mapping from categories to phrase labels is many-to-many, so it would be difficult to determine the correct label (though other researchers have tried, as discussed in Section~\ref{sec-learned-conv}).
Second, there are some spans that occur only in the \ccg derivation and others that occur only in the \ptb parse, \myeg in Figure~\ref{fig:ccg-example} there is a node for \emph{his death a suicide} in \ptb but not \ccg, and vice versa for \emph{labeled his death}.
This second issue is particularly difficult because in some cases, including the example given, the differing nodes in the two parses cover spans of the sentence that cross\footnote{This means a local insertion or deletion of a node cannot make the necessary changes; several simultaneous changes would be needed to go from one structure to the other.}.
These differences are the result of conscious decisions made in the construction of CCGbank, in many cases enabling the derivation to encode extra dependencies that are either implicit or expressed via traces in the \ptb.
\subsection{\textcite{Clark-Curran:2009}}
\begin{table}[t!]
\centering
\begin{tabular}{l|l}
\hline
Categories & Schema \\
\hline\hline
\cf{N} & create an NP \\
\cf{((S[dcl]\bs NP)/NP)/NP} & create a VP \\
\hline
\cf{N/N} + \cf{N} & place left under right \\
\cf{NP[nb]/N} + \cf{N} & place left under right \\
\cf{((S[dcl]\bs NP)/NP)/NP} + \cf{NP} & place right under left \\
\cf{(S[dcl]\bs NP)/NP} + \cf{NP} & place right under left \\
\cf{NP} + \cf{S[dcl]\bs NP} & place both under S\\
\hline
\end{tabular}
\caption[Example rules from \textcite{Clark-Curran:2009}.]{ \label{fig:candc09}
Example \old lexical and rule schemas.
}
\end{table}
\textcite{Clark-Curran:2009}, hereafter \old, assign a {\em schema} to each
leaf (lexical category) and rule (pair of combining categories) in the \ccg derivation.
The \ptb tree is constructed from the \ccg bottom-up, creating leaves with
lexical schemas, then merging/adding sub-trees using rule schemas at each step.
The schemas for Figure~\ref{fig:ccg-example} are shown in Table~\ref{fig:candc09}.
These apply to create NPs over \textit{magistrates}, \textit{death}, and
\textit{suicide}, and a VP over \textit{labeled}, and then combine the trees by
placing one under the other at each step, and finally create an S node at the
root.
\old has sparsity problems, requiring schemas for all valid pairs of
categories --- at a minimum, the 2853 unique category combinations found in
CCGbank. \textcite{Clark-Curran:2009} create schemas for only 776 of these,
handling the remainder with approximate catch-all rules.
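Concretely, the approach behaves like a lookup table keyed on category pairs. The following sketch is our illustration (not the released \old code), using the schemas from Table~\ref{fig:candc09}; the particular catch-all fallback is an assumption:
\begin{verbatim}
# Sketch of a pairwise schema lookup in the style of Clark & Curran (2009).
# Entries mirror the example schemas above; the fallback choice is a guess.
RULE_SCHEMAS = {
    ("N/N", "N"): "place left under right",
    ("NP[nb]/N", "N"): "place left under right",
    ("(S[dcl]\\NP)/NP", "NP"): "place right under left",
}

def schema_for(functor: str, argument: str) -> str:
    # Only 776 of the 2853 pairs in CCGbank were annotated; the rest
    # fall through to an approximate catch-all rule.
    return RULE_SCHEMAS.get((functor, argument),
                            "catch-all: place right under left")

print(schema_for("N/N", "N"))            # annotated pair
print(schema_for("(N/N)/(N/N)", "N/N"))  # unannotated pair -> catch-all
\end{verbatim}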
\old only specifies one simple schema for each rule (pair of
categories). This appears reasonable at first, but frequently causes
problems, \myeg:
\begin{center}
\cf{(N/N)/(N/N)} + \cf{N/N} \\
\begin{tabular}{ll}
(1) & ``more than'' + ``30'' \\ % should form a QP
(2) & ``relatively'' + ``small'' \\ % should form an ADJP
\end{tabular}
%%%% ((N/N)/(N/N))\(S[adj]\NP) IN than
%%%% (N/N)/(N/N) RB relatively
\end{center}
Here either a QP bracket (1) or an ADJP bracket (2) should be created. Since
both examples involve the same rule schema, \old would incorrectly process them
in the same way. To combat the most glaring errors, \old manipulates the \ptb
tree with ad-hoc rules based on non-local features over the \ccg nodes
being combined --- an approach that cannot be easily integrated into a
parser.
These disadvantages are a consequence of failing to exploit the generalizations that \ccg combinators define.
We return to this example below to show how our approach does exploit those generalizations and thereby handles both cases correctly.
\subsection{\textcite{zhang-zhao-hui:2012:DEMOS}} \label{sec-learned-conv}
\textcite{zhang-zhao-hui:2012:DEMOS} considered a statistical approach to conversion from \ccg to \ptb.
Their system works by classifying each step in the \ccg derivation as either being a \ptb node or being a dummy node, to be flattened in the final structure.
This approach has the limitation that it cannot handle sentences where the structure of the \ccg derivation is missing nodes that exist in the structure of the \ptb parse, such as the example in Figure~\ref{fig:ccg-example}.
These cases account for 10\% of sentences, but the method is still fairly effective because it recovers most of the structure of such sentences.
\section{Our Approach}
\begin{figure}
\centering
\includegraphics[width=0.85\linewidth]{figures/ccg-example}
\caption[An example function application during \ccg to \ptb conversion.]{ \label{fig:inst-example}
An example function application.
Top row: \ccg rule.
Bottom row: applying instruction (VP f a).
}
\end{figure}
\begin{table}
\centering
\begin{tabular}{lll}
\hline
Symbol & Meaning & Example Instruction \\
\hline
\hline
(X f a) & Add an X bracket around functor and argument & (VP$\;$ f$\;$ a) \\
\{ \} & Flatten enclosed node & (N$\;$ f$\;$ \{a\}) \\
X* & Use same label as argument or default to X & (S*$\;$ f$\;$ \{a\}) \\
f$_i$ & Place subtrees & (PP$\;$ f$_0$$\;$ (S$\;$ f$_{1..k}$$\;$ a)) \\
\hline
\end{tabular}
\caption[Types of operations in instructions in our \ccg to \ptb conversion.]{\label{tab:operators}
Types of operations in instructions.
}
\end{table}
Our conversion assigns a set of instructions to each lexical category and, for each combinator, defines a generic operation for combining instructions.
Figure~\ref{fig:inst-example} shows an example of a common form of instruction \emph{(VP f a)}, which specifies three things:
\begin{itemize}
\item Create a new VP node, indicated by the VP and brackets
\item Place the \ptb parse for the functor under the new node, on the left, indicated by the position of the f
\item Place the \ptb parse for the argument under the new node, on the right, indicated by the position of the a
\end{itemize}
Table~\ref{tab:operators} shows all of the operators we define.
For convenience, in this description we will refer to the parses coming from the functor and argument as sub-parses.
The first operator is the one described above, though note that it can be more powerful, constructing any set of \ptb nodes indicated with standard bracket notation plus the two markers for the functor and argument.
The \{~\} operator indicates that a flattened version of that sub-parse should be placed: in the example in the table, the top node of the argument's sub-parse is deleted and its children are placed instead.
The * operator takes the label for the new node from the argument sub-parse, \myeg the result of an \cf{S\bs S_1} should correspond to the label of argument 1 (SINV, SBAR, etc., which all appear as \cf{S} in \ccg).
This is necessary because CCGbank uses additional annotations on categories to distinguish cases that are assigned entirely different labels in the \ptb.
The final operator enables breaking up the children of a sub-parse and placing them at different locations in the structure being formed.
This only makes sense when creating multiple nodes (a PP and an S in the example); subscripts indicate which children go where.
Finally, complex categories with multiple arguments are assigned a list of instructions, one per argument.
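As a concrete illustration, the following minimal sketch (ours; the names \texttt{Node} and \texttt{apply\_instruction} are hypothetical) applies the instruction (NP~f~\{a\}), anticipating the worked example later in this section:
\begin{verbatim}
# Illustrative sketch of applying a single instruction such as
# "(NP f {a})" to the functor and argument sub-parses.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    def __str__(self):
        if not self.children:
            return self.label
        return "(" + self.label + " " + \
               " ".join(str(c) for c in self.children) + ")"

def apply_instruction(label: str, functor: Node, argument: Node,
                      flatten_arg: bool = False) -> Node:
    """Create a new `label` node over the two sub-parses. flatten_arg=True
    is the { } operator: the argument's top node is deleted and its
    children are placed directly instead."""
    arg_parts = argument.children if flatten_arg and argument.children \
                else [argument]
    return Node(label, [functor] + arg_parts)

# (NP f {a}) applied to "Italian" (functor) and "magistrates" (argument):
functor = Node("JJ", [Node("Italian")])
argument = Node("NP", [Node("NNS", [Node("magistrates")])])
print(apply_instruction("NP", functor, argument, flatten_arg=True))
# -> (NP (JJ Italian) (NNS magistrates))
\end{verbatim}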
The instruction lists are processed by following the \ccg derivation, taking actions that depend on the type of combinator used (a schematic sketch of these list manipulations follows the definitions below):
\vspace{5mm}
\noindent
\textbf{Lexical Assignment} The initial instruction list for a category is taken from our hand-annotated set.
\vspace{3mm}
\noindent
\begin{minipage}[c]{0.6\textwidth}
\textbf{Function Application} The next instruction in the functor's list is applied, and any remaining instructions in its list are retained as the new list (any instructions remaining in the argument's list are discarded).
\end{minipage}\hfill
\begin{minipage}[c]{0.3\textwidth}
\zerodisplayskips
\begin{align*}
\cf{X / Y} \quad & \quad \cf{Y} & \Rightarrow & \quad \cf{X} \\
\cf{Y} \quad & \quad \cf{X \bs Y} & \Rightarrow & \quad \cf{X}
\end{align*}
\end{minipage}
\vspace{3mm}
\noindent
\begin{minipage}[c]{0.6\textwidth}
\textbf{Function Composition} The unapplied instructions of the argument are combined with the remaining steps for the functor.
\end{minipage}\hfill
\begin{minipage}[c]{0.3\textwidth}
\zerodisplayskips
\begin{align*}
\cf{X / Y} \quad & \quad \cf{Y / Z} & \Rightarrow & \quad \cf{X / Z} \\
\cf{Y \bs Z} \quad & \quad \cf{X \bs Y} & \Rightarrow & \quad \cf{X \bs Z}
\end{align*}
\end{minipage}
\vspace{3mm}
\noindent
\begin{minipage}[c]{0.6\textwidth}
\textbf{Crossed Composition}
Only backwards crossed composition is used (the lower of the two).
The new instruction list is composed of the top rule from the left, followed by all but the top rule from the right.
\end{minipage}\hfill
\begin{minipage}[c]{0.3\textwidth}
\zerodisplayskips
\begin{align*}
\cf{X / Y} \quad & \quad \cf{Y \bs Z} & \Rightarrow & \quad \cf{X \bs Z} \\
\cf{Y / Z} \quad & \quad \cf{X \bs Y} & \Rightarrow & \quad \cf{X / Z}
\end{align*}
\end{minipage}
\vspace{3mm}
\noindent
\begin{minipage}[c]{0.6\textwidth}
\textbf{Type Raising}
A new list of instructions is taken from our hand-annotated set.
If necessary, the current structure is flattened to prevent a unary production of identical symbols being formed when the next instruction is applied.
\end{minipage}\hfill
\begin{minipage}[c]{0.3\textwidth}
\zerodisplayskips
\begin{align*}
\cf{X} & \Rightarrow & \quad \cf{T / (T \bs X)} \\
\cf{X} & \Rightarrow & \quad \cf{T \bs (T / X)}
\end{align*}
\end{minipage}
\vspace{3mm}
\noindent
\textbf{Coordination}
In \ccg, coordination is a single step, but in parser implementations it is typically broken into two binary steps.
We follow the derivation, doing the core structural construction in the second step.
The instructions retained after the second step are those from the right conjunct.
\vspace{3mm}
\noindent
\textbf{Functional Substitution}
These combinators occur rarely in the treebank and are not used in current parsers, so we do not implement them.
\vspace{5mm}
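In code, these list manipulations are compact. The following sketch is ours (illustrative names only; the real system also builds \ptb structure whenever an instruction fires), covering the three core cases:
\begin{verbatim}
# A schematic sketch of how instruction lists flow through the
# combinators. An instruction is just a string such as "(VP f a)"; each
# category carries one instruction per argument it has yet to pick up.
from typing import List

Instruction = str

def fire(instruction: Instruction) -> None:
    # Placeholder: the real conversion builds PTB structure here.
    print("apply", instruction)

def function_application(functor: List[Instruction],
                         argument: List[Instruction]) -> List[Instruction]:
    """X/Y + Y => X: the functor's next instruction fires now; its
    remainder is kept and the argument's leftover list is discarded."""
    fire(functor[0])
    return functor[1:]

def function_composition(functor: List[Instruction],
                         argument: List[Instruction]) -> List[Instruction]:
    """X/Y + Y/Z => X/Z: nothing fires yet; the argument's unapplied
    instructions are combined with the functor's remaining steps."""
    return argument + functor

def crossed_composition(left: List[Instruction],
                        right: List[Instruction]) -> List[Instruction]:
    """Y/Z + X\\Y => X/Z (backward crossed): keep the top rule from the
    left, then all but the top rule from the right."""
    return [left[0]] + right[1:]

# The verb's three instructions from the table of example instructions:
labeled = ["(VP f a)", "(VP {f} a)", "(S a f)"]
remaining = function_application(labeled, ["(NP f)"])  # apply (VP f a)
print(remaining)                       # ['(VP {f} a)', '(S a f)']
\end{verbatim}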
Additionally, we vary the instructions assigned based on the \pos tag in 32 cases, and for the word \textit{not}, to recover distinctions not captured by CCGbank categories alone.
In 52 cases the later instructions depend on the structure of the argument being picked up, often to capture different handling of adverbial and adjectival phrases.
For the non-combinatory binary and unary rules in CCGbank we define 28 special instruction sets.
\begin{table}
\centering
\begin{tabular}{lll}
\hline
Category & Instruction set \\
\hline
\hline
\cf{N} & (NP$\;$ f) \\[1pt]
\cf{N/N_1} & (NP$\;$ f$\;$ \{a\}) \\[1pt]
\cf{NP[nb]/N_1} & (NP$\;$ f$\;$ \{a\}) \\[1pt]
\cf{((S[dcl]\bs NP_3)/NP_2)/NP_1} & (VP$\;$ f$\;$ a) \\
& (VP$\;$ \{f\}$\;$ a) \\
& (S$\;$ a$\;$ f) \\
\hline
\end{tabular}
\caption[Example instructions for our \ccg to \ptb conversion.]{ \label{tab:instructions}
Instruction sets for the categories in Figure~\ref{fig:ccg-example}.
}
\end{table}
For the example from the previous section we begin by assigning the instructions shown in Table~\ref{tab:instructions}.
Some of these can apply immediately as they do not involve an argument, \myeg \textit{magistrates} has (NP$\;$~f), producing:
\begin{center}
\scalebox{\inlinederivscale}{
\synttree
[NP
[NNS [magistrates]]]
}
\end{center}
A slightly more complicated instruction is applied to \textit{Italian}: (NP~f~\{a\}).
This creates a new NP bracket, inserts the functor's tree, and flattens and inserts the argument's tree, producing:
\begin{center}
\scalebox{\inlinederivscale}{
\synttree
[NP
[JJ [Italian]]
[NNS [magistrates]]]
}
\end{center}
Similar operations apply for the other NPs, followed by a unary rule, mapping \cf{N} to \cf{NP}.
We handle this by applying one of the special cases for non-combinatory unary operations in CCGbank.
The next three steps all involve instructions from the verb's list.
The verb takes three arguments, and so has three instructions applied as follows:
\begin{center}
\strut
\hfill
\scalebox{\inlinederivscale}{
\synttree
[VP
[VBD [labeled]]
[NP
[PRP\$ [his]]
[NN [death]]]]
}
\hfill
\scalebox{\inlinederivscale}{
\synttree
[VP
[VBD [labeled]]
[NP
[PRP\$ [his]]
[NN [death]]]
[NP
[DT [a]]
[NN [suicide]]]]
}
\hfill
\strut
\\
\strut
\hfill
\scalebox{\inlinederivscale}{
\synttree
[S
[NP
[JJ [Italian]]
[NNS [magistrates]]]
[VP
[VBD [labeled]]
[NP
[PRP\$ [his]]
[NN [death]]]
[NP
[DT [a]]
[NN [suicide]]]]]
}
\hfill
\strut
\end{center}
The final tree in this case is almost correct but omits the S bracket around the two NPs on the right.
To fix it we could have modified \textit{labeled}'s second instruction to move \textit{his death} under a new S via the final symbol in Table~\ref{tab:operators}.
However, for this particular construction the \ptb annotations are inconsistent, and so we cannot recover it without introducing more errors elsewhere.
Our approach naturally handles our QP \myvs ADJP example from the previous section because the two cases have different lexical categories:
\begin{center}
\begin{tabular}{lc}
\textit{than} \hspace{1cm} & {\small\cf{((N/N)/(N/N))\bs (S[adj]\bs NP)}} \\
\textit{relatively} \hspace{1cm} & {\small\cf{(N/N)/(N/N)}}
\end{tabular}
\end{center}
This lexical difference means we can assign different instructions to correctly recover the QP and ADJP nodes.
In contrast, \old applies the same schema in both cases because, after the first function application for \textit{than}, its category becomes identical to the category for \textit{relatively}.
\section{Evaluation}
Using sections 00-21 of the treebanks, we hand-crafted instructions for 527 lexical categories, a process that took under 100 hours and covers all the categories used by the \candc parser.
There are 647 further categories and 35 non-combinatory binary rules in sections 00-21 that we did not annotate.
For unannotated categories, we use the instructions of the result category with an added instruction.
\begin{table}
\centering
\begin{tabular}{lc@{\hskip 7.5mm}cccc@{\hskip 7.5mm}cccc}
\hline
& & \multicolumn{4}{c}{Length $\le 40$} & \multicolumn{4}{c}{All lengths} \\
System & Data & P & R & F & Sent. & P & R & F & Sent. \\
\hline
\hline
\multirow{2}{*}{\old}
& 00 & 94.39 & 95.85 & 95.12 & 42.1 & 93.67 & 95.37 & 94.51 & 39.6 \\
& 23 & 94.04 & 95.44 & 94.73 & 41.9 & 93.95 & 95.33 & 94.64 & 39.7 \\
\hline
\multirow{2}{*}{Zhang et al.\ (2012)}
& 00 & 97.09 & 95.40 & 96.24 & -- & 96.92 & 94.82 & 95.86 & -- \\
& 23 & 96.69 & 94.79 & 95.73 & -- & 96.67 & 94.77 & 95.71 & -- \\
\hline
\multirow{2}{*}{This work}
& 00 & 96.98 & 96.77 & 96.87 & 53.6 & 96.69 & 96.58 & 96.63 & 51.1 \\
& 23 & 96.57 & 96.21 & 96.39 & 53.8 & 96.49 & 96.11 & 96.30 & 51.4 \\
\hline
\end{tabular}
\caption{\label{tab:conversion-comparison}
\parseval Precision, Recall, F-Score, and exact sentence match for converted
gold \ccg derivations.
}
\end{table}
Table~\ref{tab:conversion-comparison} compares our approach with \old and \textcite{zhang-zhao-hui:2012:DEMOS} on gold \ccg derivations.
The results shown are as reported by \evalb \parencite{Black-etal:1991} using the \textcite{Collins:1997} parameters.
Our approach leads to increases over \old on all metrics of at least 1.1\%, and increases exact sentence match by over 11\% (both absolute).
Many of the remaining errors relate to missing and extra clause nodes and a range of rare structures, such as QPs, NACs, and NXs.
The only other prominent errors are single word spans, \myeg extra or missing ADVPs.
Many of these errors are unrecoverable from CCGbank, either because inconsistencies in the \ptb have been smoothed over or because they are genuine but rare constructions that were lost.
\begin{figure}
\centering
\hfill
\scalebox{0.85}{ \includegraphics[trim={110mm 25mm 10mm 15mm},clip]{figures/heat-converted} }
\hfill
\scalebox{0.85}{ \includegraphics[trim={15mm 0 55mm 40mm},clip]{figures/heat-converted} }
\\
\scalebox{0.85}{
\includegraphics[trim={15mm 0 50mm 40mm},clip]{figures/heat-native}
\hfill
\includegraphics[trim={15mm 0 65mm 40mm},clip]{figures/heat-both}
}
\caption[Heatmaps comparing gold conversion accuracy, \ccg native evaluation, and converted evaluation.]{ \label{fig:scatter_plots}
For each sentence in the treebank, we plot the converted parser output against gold conversion (top), the original parser evaluation against gold conversion (bottom left), and the converted parser output against the original parser evaluation (bottom right).
A diagonal line indicating $x=y$ is also included.
Top: Most points lie below the diagonal, indicating that the quality of converted parser output (y) is upper bounded by the quality of conversion on gold parses (x).
Bottom left: No clear correlation is present, indicating that the sentences that are converted best (on the far right) are not necessarily easy to parse.
Bottom right: In general, accuracy on the native metric is correlated with accuracy after conversion.
}
\end{figure}
\subsection{Parser Comparison}
When we convert the output of a \ccg parser, the \ptb trees that are produced
will contain errors created by our conversion as well as by the parser. In this
section we are interested in comparing parsers, so we need to factor out errors
created by our conversion.
One way to do this is to calculate a projected score (Proj), the parser
result divided by the oracle result, but this is a very rough approximation.
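For example, using the All-sentences results in Table~\ref{tab:full-comp} for the parser of \textcite{Clark-Curran:2007}:
\[
\textrm{Proj} = \frac{\textrm{parser F-score}}{\textrm{oracle F-score}} = \frac{85.5}{96.3} \approx 88.8
\]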
Another way is to evaluate only on the 51\% of sentences for which our
conversion from gold \ccg derivations is perfect (Clean). However,
even on this set our conversion can introduce errors: when the parser output differs from the gold derivation, it may contain categories that are harder to convert.
Parser F-scores are generally higher on Clean, which could mean that this
set is easier to parse, or it could mean that these sentences do not contain
annotation inconsistencies, and so the parsers are not incorrect for returning
the true parse (as opposed to the one in the \ptb). To test this distinction
we look for correlation between conversion quality and parse difficulty on
another metric. In particular, Figure~\ref{fig:scatter_plots} (bottom left) shows
\ccg labeled dependency performance for the \candc parser \myvs CCGbank
conversion \parseval scores. The lack of a strong correlation, and the spread
on the line $x=100$, supports the theory that these sentences are not
necessarily easier to parse, but rather have fewer annotation inconsistencies.
In the top plot, the y-axis is \parseval on converted \candc parser output.
Conversion quality essentially bounds the performance of the parser.
The few points above the diagonal are mostly short sentences on which the \candc parser uses categories that lead to one extra correct node (a common case is ADVP \myvs ADJP).
The main constructions on which parse errors occur, \myeg PP attachment, are rarely converted incorrectly, and so we expect conversion errors and parse errors to be roughly additive.
The bottom right plot shows three noteworthy properties: (1) in general the two evaluation metrics are correlated; (2) the \ccg evaluation is slightly harsher, with fewer points below the diagonal line than above it\footnote{This is consistent with prior work, and the fact that \ccg makes some additional distinctions, such as between arguments and adjuncts.}; (3) there are many cases on the far right, where the \ccg evaluation is perfect but conversion mistakes mean the \ptb score is not.
Some sentences are higher in the right plot than the left because there are distinctions in \ccg that are not always present in the \ptb, \myeg the argument-adjunct distinction.
\begin{table}
\centering
\begin{tabular}{lrr|r}
\hline
Sentences & Clean & All & Proj \\
\hline
\hline
\multicolumn{2}{l}{Converted gold \ccg} & & \\
CCGbank & \hspace{0mm}100.0 & \hspace{0mm}96.3 & -- \\
\hline
\multicolumn{2}{l}{Converted \ccg} & & \\
\textcite{Clark-Curran:2007} & 90.9 & 85.5 & 88.8 \\
\textcite{Fowler-Penn:2010} & 90.9 & 86.0 & 89.3 \\
\textcite{Auli-Lopez:2011} & 91.7 & 86.2 & 89.5 \\
\hline
\multicolumn{2}{l}{Native \ptb} & & \\
\textcite{Klein-Manning:2003:ACL} & 89.8 & 85.8 & -- \\
\textcite{Petrov-Klein:2007} & 93.6 & 90.1 & -- \\
\textcite{Charniak-Johnson:2005} & 94.8 & 91.5 & -- \\
\hline
\end{tabular}
\caption[F-scores on section 23 for \ptb parsers and \ccg parsers with their output converted by our method.]{ \label{tab:full-comp}
F-scores on section 23 for \ptb parsers and \ccg parsers with their output converted by our method.
Clean is only on sentences that are converted perfectly from gold \ccg (51\%).
All is over all sentences.
Proj is a projected F-score (All result / CCGbank All result).
}
\end{table}
Table~\ref{tab:full-comp} presents F-scores for three \ptb parsers and three
\ccg parsers (with their output converted by our method). One interesting
comparison is between the \ptb parser of \textcite{Petrov-Klein:2007} and the
\ccg parser of \textcite{Fowler-Penn:2010}, which use the same underlying
parser. The performance gap is partly due to structures in the \ptb that are
not recoverable from CCGbank, but probably also indicates that the split-merge
model is less effective in \ccg, which has far more symbols than the \ptb.
It is difficult to make conclusive claims about the performance of the parsers.
As shown earlier, Clean does not completely factor out the errors
introduced by our conversion, as the parser output may be more difficult to
convert, and the calculation of Proj only roughly factors out the
errors. However, the results do suggest that the performance of the \ccg
parsers is somewhere between the Stanford and Petrov parsers.
\section{Summary}
By exploiting the generalized combinators of the \ccg formalism, we developed a new method of converting \ccg derivations into \ptb-style trees.
Our system is more effective than previous work, increasing exact sentence match by more than 11\% (absolute), and can be directly integrated with a \ccg parser.
The system is available online; see Appendix~\ref{chp:resources}.