<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<title>Global Wordnet Formats</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
</style>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="">
<meta name="author" content="">
<!-- Bootstrap Core CSS -->
<link href="css/bootstrap.min.css" rel="stylesheet">
<!-- Custom CSS -->
<link href="css/one-page-wonder.css" rel="stylesheet">
<style>
img { width: 150px; float:left; margin: 50px; }
</style>
</head>
<body>
<div class="container">
<header id="title-block-header">
<h1 class="title">Global Wordnet Formats</h1>
</header>
<p><img src="https://globalwordnet.github.io/schemas/img/GWA_logo.png" /></p>
<p>The Global WordNet Association provides three formats in which wordnets can be published and submitted to the ILI. These are as follows:</p>
<ul>
<li><a href="#xml">Lexical Markup Framework compatible XML</a>
<ul>
<li><a href="http://github.com/globalwordnet/schemas/blob/master/example.xml">Example</a></li>
<li><a href="http://globalwordnet.github.io/schemas/WN-LMF-1.3.dtd">DTD</a></li>
</ul></li>
<li><a href="#json">JSON-LD using the lemon Vocabulary</a>
<ul>
<li><a href="http://github.com/globalwordnet/schemas/blob/master/example.json">Example</a></li>
<li><a href="http://globalwordnet.github.io/schemas/wn-json-context-1.3.json">JSON-LD Context</a></li>
<li><a href="http://github.com/globalwordnet/schemas/blob/master/wn-json-schema.json">Schema</a></li>
</ul></li>
<li><a href="#rdf">OntoLex RDF</a>
<ul>
<li><a href="http://github.com/globalwordnet/schemas/blob/master/example.ttl">Example</a></li>
</ul></li>
</ul>
<p>All of these formats are considered equivalent. A converter and validator is available at <a href="http://server1.nlp.insight-centre.org/gwn-converter/">http://server1.nlp.insight-centre.org/gwn-converter/</a>.</p>
<h1 id="xml">XML</h1>
<p>The XML is specified by the following <a href="WN-LMF-1.3.dtd">DTD</a>. An example is given <a href="https://github.com/globalwordnet/schemas/blob/master/example.xml">here</a>.</p>
<p>The first three lines must always be as follows:</p>
<pre><code><?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE LexicalResource SYSTEM "http://globalwordnet.github.io/schemas/WN-LMF-1.3.dtd">
<LexicalResource xmlns:dc="http://purl.org/dc/elements/1.1/"></code></pre>
<p>A file may contain multiple wordnets in different languages, each in its own <code>Lexicon</code> element.</p>
<p>The following information is required:</p>
<ul>
<li>id: A short name for the resource</li>
<li>label: The full name for the resource</li>
<li>language: Please follow BCP-47, i.e., use a two-letter code if available else a three-letter code</li>
<li>email: Please give a contact email address</li>
<li>license: The license of your resource (please provide URL)</li>
<li>version: A string identifying this version (preferably follow major.minor format)</li>
<li>url: A URL for your project homepage</li>
<li>citation: The paper to cite for this resource</li>
<li>logo: A link to a Logo (Image URL) for this project</li>
</ul>
<p>Extra properties from Dublin Core may be included, as well as the following:</p>
<ul>
<li><p>status: The status of the resource, e.g., “valid”, “checked”, “unchecked”</p></li>
<li><p>confidenceScore: A numeric value between 0 and 1 giving the confidence in the correctness of the element.</p>
<pre><code> <Lexicon id="example-en"
label="Example wordnet (English)"
language="en"
email="john@mccr.ae"
license="https://creativecommons.org/publicdomain/zero/1.0/"
version="1.0"
citation="CILI: the Collaborative Interlingual Index. Francis Bond, Piek Vossen, John P. McCrae and Christiane Fellbaum, Proceedings of the Global WordNet Conference 2016, (2016)."
url="http://globalwordnet.github.io/schemas/"
dc:publisher="Global Wordnet Association"></code></pre></li>
</ul>
<p>Each word (lexical entry) must have a unique id:</p>
<pre><code> <LexicalEntry id="w1"></code></pre>
<p>The part of speech values are as follows:</p>
<ul>
<li><p>n: Noun</p></li>
<li><p>v: Verb</p></li>
<li><p>a: Adjective</p></li>
<li><p>r: Adverb</p></li>
<li><p>s: Adjective Satellite</p></li>
<li><p>c: Conjunction</p></li>
<li><p>p: Adposition (Preposition, postposition, etc.)</p></li>
<li><p>x: Other (inc. particle, classifier, bound morphemes, determiners)</p></li>
<li><p>u: Unknown</p>
<pre><code> <Lemma writtenForm="grandfather" partOfSpeech="n"/>
<Sense id="example-en-10161911-n-1" synset="example-en-10161911-n"/>
</LexicalEntry>
<LexicalEntry id="w2">
<Lemma writtenForm="paternal grandfather" partOfSpeech="n"/>
<Sense id="example-en-1-n-1" synset="example-en-1-n"></code></pre></li>
</ul>
<p>The set of relations between senses is limited to the following</p>
<ul>
<li><p>antonym: An opposite and inherently incompatible word</p></li>
<li><p>also: See also, a reference of weak meaning</p></li>
<li><p>participle: An adjective that is a participle form of a verb</p></li>
<li><p>pertainym: A relational adjective. Adjectives that are pertainyms are usually defined by such phrases as “of or pertaining to” and do not have antonyms. A pertainym can point to a noun or another pertainym</p></li>
<li><p>derivation: A word that is derived from some other word</p></li>
<li><p>domain_topic: Indicates the category of this word</p></li>
<li><p>domain_member_topic: Indicates a word involved in the category described by this word</p></li>
<li><p>domain_region: Indicates the region of this word</p></li>
<li><p>domain_member_region: Indicates a word involved in the region described by this word</p></li>
<li><p>exemplifies: Indicates the usage of this word</p></li>
<li><p>is_exemplified_by: Indicates a word involved in the usage described by this word</p>
<pre><code> <SenseRelation relType="derivation" target="example-en-10161911-n-1"/>
</Sense>
</LexicalEntry>
<LexicalEntry id="w3">
<Lemma writtenForm="pay" partOfSpeech="v"/></code></pre></li>
</ul>
<p>Syntactic behaviour is given as in Princeton WordNet</p>
<pre><code> <SyntacticBehaviour subcategorizationFrame="Somebody ----s" id="intransitive"/>
<SyntacticBehaviour subcategorizationFrame="Somebody ----s somebody" id="transitive"/>
</LexicalEntry></code></pre>
<p>Syntactic behaviour can also be given as part of the lexicon and referred to with the <code>subcat</code> property.</p>
<p>If a synset is already mapped to the ILI, please give the ILI ID in the <code>ili</code> attribute. <strong>All synsets must have an ID that starts with the ID of the lexicon followed by a dash, e.g., <code>example-en</code> + <code>-</code> + <code>local_synset_id</code></strong>.</p>
<pre><code> <Synset id="example-en-10161911-n" ili="i90287" partOfSpeech="n"
members="example-en-10161911-n-1 example-en-1-n-1">
<Definition>
the father of your father or mother
</Definition></code></pre>
<p>The <code>members</code> property gives the list of senses in order.</p>
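<p>The ID rule above can be checked mechanically. The following is a minimal sketch, assuming Python's standard <code>xml.etree.ElementTree</code> module; the sample lexicon and the function name <code>badly_prefixed_synsets</code> are made up for illustration and are not part of the schema or of any official tool.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature lexicon; the second synset deliberately breaks the rule.
SAMPLE = """
<Lexicon id="example-en" label="Example wordnet (English)" language="en"
         email="john@mccr.ae" version="1.0"
         license="https://creativecommons.org/publicdomain/zero/1.0/">
  <Synset id="example-en-10161911-n" ili="i90287" partOfSpeech="n"/>
  <Synset id="other-1-n" ili="" partOfSpeech="n"/>
</Lexicon>
""".strip()


def badly_prefixed_synsets(lexicon):
    """Return the IDs of synsets that do not start with '<lexicon id>-'."""
    prefix = lexicon.get("id") + "-"
    return [
        synset.get("id")
        for synset in lexicon.iter("Synset")
        if not synset.get("id").startswith(prefix)
    ]


lexicon = ET.fromstring(SAMPLE)
bad_ids = badly_prefixed_synsets(lexicon)
print(bad_ids)
```

<p>An empty list means every synset ID in the lexicon carries the expected prefix.</p>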
<p>The set of relations between synsets is limited to the following:</p>
<p><strong>Princeton WordNet Properties</strong></p>
<ul>
<li><code>hypernym</code>: a concept that is more general than a given concept</li>
<li><code>hyponym</code>: a concept that is more specific than a given concept</li>
<li><code>instance_hypernym</code>: the type of an instance</li>
<li><code>instance_hyponym</code>: an occurrence of something</li>
<li><code>mero_member</code>: concept A is a member of concept B</li>
<li><code>mero_part</code>: concept A is a component of concept B</li>
<li><code>mero_substance</code>: concept A is made of concept B.</li>
<li><code>holo_member</code>: concept B is a member of concept A</li>
<li><code>holo_part</code>: concept B is the whole where concept A is a part</li>
<li><code>holo_substance</code>: concept B is a substance of concept A</li>
<li><code>entails</code>: impose, involve, or imply as a necessary accompaniment or result</li>
<li><code>causes</code>: concept A is an entity that produces an effect or is responsible for events or results of concept B.</li>
<li><code>similar</code>: (of words) expressing closely related meanings</li>
<li><code>attribute</code>: an abstraction belonging to or characteristic of an entity</li>
<li><code>domain_region</code>: a concept which is a geographical / cultural domain pointer of a given concept.</li>
<li><code>domain_topic</code>: a concept which is the scientific category pointer of a given concept.</li>
<li><code>has_domain_region</code>: a concept which is the term in the geographical / cultural domain of a given concept.</li>
<li><code>has_domain_topic</code>: a concept which is a term in the scientific category of a given concept.</li>
<li><code>exemplifies</code>: a concept which is the example of a given concept.</li>
<li><code>is_exemplified_by</code>: a concept which is the type of a given concept.</li>
</ul>
<p><strong>Non-Princeton WordNet Relations</strong></p>
<ul>
<li><p><code>agent</code>: a concept which is typically the one/that who/which does the action denoted by a given concept.</p></li>
<li><p><code>also</code>: a word having a loose semantic relation to another word</p></li>
<li><p><code>anto_converse</code>: word pairs that name or describe a single relationship from opposite perspectives</p></li>
<li><p><code>anto_gradable</code>: word pairs whose meanings are opposite and which lie on a continuous spectrum</p></li>
<li><p><code>anto_simple</code>: word pairs whose meanings are opposite but whose meanings do not lie on a continuous spectrum</p></li>
<li><p><code>antonym</code>: an opposite and inherently incompatible word</p></li>
<li><p><code>attribute</code>: an abstraction belonging to or characteristic of an entity</p></li>
<li><p><code>augmentative</code>: a concept used to refer to generally larger members of a class</p></li>
<li><p><code>be_in_state</code>: A is qualified by B</p></li>
<li><p><code>classified_by</code>: concept B is modified by classifier A when it is counted.</p></li>
<li><p><code>classifies</code>: a concept A used when counting concept B</p></li>
<li><p><code>co_agent_instrument</code>: a concept which is the instrument used by a given concept in an action.</p></li>
<li><p><code>co_agent_patient</code>: a concept which is the patient undergoing an action carried out by a given concept.</p></li>
<li><p><code>co_agent_result</code>: a concept which is the result of an action taken by a given concept.</p></li>
<li><p><code>co_instrument_agent</code>: a concept which carries out an action by using a given concept as an instrument.</p></li>
<li><p><code>co_instrument_patient</code>: a concept which undergoes an action with the use of a given concept as an instrument.</p></li>
<li><p><code>co_instrument_result</code>: a concept which is the result of an action using an instrument of a given concept.</p></li>
<li><p><code>co_patient_agent</code>: a concept which carries out an action that a given concept undergoes.</p></li>
<li><p><code>co_patient_instrument</code>: a concept which is used as an instrument in an action a given concept undergoes.</p></li>
<li><p><code>co_result_agent</code>: a concept which takes an action resulting in a given concept.</p></li>
<li><p><code>co_result_instrument</code>: a concept which is used as an instrument in an action resulting in a given concept.</p></li>
<li><p><code>co_role</code>: a concept which undergoes an action in which a given concept is involved.</p></li>
<li><p><code>constitutive</code>: core semantic relations that define synsets</p></li>
<li><p><code>derivation</code>: a concept which is a derivationally related form of a given concept.</p></li>
<li><p><code>diminutive</code>: a concept used to refer to generally smaller members of a class</p></li>
<li><p><code>direction</code>: a concept which is the direction of the action or event expressed by a given concept.</p></li>
<li><p><code>domain</code>: a concept which is a Topic, Region or Usage pointer of a given concept.</p></li>
<li><p><code>domain_region</code>: a concept which is a geographical / cultural domain pointer of a given concept.</p></li>
<li><p><code>domain_topic</code>: a concept which is the scientific category pointer of a given concept.</p></li>
<li><p><code>eq_synonym</code>: A and B are equivalent concepts but their nature requires that they remain separate (e.g. Exemplifies)</p></li>
<li><p><code>exemplifies</code>: a concept which is the example of a given concept.</p></li>
<li><p><code>feminine</code>: a concept used to refer to female members of a class</p></li>
<li><p><code>has_augmentative</code>: a concept which has a special concept for generally larger members of its class</p></li>
<li><p><code>has_diminutive</code>: a concept which has a special concept for generally smaller members of its class</p></li>
<li><p><code>has_domain</code>: a concept which is a term of a given Topic, Region or Usage concept.</p></li>
<li><p><code>has_domain_region</code>: a concept which is the term in the geographical / cultural domain of a given concept.</p></li>
<li><p><code>has_domain_topic</code>: a concept which is a term in the scientific category of a given concept.</p></li>
<li><p><code>has_feminine</code>: a concept which has a special concept for female members of its class</p></li>
<li><p><code>has_masculine</code>: a concept which has a special concept for male members of its class</p></li>
<li><p><code>has_young</code>: a concept which has a special concept for young members of its class</p></li>
<li><p><code>holo_location</code>: B is a place located in A</p></li>
<li><p><code>holo_portion</code>: B is an amount/piece/portion of A</p></li>
<li><p><code>holonym</code>: A makes up a part of B</p></li>
<li><p><code>in_manner</code>: B qualifies the manner in which an action or event expressed by A takes place</p></li>
<li><p><code>instrument</code>: a concept which is the instrument necessary for the action or event expressed by a given concept.</p></li>
<li><p><code>involved</code>: a concept which is the action or event a given concept is typically involved in.</p></li>
<li><p><code>involved_agent</code>: a concept which is the action done by an agent expressed by a given concept.</p></li>
<li><p><code>involved_direction</code>: a concept which is the action with the direction expressed by a given concept.</p></li>
<li><p><code>involved_instrument</code>: a concept which is typically the action with the instrument expressed by a given concept.</p></li>
<li><p><code>involved_location</code>: a concept which is the event happening in a place expressed by a given concept.</p></li>
<li><p><code>involved_patient</code>: a concept which is the action that the patient expressed by a given concept undergoes.</p></li>
<li><p><code>involved_result</code>: a concept which is the action or event as a result of which a given concept comes into existence.</p></li>
<li><p><code>involved_source_direction</code>: a concept which is the action beginning from a place of a given concept.</p></li>
<li><p><code>involved_target_direction</code>: a concept which is the action or event leading to a place expressed by a given concept.</p></li>
<li><p><code>ir_synonym</code>: a concept that means the same except for the style or connotation</p></li>
<li><p><code>is_caused_by</code>: A comes about because of B</p></li>
<li><p><code>is_entailed_by</code>: opposite of entails</p></li>
<li><p><code>is_exemplified_by</code>: a concept which is the type of a given concept.</p></li>
<li><p><code>is_subevent_of</code>: A takes place during or as part of B, and whenever A takes place, B takes place</p></li>
<li><p><code>location</code>: a concept which is the place where the event expressed by a given concept happens.</p></li>
<li><p><code>manner_of</code>: A qualifies the manner in which an action or event expressed by B takes place</p></li>
<li><p><code>masculine</code>: a concept used to refer to male members of a class</p></li>
<li><p><code>mero_location</code>: A is a place located in B</p></li>
<li><p><code>mero_portion</code>: A is an amount/piece/portion of B</p></li>
<li><p><code>meronym</code>: B makes up a part of A</p></li>
<li><p><code>other</code>: any relation not otherwise specified</p></li>
<li><p><code>participle</code>: a concept which is a participial adjective derived from a verb expressed by a given concept.</p></li>
<li><p><code>patient</code>: a concept which is the one/that who/which undergoes a given concept.</p></li>
<li><p><code>pertainym</code>: a concept which is of or pertaining to a given concept.</p></li>
<li><p><code>restricted_by</code>: a relation between nominal (pronominal) B and an adjectival A (quantifier/determiner)</p></li>
<li><p><code>restricts</code>: a relation between an adjectival A (quantifier/determiner) and a nominal (pronominal) B</p></li>
<li><p><code>result</code>: a concept which comes into existence as a result of a given concept.</p></li>
<li><p><code>role</code>: a concept which is involved in the action or event expressed by a given concept.</p></li>
<li><p><code>secondary_aspect_ip</code>: a concept which is linked to another through a change in aspect (ip)</p></li>
<li><p><code>secondary_aspect_pi</code>: a concept which is linked to another through a change in aspect (pi)</p></li>
<li><p><code>simple_aspect_ip</code>: a concept which is linked to another through a change from imperfective to perfective aspect</p></li>
<li><p><code>simple_aspect_pi</code>: a concept which is linked to another through a change from perfective to imperfective aspect</p></li>
<li><p><code>source_direction</code>: a concept which is the place from where the event expressed by a given concept begins.</p></li>
<li><p><code>state_of</code>: B is qualified by A</p></li>
<li><p><code>subevent</code>: B takes place during or as part of A, and whenever B takes place, A takes place</p></li>
<li><p><code>target_direction</code>: a concept which is the place where the action or event expressed by a given concept leads to.</p></li>
<li><p><code>young</code>: a concept used to refer to young members of a class</p>
<pre><code> <SynsetRelation relType="hypernym" target="example-en-10162692-n"/>
</Synset></code></pre></li>
</ul>
<p>If you wish to propose a new concept for the ILI, set the <code>ili</code> attribute to “in” (ILI New). If there is no mapping to the ILI, leave the attribute empty (the attribute itself is required).</p>
<pre><code> <Synset id="example-en-1-n" ili="in" partOfSpeech="n">
<Definition>A father's father; a paternal grandfather</Definition></code></pre>
<p>You can include metadata (such as source) at many points. The ILI definition must be at least 20 characters or five words long.</p>
<pre><code> <ILIDefinition dc:source="https://en.wiktionary.org/wiki/farfar">
A father's father; a paternal grandfather
</ILIDefinition>
</Synset></code></pre>
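<p>The length requirement on ILI definitions can be sketched as a small check. This assumes that meeting either threshold (20 characters or five words) is sufficient, which is one reading of the sentence above, and that layout whitespace inside the element does not count; the function name <code>ili_definition_ok</code> is hypothetical.</p>

```python
def ili_definition_ok(text):
    """True if the definition has at least 20 characters or at least five words.

    Assumption: either threshold is sufficient, and runs of whitespace
    (for example indentation inside the XML element) are collapsed first.
    """
    cleaned = " ".join(text.split())
    return len(cleaned) >= 20 or len(cleaned.split()) >= 5


long_enough = ili_definition_ok("""
    A father's father; a paternal grandfather
""")
too_short = ili_definition_ok("a dog")
print(long_enough, too_short)
```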
<p>You must include all targets of relations</p>
<pre><code> <Synset id="example-en-10162692-n" ili="i90292" partOfSpeech="n"/>
</Lexicon>
<Lexicon id="example-sv"
label="Example wordnet (Swedish)"
language="sv"
email="john@mccr.ae"
license="https://creativecommons.org/publicdomain/zero/1.0/"
version="1.0"
citation="CILI: the Collaborative Interlingual Index. Francis Bond, Piek Vossen, John P. McCrae and Christiane Fellbaum, Proceedings of the Global WordNet Conference 2016, (2016)."
url="http://globalwordnet.github.io/schemas/"
dc:publisher="Global Wordnet Association"></code></pre>
<p>The list of lexical entries (words) in your wordnet</p>
<pre><code> <LexicalEntry id="w4">
<Lemma writtenForm="farfar" partOfSpeech="n"/></code></pre>
<p>Synsets need not be language-specific but senses must be</p>
<pre><code> <Sense id="example-sv-2-n-1" synset="example-en-1-n">
<SenseExample dc:source="Europarl Corpus">
Jag vill berätta för er att min farfar var svensk beredskapssoldat vid norska gränsen under andra världskriget, ett krig som Sverige stod utanför
</SenseExample>
</Sense>
</LexicalEntry>
</Lexicon>
</LexicalResource></code></pre>
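<p>The requirement that all relation targets be included in the file can be verified with a short script. This is a sketch only, assuming Python's standard <code>xml.etree.ElementTree</code>; the sample document is hypothetical (the target <code>example-en-99-n</code> is deliberately left undeclared) and <code>missing_targets</code> is an illustrative name, not part of any official validator.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical resource: one relation points at a synset that is never declared.
SAMPLE = """
<LexicalResource>
  <Lexicon id="example-en" language="en" version="1.0">
    <Synset id="example-en-10161911-n" ili="i90287" partOfSpeech="n">
      <SynsetRelation relType="hypernym" target="example-en-10162692-n"/>
    </Synset>
    <Synset id="example-en-1-n" ili="in" partOfSpeech="n">
      <SynsetRelation relType="hypernym" target="example-en-99-n"/>
    </Synset>
    <Synset id="example-en-10162692-n" ili="i90292" partOfSpeech="n"/>
  </Lexicon>
</LexicalResource>
""".strip()


def missing_targets(root):
    """Return relation targets that are not declared as a Synset or Sense ID."""
    declared = {el.get("id") for tag in ("Synset", "Sense") for el in root.iter(tag)}
    targets = {
        rel.get("target")
        for tag in ("SynsetRelation", "SenseRelation")
        for rel in root.iter(tag)
    }
    return sorted(targets - declared)


missing = missing_targets(ET.fromstring(SAMPLE))
print(missing)
```

<p>The check runs over the whole file rather than a single lexicon, since senses in one lexicon may refer to synsets declared in another.</p>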
<p><strong><em>Pronunciation</em></strong></p>
<p>Since 2021, the schema has the ability to represent the pronunciation of lemmas.</p>
<p>This is in the <code><Pronunciation></code> element, which gives the IPA text. It has the following attributes:</p>
<ul>
<li><code>variety</code>: uses the IETF language tags to indicate dialect, for example encoding British English in IPA as <code>en-GB-fonipa</code></li>
<li><code>notation</code>: can encode further information such as indicating a particular dialect (this was <code>notes</code> in the paper)</li>
<li><code>phonemic</code>: indicates whether the transcription is phonemic (<code>true</code>) or phonetic (<code>false</code>), defaulting to <code>false</code></li>
<li><code>audio</code>: gives the URL of an audio file of the pronunciation</li>
</ul>
<p>An example of encoding is given below:</p>
<pre><code> <LexicalEntry id="ex-rabbit-n">
<Lemma writtenForm="rabbit" partOfSpeech="n">
<Pronunciation variety="en-GB-fonxsamp en-US-fonxsamp"
audio ="https://path/rabbit.flac">'r\{bIt</Pronunciation>
<Pronunciation variety="en-AU-fonxsamp" notation="weak vowel merger"
audio ="https://path/rabbit1.flac">'r\{b@t</Pronunciation>
</Lemma>
</LexicalEntry></code></pre>
<p><strong>Wordnet Extensions</strong></p>
<p>A file may contain a lexicon extension which serves to augment an existing lexicon with new lexical entries, synsets, senses, relations, etc. They are defined much like regular lexicons, but the <code><Extends></code> element specifies the ID and version of the base lexicon:</p>
<pre><code> <LexiconExtension id="ewn-cs-example"
label="English WordNet Computer Science Terms (example)"
language="en"
email="goodmami@uw.edu"
license="https://creativecommons.org/publicdomain/zero/1.0/"
version="1.0">
<Extends id="ewn" version="2020" /></code></pre>
<p>The contents of the lexicon extension are the same as a regular lexicon with the addition of elements for external lexical entries, synsets, and senses. There are two uses of external elements. First, they allow one to add additional information to the corresponding element in the base lexicon, such as adding a new sense to an existing lexical entry:</p>
<pre><code> <ExternalLexicalEntry id="ewn-process-n">
<Sense id="ewn-process-n-20000123" synset="ewn-20000123-n" />
</ExternalLexicalEntry></code></pre>
<p>In the above example, the <code>ewn-process-n</code> ID is not used to create a new lexical entry, but rather it must already exist in the base lexicon. The external lexical entry (as well as other external senses or synsets) may only add information; therefore it may not specify metadata or elements required on lexical entries, such as for the lemma.</p>
<p>Second, they introduce an ID which may be referenced by new structures, such as the target of a synset relation:</p>
<pre><code> <ExternalSynset id="ewn-06581154-n" />
<Synset id="ewn-20000123-n" ili="" partOfSpeech="n">
<Definition>a running instance of a computer program</Definition>
<SynsetRelation relType="hypernym" target="ewn-06581154-n" />
</Synset></code></pre>
<p>Due to the way external IDs are used, a lexicon extension may not exist in the same file as the base lexicon.</p>
<p><strong>Wordnet Dependencies</strong></p>
<p>Some wordnets depend upon others, such as those in the <a href="https://lr.soh.ntu.edu.sg/omw/">Open Multilingual Wordnet</a> which depend upon the Princeton WordNet for synset structure. With the <code><Requires></code> element, it is possible to explicitly codify those dependencies:</p>
<pre><code> <Lexicon id="spawn"
label="Multilingual Central Repository"
language="es"
email="bond@ieee.org"
license="https://creativecommons.org/licenses/by/3.0/"
version="1.3+omw"
citation="Aitor Gonzalez-Agirre, Egoitz Laparra and German Rigau. 2012. `Multilingual Central Repository version 3.0: upgrading a very large lexical knowledge base &lt;http://adimen.si.ehu.es/web/sites/all/modules/pubdlcnt/pubdlcnt.php?file=http://adimen.si.ehu.es/~rigau/publications/gwc12-glr.pdf&amp;nid=18&gt;`_. In *Proceedings of the 6th Global WordNet Conference (GWC 2012)*. Matsue, Japan."
url="http://adimen.si.ehu.es/web/MCR/"
dc:publisher="Global Wordnet Association"
dc:format="OMW-LMF"
dc:description="Wordnet made from OMW 1.0 data"
confidenceScore="1.0">
<Requires id="pwn" version="3.0" /></code></pre>
<p>This element signifies to an application processing the wordnet that the required wordnet should be loaded as well. The <code><Requires></code> element may also be used on a <code><LexiconExtension></code> for cases where the lexicon extends one wordnet but requires another.</p>
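<p>An application might collect these dependencies before loading a file. The sketch below assumes Python's standard <code>xml.etree.ElementTree</code> and a cut-down, hypothetical version of the lexicon above; <code>required_lexicons</code> is an illustrative name and how the dependencies are then resolved is left to the application.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened version of the lexicon shown above.
SAMPLE = """
<LexicalResource>
  <Lexicon id="spawn" language="es" version="1.3+omw">
    <Requires id="pwn" version="3.0"/>
  </Lexicon>
</LexicalResource>
""".strip()


def required_lexicons(root):
    """Map each lexicon (or lexicon extension) ID to its (id, version) requirements."""
    deps = {}
    for tag in ("Lexicon", "LexiconExtension"):
        for lexicon in root.iter(tag):
            deps[lexicon.get("id")] = [
                (req.get("id"), req.get("version"))
                for req in lexicon.findall("Requires")
            ]
    return deps


deps = required_lexicons(ET.fromstring(SAMPLE))
print(deps)
```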
<h1 id="json">JSON</h1>
<p>The JSON format follows that of the XML and is based on <a href="http://json-ld.org">JSON-LD</a>. <a href="https://github.com/globalwordnet/schemas/blob/master/example.json">An example</a> of the JSON is as follows:</p>
<p>The top level of a JSON graph consists of an object with two properties: <code>@context</code>, which must be the fixed string referring to the JSON-LD context, and <code>@graph</code>, which lists the lexicons. This structure is required for submission to the Collaborative Interlingual Index, but web services may of course return shorter fragments of the structure.</p>
<pre><code>{
"@context": "http://globalwordnet.github.io/schemas/wn-json-context-1.3.json",
"@graph": [{</code></pre>
<p>The following are required properties of every WordNet (note the language must be given twice). <code>@id</code> gives the identifier of this wordnet (should be unique in this document) and <code>@type</code> must be <code>lime:Lexicon</code>.</p>
<pre><code> "@context": { "@language": "en" },
"@id": "example-en",
"@type": "lime:Lexicon",
"label": "Example wordnet (English)",
"language": "en",
"email": "john@mccr.ae",
"rights": "https://creativecommons.org/publicdomain/zero/1.0/",
"version": "1.0",</code></pre>
<p>In addition, the properties <code>citation</code>, <code>url</code>, <code>logo</code>, <code>status</code>, <code>confidenceScore</code> and any property from <a href="http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=elements">Dublin Core Elements 1.1</a> may be used.</p>
<pre><code> "citation": "CILI: the Collaborative Interlingual Index. Francis Bond, Piek Vossen, John P. McCrae and Christiane Fellbaum, Proceedings of the Global WordNet Conference 2016, (2016).",
"url": "http://globalwordnet.github.io/schemas",
"publisher": "Global Wordnet Association",</code></pre>
<p>The entries are given as a list under the <code>entry</code> property. Each entry requires an <code>@id</code>, a <code>partOfSpeech</code> and a <code>lemma</code>, and may have <code>sense</code>, <code>synBehavior</code>, <code>status</code>, <code>confidence</code> and Dublin Core properties. The lemma has only a single value, <code>writtenForm</code>, and the <code>partOfSpeech</code> must be one of the following: [ <code>noun</code>, <code>verb</code>, <code>adjective</code>, <code>adverb</code>, <code>adjective_satellite</code>, <code>phrase</code>, <code>conjunction</code>, <code>adposition</code>, <code>other</code>, <code>unknown</code> ]. The <code>@id</code> must be unique in the document; it must not be the same as the <code>@id</code> of the wordnet or of any other entry.</p>
<pre><code> "entry": [{
"@id" : "w1",
"lemma": { "writtenForm": "father" },
"partOfSpeech": "noun",</code></pre>
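<p>Since the XML uses one-letter part-of-speech codes and the JSON uses full names, a converter needs a mapping between them. The following sketch pairs the codes listed in the XML section with the JSON values above by name; this pairing is an assumption, and <code>phrase</code> is omitted because no XML code for it is listed in this document. The function <code>entry_to_json</code> is hypothetical and covers only the required properties.</p>

```python
import json

# Assumed pairing of XML codes with JSON values, matched by name.
XML_TO_JSON_POS = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "r": "adverb",
    "s": "adjective_satellite",
    "c": "conjunction",
    "p": "adposition",
    "x": "other",
    "u": "unknown",
}


def entry_to_json(entry_id, written_form, pos_code):
    """Build the required part of a JSON entry from XML-style values."""
    return {
        "@id": entry_id,
        "lemma": {"writtenForm": written_form},
        "partOfSpeech": XML_TO_JSON_POS[pos_code],
    }


entry = entry_to_json("w1", "father", "n")
print(json.dumps(entry, indent=2))
```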
<p>The sense requires only an <code>@id</code> and a <code>synsetRef</code>, and may take <code>status</code>, <code>confidenceScore</code>, Dublin Core properties and an <code>example</code>.</p>
<pre><code> "sense": [{
"@id": "example-en-10161911-n-1",
"synsetRef": "example-en-10161911-n"
}]
}, {
"@id" : "w2",
"lemma": { "writtenForm": "paternal grandfather" },
"partOfSpeech": "noun",
"sense": [{
"@id": "example-en-1-n-1",
"synsetRef": "example-en-1-n",</code></pre>
<p>A sense may also have any number of <code>relations</code>, each of which has a <code>relType</code> from the list below and a <code>target</code>, and may have Dublin Core properties.</p>
<ul>
<li><p><code>antonym</code>: An opposite and inherently incompatible word</p></li>
<li><p><code>also</code>: See also, a reference of weak meaning</p></li>
<li><p><code>participle</code>: An adjective that is a participle form of a verb</p></li>
<li><p><code>pertainym</code>: A relational adjective. Adjectives that are pertainyms are usually defined by such phrases as “of or pertaining to” and do not have antonyms. A pertainym can point to a noun or another pertainym</p></li>
<li><p><code>derivation</code>: A word that is derived from some other word</p></li>
<li><p><code>domain_topic</code>: Indicates the category of this word</p></li>
<li><p><code>domain_member_topic</code>: Indicates a word involved in the category described by this word</p></li>
<li><p><code>domain_region</code>: Indicates the region of this word</p></li>
<li><p><code>domain_member_region</code>: Indicates a word involved in the region described by this word</p></li>
<li><p><code>exemplifies</code>: Indicates the usage of this word</p></li>
<li><p><code>is_exemplified_by</code>: Indicates a word involved in the usage described by this word</p>
<pre><code> "relations": [{
"relType": "derivation",
"target": "example-en-10161911-n-1",
"creator": "John McCrae"
}]
}]
}, {
"@id": "w3",
"lemma": { "writtenForm": "pay" },
"partOfSpeech": "verb",</code></pre></li>
</ul>
<p>The syntactic behavior of an entry is given as follows:</p>
<pre><code> "synBehavior": [
{"label": "Somebody ----s", "@id": "intransitive"},
{"label": "Somebody ----s somebody", "@id": "transitive"}
]
}],</code></pre>
<p>Synsets are listed under the <code>synset</code> property. A synset requires only an <code>@id</code>. It may take an <code>ili</code> which is a code from the CILI (starting with <code>ili:i</code>), a <code>definition</code>, an <code>iliDefinition</code> (which must be given in English), <code>status</code>, <code>confidenceScore</code>, <code>relations</code> and Dublin Core properties.</p>
<p>In contrast to the XML form, the <code>ili</code> is optional. If there is no match, omit this property; if you wish to propose a new synset, add only an <code>iliDefinition</code>.</p>
<pre><code> "synset": [{
"@id": "example-en-10161911-n",
"partOfSpeech": "noun",
"ili": "ili:i90287",</code></pre>
<p>Definitions must have a <code>gloss</code> and may have a <code>language</code>; in addition, <code>status</code>, <code>confidenceScore</code> and Dublin Core properties may be added. An <code>iliDefinition</code> is the same but may not have a language.</p>
<pre><code> "definition": [{
"gloss": "that which is perceived or known or inferred to have its own distinct existence (living or nonliving)"
}],</code></pre>
<p>Synset relations are given in the same way as sense relations, except that the <code>target</code> must be the <code>@id</code> of another synset, not a sense. The following relation types can be used:</p>
<ul>
<li><p><code>hypernym</code>: A concept with a broader meaning</p></li>
<li><p><code>hyponym</code>: A concept with a narrower meaning</p></li>
<li><p><code>instance_hypernym</code>: The class of objects to which this instance belongs</p></li>
<li><p><code>instance_hyponym</code>: An individual instance of this class</p></li>
<li><p><code>part_holonym</code>: A larger whole that this concept is part of</p></li>
<li><p><code>part_meronym</code>: A part of this concept</p></li>
<li><p><code>member_holonym</code>: A group that this concept is a member of</p></li>
<li><p><code>member_meronym</code>: A member of this concept</p></li>
<li><p><code>substance_holonym</code>: Something where a constituent material is this concept</p></li>
<li><p><code>substance_meronym</code>: A constituent material of this concept</p></li>
<li><p><code>entail</code>: A verb X entails Y if X cannot be done unless Y is, or has been, done.</p></li>
<li><p><code>cause</code>: A verb that causes another</p></li>
<li><p><code>similar</code>: Similar, though not necessarily interchangeable</p></li>
<li><p><code>also</code>: See also, a reference of weak meaning</p></li>
<li><p><code>attribute</code>: A noun for which adjectives express values. The noun weight is an attribute, for which the adjectives light and heavy express values.</p></li>
<li><p><code>domain_topic</code>: Indicates the category of this word</p></li>
<li><p><code>domain_member_topic</code>: Indicates a word involved in the category described by this word</p></li>
<li><p><code>domain_region</code>: Indicates the region of this word</p></li>
<li><p><code>domain_member_region</code>: Indicates a word involved in the region described by this word</p></li>
<li><p><code>exemplifies</code>: Indicates the usage of this word</p></li>
<li><p><code>is_exemplified_by</code>: Indicates a word involved in the usage described by this word</p>
<pre><code> "relations": [{
"relType": "hypernym", "target": "example-en-10162692-n"
}],</code></pre></li>
</ul>
<p>Indicate the members and the order they occur in:</p>
<pre><code> "members": ["example-en-10161911-n-1", "example-en-1-n-1"]
}, {
"@id": "example-en-1-n",
"partOfSpeech": "noun",
"definition": [{
"gloss": "the father of your father or mother"
}],
"iliDefinition": {
"gloss": "the father of your father or mother",
"source": "https://en.wiktionary.org/wiki/farfar"
},
"relations": [
{ "relType": "hypernym", "target": "example-en-10162692-n" }
]
}]
}, {
"@context": { "@language": "sv" },
"@id": "example-sv",
"@type": "lime:Lexicon",
"label": "Example wordnet (Swedish)",
"language": "sv",
"email": "john@mccr.ae",
"license": "https://creativecommons.org/publicdomain/zero/1.0/",
"version": "1.0",
"citation": "CILI: the Collaborative Interlingual Index. Francis Bond, Piek Vossen, John P. McCrae and Christiane Fellbaum, Proceedings of the Global WordNet Conference 2016, (2016).",
"url": "http://globalwordnet.github.io/schemas",
"publisher": "Global Wordnet Association",
"entry": [{
"@id" : "w4",
"lemma": { "writtenForm": "farfar" },
"form": [{ "writtenForm": "farfäder", "tag": [{ "category": "penn", "value": "NNS" }] }],
"partOfSpeech": "noun",</code></pre>
<p>Any examples should be given on the sense as follows:</p>
<pre><code> "sense": [{
"@id": "example-sv-2-n-1",
"synsetRef": "example-en-1-n",
"example": [{
"value": "Jag vill berätta för er att min farfar var svensk beredskapssoldat vid norska gränsen under andra världskriget, ett krig som Sverige stod utanför",
"source": "Europarl Corpus"
}]
}]
}]
}]
}</code></pre>
<p>The JSON format can be validated with the <a href="http://json-schema.org">JSON Schema</a> provided at <a href="https://github.com/globalwordnet/schemas/blob/master/wn-json-schema.json">https://github.com/globalwordnet/schemas/blob/master/wn-json-schema.json</a>.</p>
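<p>For a quick look at a few of the entry constraints described earlier (a required <code>@id</code>, <code>lemma</code> and <code>partOfSpeech</code>, unique identifiers, and senses with an <code>@id</code> and a <code>synsetRef</code>), a minimal Python sketch using only the standard library might look as follows. The function name <code>check_lexicon</code> and the wording of its messages are illustrative assumptions and not part of the format. The sketch covers only a handful of rules and is not a substitute for validating against the schema.</p>
<pre><code># Part-of-speech values listed for the "partOfSpeech" property.
ALLOWED_POS = {
    "noun", "verb", "adjective", "adverb", "adjective_satellite",
    "phrase", "conjunction", "adposition", "other", "unknown",
}


def check_lexicon(lexicon):
    """Return a list of problems found in one lexicon object (hypothetical helper)."""
    problems = []
    # An entry @id may not repeat the @id of the wordnet or of another entry.
    seen_ids = {lexicon.get("@id")}
    for entry in lexicon.get("entry", []):
        entry_id = entry.get("@id")
        if entry_id is None:
            problems.append("entry without @id")
        elif entry_id in seen_ids:
            problems.append(f"duplicate @id: {entry_id}")
        seen_ids.add(entry_id)
        if "writtenForm" not in entry.get("lemma", {}):
            problems.append(f"{entry_id}: lemma needs a writtenForm")
        if entry.get("partOfSpeech") not in ALLOWED_POS:
            problems.append(f"{entry_id}: partOfSpeech is missing or not allowed")
        for sense in entry.get("sense", []):
            if "@id" not in sense or "synsetRef" not in sense:
                problems.append(f"{entry_id}: sense needs @id and synsetRef")
    return problems


# One lexicon object reduced to a single entry from the example.
sample = {
    "@id": "example-en",
    "entry": [{
        "@id": "w1",
        "lemma": {"writtenForm": "father"},
        "partOfSpeech": "noun",
        "sense": [{
            "@id": "example-en-10161911-n-1",
            "synsetRef": "example-en-10161911-n",
        }],
    }],
}
print(check_lexicon(sample))  # prints []</code></pre>
<p>Returning a list of messages instead of raising on the first problem lets all issues in a file be reported in one pass.</p>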
<h1 id="rdf">RDF</h1>
<p>Two RDF vocabularies are available for encoding wordnets. The first, <code>wn-simple.ttl</code>, is based on the <a href="https://www.w3.org/TR/wordnet-rdf/">W3C RDF/OWL Representation of WordNet</a>. It is a straightforward RDF encoding of the original Princeton data model, in which synsets, word senses and words are the main classes. The current version adds new relations and provides additional axioms to reinforce consistency.</p>
<p>The second RDF schema is significantly more flexible and builds principally on the <a href="https://www.w3.org/2016/05/ontolex/">W3C OntoLex Model</a>. The details of the RDF serialization are principally built on those of the JSON-LD model. We include a separate tutorial here for the benefit of those who wish to create their resource natively in RDF.</p>
<p>The standard namespaces are:</p>
<pre><code>@prefix cc: <http://creativecommons.org/ns#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix ili: <http://ili.globalwordnet.org/ili/> .
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix synsem: <http://www.w3.org/ns/lemon/synsem#> .
@prefix vartrans: <http://www.w3.org/ns/lemon/vartrans#> .
@prefix wn: <https://globalwordnet.github.io/schemas/wn#> .
@prefix wordnetlicense: <http://wordnet.princeton.edu/wordnet/license/> .</code></pre>
<p>Each wordnet is an instance of the class <code>lime:Lexicon</code> and must have the following properties:</p>
<ul>
<li><code>rdfs:label</code>: A name for the wordnet.</li>
<li><code>dc:language</code>: The BCP 47 identifier for your language (usually a two-letter code).</li>
<li><code>schema:email</code>: An email address for the owner of the wordnet.</li>
<li><code>dc:rights</code>: A link to the license of the resource.</li>
<li><code>owl:versionInfo</code>: The version number of this resource.</li>
</ul>
<p>The mapping to the Lemon-OntoLex model is as follows:</p>
<ul>
<li>Words are instances of <code>ontolex:LexicalEntry</code>; they must have an <code>ontolex:canonicalForm</code> and a <code>wn:partOfSpeech</code>.</li>
<li>Senses are instances of <code>ontolex:LexicalSense</code>; they must have an <code>ontolex:reference</code>.</li>
<li>Synsets are instances of <code>ontolex:LexicalConcept</code>.</li>
<li>Definitions and examples are given by <code>skos:definition</code> and <code>skos:example</code>, optionally with an <code>rdf:value</code>.</li>
</ul>
<p>A more extended example is given here:</p>
<pre><code><#example-en> a lime:Lexicon ;
rdfs:label "Example wordnet (English)"@en ;
dc:language "en" ;
schema:email "john@mccr.ae" ;
cc:license <https://creativecommons.org/publicdomain/zero/1.0/> ;
owl:versionInfo "1.0" ;
schema:citation "CILI: the Collaborative Interlingual Index. Francis Bond, Piek Vossen, John P. McCrae and Christiane Fellbaum, Proceedings of the Global WordNet Conference 2016, (2016)." ;
schema:url "http://globalwordnet.github.io/schemas/" ;
dc:publisher "Global Wordnet Association" ;
lime:entry <#w1>, <#w2>, <#w3> .
<#w1> a ontolex:LexicalEntry ;
ontolex:canonicalForm [
ontolex:writtenRep "grandfather"@en
] ;
wn:partOfSpeech wn:noun ;
ontolex:sense <#example-en-10161911-n-1> .
<#example-en-10161911-n-1> a ontolex:LexicalSense ;
ontolex:reference <#example-en-10161911-n> .
<#w2> a ontolex:LexicalEntry ;
ontolex:canonicalForm [
ontolex:writtenRep "paternal grandfather"@en
] ;
wn:partOfSpeech wn:noun ;
ontolex:sense <#example-en-1-n-1> .
<#example-en-1-n-1> a ontolex:LexicalSense ;
ontolex:reference <#example-en-1-n> .
[] a ontolex:SenseRelation ;
vartrans:source <#example-en-1-n-1> ;
vartrans:category wn:derivation ;
vartrans:target <#example-en-10161911-n-1> ;
dc:creator "John McCrae"@en .
<#w3> a ontolex:LexicalEntry ;
ontolex:canonicalForm [
ontolex:writtenRep "pay"@en
] ;
wn:partOfSpeech wn:verb ;
synsem:synBehavior <#transitive>, <#intransitive> .
<#intransitive> rdfs:label "Somebody ----s"@en .
<#transitive> rdfs:label "Somebody ----s somebody"@en .
<#example-en-10161911-n> a ontolex:LexicalConcept ;
wn:partOfSpeech wn:noun ;
skos:inScheme <#example-en> ;
wn:ili ili:i90287 ;
wn:definition [
rdf:value "the father of your father or mother"@en
] ;
wn:memberList ( <#example-en-10161911-n-1> <#example-en-1-n-1> ) .
[]
vartrans:source <#example-en-10161911-n> ;
vartrans:category wn:hypernym ;
vartrans:target <#example-en-10162692-n> .
<#example-en-1-n> a ontolex:LexicalConcept ;
wn:partOfSpeech wn:noun ;
skos:inScheme <#example-en> ;
wn:definition [
rdf:value "the father of your father or mother"@en
] ;
wn:iliDefinition [
rdf:value "the father of your father or mother"@en ;
dc:source "https://en.wiktionary.org/wiki/farfar"
] .
[]
vartrans:source <#example-en-1-n> ;
vartrans:category wn:hypernym ;
vartrans:target <#example-en-10162692-n> .
<#example-sv> a lime:Lexicon ;
rdfs:label "Example wordnet (Swedish)"@sv ;
dc:language "sv" ;
schema:email "john@mccr.ae" ;
cc:license <https://creativecommons.org/publicdomain/zero/1.0/> ;
owl:versionInfo "1.0" ;
schema:citation "CILI: the Collaborative Interlingual Index. Francis Bond, Piek Vossen, John P. McCrae and Christiane Fellbaum, Proceedings of the Global WordNet Conference 2016, (2016)." ;
schema:url "http://globalwordnet.github.io/schemas" ;
dc:publisher "Global Wordnet Association" ;
lime:entry <#w4> .
<#w4> a ontolex:LexicalEntry ;
ontolex:canonicalForm [
ontolex:writtenRep "farfar"@sv
] ;
ontolex:otherForm [
ontolex:writtenRep "farfäder"@sv ;
wn:tag [
wn:category "penn" ;
rdf:value "NNS"
]
] ;
wn:partOfSpeech wn:noun ;
ontolex:sense <#example-sv-2-n-1> .
<#example-sv-2-n-1> a ontolex:LexicalSense ;
ontolex:reference <#example-en-1-n> ;
wn:example [
rdf:value "Jag vill berätta för er att min farfar var svensk beredskapssoldat vid norska gränsen under andra världskriget, ett krig som Sverige stod utanför"@sv ;
dc:source "Europarl Corpus"
] .</code></pre>
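<p>To make the correspondence between the JSON form and this RDF form concrete, the following Python sketch renders a single JSON entry as Turtle, using the same properties as the example above (<code>ontolex:canonicalForm</code>, <code>ontolex:writtenRep</code>, <code>wn:partOfSpeech</code>, <code>ontolex:sense</code> and <code>ontolex:reference</code>). The helper names <code>turtle_string</code> and <code>entry_to_turtle</code> are hypothetical, and the sketch assumes that identifiers are safe to use as fragment IRIs. It handles only the lemma, the part of speech and the senses, so it illustrates the mapping and is not a full converter.</p>
<pre><code>def turtle_string(text, lang):
    """Quote a Python string as a language-tagged Turtle literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def entry_to_turtle(entry, lang):
    """Render one JSON entry and its senses as Turtle text (illustrative only)."""
    senses = entry.get("sense", [])
    lines = [
        f"<#{entry['@id']}> a ontolex:LexicalEntry ;",
        "  ontolex:canonicalForm [",
        "    ontolex:writtenRep " + turtle_string(entry["lemma"]["writtenForm"], lang),
        "  ] ;",
    ]
    pos = f"  wn:partOfSpeech wn:{entry['partOfSpeech']}"
    if senses:
        # Link the entry to each of its senses in a single statement.
        refs = ", ".join(f"<#{s['@id']}>" for s in senses)
        lines += [pos + " ;", f"  ontolex:sense {refs} ."]
    else:
        lines.append(pos + " .")
    # Each sense points at its synset through ontolex:reference.
    for sense in senses:
        lines += [
            f"<#{sense['@id']}> a ontolex:LexicalSense ;",
            f"  ontolex:reference <#{sense['synsetRef']}> .",
        ]
    return "\n".join(lines)


entry = {
    "@id": "w2",
    "lemma": {"writtenForm": "paternal grandfather"},
    "partOfSpeech": "noun",
    "sense": [{"@id": "example-en-1-n-1", "synsetRef": "example-en-1-n"}],
}
print(entry_to_turtle(entry, "en"))</code></pre>
<p>Apart from indentation, the printed text should correspond to the <code><#w2></code> and <code><#example-en-1-n-1></code> statements in the example above. The prefix declarations still have to be placed at the top of the resulting file.</p>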
</div>
</body>
</html>