The format is the same as that of the GlobalWordNet LMF Schema. In addition, the following criteria should be followed when creating the resource:
There must be a lexical entry for each synset member in the same file as the synset. Its ID must have the form `ewn-lemma-p`, where
- `lemma` is the lemma of the word. Any spaces should be replaced by underscores (`_`). Other non-XML characters should be replaced with `-xx-`, where `xx` is a two-letter code (e.g., `-lp-` for `(` and `-rp-` for `)`)
- `p` is the part of speech, one of `n` (noun), `v` (verb), `a` (adjective), `s` (adjective satellite) or `r` (adverb)
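The lexical entry ID rule can be sketched in Python as follows. This is a minimal illustration, not part of the project's tooling: the function name `entry_id` is hypothetical, and only the two escape codes named above (`-lp-` and `-rp-`) are included, since the full code table is not reproduced here.

```python
# Only the escape codes mentioned in the text; the full table is an open assumption.
ESCAPES = {"(": "-lp-", ")": "-rp-"}
PARTS_OF_SPEECH = {"n", "v", "a", "s", "r"}


def entry_id(lemma: str, pos: str) -> str:
    """Build a lexical entry ID of the form ewn-lemma-p (hypothetical helper)."""
    if pos not in PARTS_OF_SPEECH:
        raise ValueError(f"unknown part of speech: {pos!r}")
    # Spaces become underscores, then each known special character is escaped.
    escaped = "".join(ESCAPES.get(ch, ch) for ch in lemma.replace(" ", "_"))
    return f"ewn-{escaped}-{pos}"


print(entry_id("hot dog", "n"))
```

Characters outside the small table are passed through unchanged in this sketch, so a real implementation would need the complete list of two-letter codes.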
Senses must have identifiers that correspond to their lexical entry and are of the form `ewn-lemma-p-XXXXXXXX-YY`, where
- `lemma` and `p` are as for lexical entries
- `XXXXXXXX` is the offset code from Princeton WordNet's 3.1 release (see below for novel synsets)
- `YY` is the position of the word in the synset as a two-digit decimal code. Values should start from `01` and must exist for all consecutive values up to the size of the synset

In addition, senses should have a `dc:identifier` that gives their sense identifier as in Princeton WordNet.
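As a rough sketch of the sense ID rule, the following Python builds a sense ID and checks that the positions within one synset run consecutively from `01`. The helper names `sense_id` and `missing_positions` are hypothetical, and the offset `12345678` is a made-up placeholder rather than a real Princeton WordNet offset.

```python
import re

# ewn-lemma-p-XXXXXXXX-YY; the lemma itself may contain hyphens (e.g. -lp-).
SENSE_ID = re.compile(r"^(ewn-.+-[nvasr])-(\d{8})-(\d{2})$")


def sense_id(entry_id: str, offset: str, position: int) -> str:
    """Append the synset offset and a two-digit position to a lexical entry ID."""
    return f"{entry_id}-{offset}-{position:02d}"


def missing_positions(sense_ids: list[str]) -> list[int]:
    """Return the positions in 1..len(sense_ids) that no sense ID uses."""
    positions = set()
    for sid in sense_ids:
        match = SENSE_ID.match(sid)
        if match is None:
            raise ValueError(f"malformed sense ID: {sid!r}")
        positions.add(int(match.group(3)))
    return sorted(set(range(1, len(sense_ids) + 1)) - positions)


print(sense_id("ewn-dog-n", "12345678", 1))
```

An empty list from `missing_positions` means every position from `01` up to the size of the synset is present.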
Synset relations are followed by a comment giving all members of the synset.
Synsets have an identifier of the form `ewn-XXXXXXXX-p`, where
- `XXXXXXXX` is the offset code from Princeton WordNet's 3.1 release. For novel synsets the code should start with a `2`, and numbers should be assigned in increasing order
- `p` is the part of speech (see lexical entries)
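One possible way to pick a code for a novel synset is sketched below. It assumes that "increasing" means one more than the highest code already starting with `2`, and that the first such code is `20000000`; both are interpretations rather than rules stated here, and the function names are hypothetical.

```python
def synset_id(offset: str, pos: str) -> str:
    """Build a synset ID of the form ewn-XXXXXXXX-p."""
    return f"ewn-{offset}-{pos}"


def next_novel_offset(existing_offsets: list[str]) -> str:
    """Return the next unused 8-digit code in the range starting with 2."""
    novel = [int(o) for o in existing_offsets if o.startswith("2")]
    # With no novel synsets yet, start at 20000000 (an assumption).
    return f"{max(novel, default=19999999) + 1:08d}"


offset = next_novel_offset(["00001740", "20000003"])
print(synset_id(offset, "n"))
```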
Synsets should also have an `ili` link. When defining novel concepts, please give the value `ili="in"`.
Synsets should have a `partOfSpeech` and a `dc:subject`, which corresponds to the name of the file being defined.
All members of a synset should be listed in a comment before the synset
Synset relations are followed by a comment giving all members of the target synset.
Symmetric pairs (`hypernym` and `hyponym`) should both be added when defining the synset relation.
Examples typically start and end with `"`. If a source is needed for an example, please include it after the final `"`, preceded by two dashes, e.g.,
<Example>"the harder the conflict the more glorious the triumph"--Thomas Paine</Example>
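The quoting convention for examples can be illustrated with a small Python helper. The name `format_example` is hypothetical; the sketch uses the standard library's `xml.sax.saxutils.escape` so that `&`, `<` and `>` in the text do not break the XML.

```python
from typing import Optional
from xml.sax.saxutils import escape


def format_example(text: str, source: Optional[str] = None) -> str:
    """Wrap an example in quotes, append an optional source after two dashes."""
    body = f'"{text}"'
    if source:
        body += f"--{source}"
    return f"<Example>{escape(body)}</Example>"


print(format_example(
    "the harder the conflict the more glorious the triumph",
    "Thomas Paine",
))
```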
- Please use two spaces as indents
- Use self-closing tags whenever possible
- Use minimal whitespace