consolidating vocabulary data relevant to the Greek Learner Texts Project

greek-learner-texts/vocabulary-data

vocabulary-data

consolidating vocabulary data relevant to the Greek Learner Texts Project

Sources

These GitHub repositories:

and this spreadsheet:

and also:

The raw data from these sources can be found in raw/.

Terminology

We use the terms lexeme, entry, lexical item, vocabulary item, or just item interchangeably to refer to the object being distinguished, including all its properties.

We use the terms lemma, headword, dictionary form, citation form, or just key interchangeably to refer to the string that identifies the lexical item.

Sometimes a distinction is made between the bare form used as a lemma (e.g. λόγος) and the full headword or citation form (e.g. λόγος, ου, ὁ). There are also cases where a numeral or gloss is added to distinguish homographs.
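To make the distinction between an item and its key concrete, here is a minimal Python sketch. It is not code from this repository: the class name `LexicalItem`, its fields, the `from_headword` helper, and the `"lemma (disambiguator)"` key format are all assumptions made for illustration, as is the convention that the bare lemma is whatever precedes the first comma of the full citation form.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LexicalItem:
    """Hypothetical model of a lexical item: the object with all its properties."""
    lemma: str                            # bare form, e.g. "λόγος"
    headword: Optional[str] = None        # full citation form, e.g. "λόγος, ου, ὁ"
    disambiguator: Optional[str] = None   # numeral or gloss separating homographs
    properties: dict = field(default_factory=dict)  # gloss, part of speech, etc.

    @property
    def key(self) -> str:
        """The string that identifies the item (assumed format, not a project standard)."""
        # NFC-normalize so visually identical Greek strings compare equal.
        base = unicodedata.normalize("NFC", self.lemma)
        if self.disambiguator:
            return f"{base} ({self.disambiguator})"
        return base


def from_headword(headword: str, disambiguator: Optional[str] = None) -> LexicalItem:
    """Derive the bare lemma from a full citation form (assumes comma-separated parts)."""
    lemma = headword.split(",", 1)[0].strip()
    return LexicalItem(lemma=lemma, headword=headword, disambiguator=disambiguator)


item = from_headword("λόγος, ου, ὁ")
print(item.lemma)     # bare form
print(item.headword)  # full citation form
print(item.key)       # identifying string
```

Keeping `key` as a derived property rather than a stored field means two sources that disagree on the citation form can still be compared on the bare lemma, at the cost of needing a disambiguator wherever homographs occur.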

Also see Linking Lexical Resources for Biblical Greek for a talk from SBL 2017 on additional issues and concepts.

Categories of Data

  • We just have an unordered list of items with no additional properties (so effectively just the keys).

  • We have an unordered list of items, but they contain additional properties (perhaps a gloss, part of speech, inflectional class, semantic domain, etc.).

  • We have some mapping between the items in two lists which might differ in the key used in each list.

  • We have corpus-specific frequency information on each item.

  • We have an ordering (either explicit in the data or derived by sorting on some property, whether that is lexicographical sorting of the key, frequency in a particular corpus, or something else).
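The five categories above can be sketched with plain Python data structures. This is an illustration under stated assumptions, not the layout used in raw/: the variable names, the `bare` helper, and the toy lemma sequence are all hypothetical, and the frequencies are counted from that toy sequence rather than from any real corpus.

```python
from collections import Counter

# 1. An unordered list of items with no properties: effectively just the keys.
keys_a = ["λόγος", "θεός", "καί"]

# 2. Items with additional properties, here keyed by full citation form.
items_b = {
    "λόγος, ου, ὁ": {"pos": "noun"},
    "θεός, οῦ, ὁ": {"pos": "noun"},
    "καί": {"pos": "conjunction"},
}


def bare(headword: str) -> str:
    """Reduce a citation form to its bare lemma (assumes comma-separated parts)."""
    return headword.split(",", 1)[0].strip()


# 3. A mapping between two lists whose keys differ (bare lemma vs. citation form).
a_to_b = {bare(headword): headword for headword in items_b}
unmapped = [key for key in keys_a if key not in a_to_b]

# 4. Corpus-specific frequency, counted from a toy lemma sequence (not real data).
toy_corpus_lemmas = ["καί", "λόγος", "καί", "θεός", "καί", "λόγος"]
frequency = Counter(toy_corpus_lemmas)

# 5. An ordering derived from a property: descending frequency, ties broken by key.
#    Plain string comparison sorts by code point, which is not true Greek collation.
by_frequency = sorted(keys_a, key=lambda key: (-frequency[key], key))

print(a_to_b)
print(unmapped)
print(by_frequency)
```

Checking `unmapped` after building the mapping is a cheap way to spot items whose keys differ by more than the citation-form suffix, for example where one list adds a numeral or gloss to separate homographs.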
