This repository shares the Longman Communication Dictionary, from the top 1000 to the top 9000 words, in simple TSV and JSON formats.
The Longman Communication list covers the most frequent words in both spoken and written English, based on statistical analysis of the 390 million words contained in the Longman Corpus Network, a group of corpora (databases) of authentic English language. The Longman Communication 3000 represents the core of the English language and shows students of English which words are the most important for them to learn and study in order to communicate effectively in both speech and writing.
Analysis of the Longman Corpus Network shows that these 3000 most frequent words in spoken and written English account for 86% of the language. This means that by knowing this list of words, a learner of English is in a position to understand 86% or more of what he or she reads. Of course, “knowing” a word involves more than simply being able to recognise it and knowing one of its main meanings. Many of the most frequent words have a range of different meanings, a variety of different grammatical patterns, and numerous significant collocations.
Nonetheless, a basic understanding of the Longman Communication is a very powerful tool and will help students develop good comprehension and communication skills in English.
Each folder holds the files corresponding to the top x thousand words in spoken and written communication. For example, Longman-Communication-9000 is the folder containing the files for the Longman Communication 9000 words.
Within each folder, you can get words in various formats:
- A .tsv file has one entry per line, in which a word, its quadrant, and its parts of speech are separated by tabs.
- There are several .json files that hold dictionaries (the data structure) in which words are grouped:
  - by starting letter
  - by quadrant
  - by part of speech, etc.
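
As a rough illustration of how the .tsv layout relates to the grouped .json files, here is a minimal Python sketch. It assumes the column order described above (word, quadrant, parts of speech). The sample rows, the quadrant labels `q1`/`q2`, the part-of-speech tags, and the function names `read_entries` and `group_words` are all made up for this example and are not taken from the repository.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows in the assumed column order: word, quadrant, parts of speech.
# The real files may use different quadrant labels and part-of-speech tags.
SAMPLE_TSV = "apple\tq2\tn\nrun\tq1\tv\nquick\tq2\tadj\n"


def read_entries(handle):
    """Parse tab-separated lines into a list of dicts, skipping malformed rows."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    entries = []
    for row in reader:
        if len(row) < 3:
            continue  # blank or incomplete line
        word, quadrant, pos = (field.strip() for field in row[:3])
        if not word:
            continue  # no word in the first column
        entries.append({"word": word, "quadrant": quadrant, "pos": pos})
    return entries


def group_words(entries, key):
    """Group words into a dict of lists, keyed by key(entry)."""
    groups = defaultdict(list)
    for entry in entries:
        groups[key(entry)].append(entry["word"])
    return dict(groups)


entries = read_entries(io.StringIO(SAMPLE_TSV))
by_letter = group_words(entries, lambda e: e["word"][0].lower())
by_quadrant = group_words(entries, lambda e: e["quadrant"])
by_pos = group_words(entries, lambda e: e["pos"])
```

To try this on a real file, replace the `io.StringIO(SAMPLE_TSV)` argument with a handle from `open(path, newline="", encoding="utf-8")`, where `path` points at the .tsv file in one of the folders. If a word carries more than one part of speech, the third column is kept here as a single string; splitting it would depend on the separator the files actually use.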