Elementary discourse unit segmentation #225
An elementary discourse unit (EDU) is a linguistic unit that is larger than a word and smaller than a sentence. Each EDU conveys one piece of information. EDUs and the relations between them are the basis for constructing discourse structure.
Listed here are papers that discuss how to do Thai EDU segmentation computationally:
We may start this by compiling a list of Thai discourse markers.
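As a rough illustration of how such a marker list could be used, here is a minimal sketch of a rule-based baseline that starts a new candidate EDU at each discourse marker. The function name `segment_by_markers` and the three-item marker list are hypothetical and only for illustration; they are not an existing PyThaiNLP API or a compiled resource. The sketch matches markers as raw substrings, so it will over-split where a marker string happens to occur inside a longer word; a real implementation would probably run on word-tokenized text.

```python
import re

# Hypothetical, illustrative list (because, but, therefore).
# A real list would have to be compiled and validated first.
DISCOURSE_MARKERS = ["เพราะ", "แต่", "ดังนั้น"]


def segment_by_markers(text, markers=DISCOURSE_MARKERS):
    """Split text into candidate EDUs, starting a new unit at each marker."""
    # Try longer markers first so they are preferred over shorter ones.
    ordered = sorted(markers, key=len, reverse=True)
    alternation = "|".join(re.escape(m) for m in ordered)
    # Zero-width lookahead: split just before a marker, so the marker
    # stays attached to the unit it introduces (requires Python 3.7+).
    parts = re.split(f"(?={alternation})", text)
    # Drop empty pieces, e.g. when the text itself begins with a marker.
    return [p.strip() for p in parts if p.strip()]


if __name__ == "__main__":
    for edu in segment_by_markers("ฝนตกหนักแต่เขายังออกไปทำงานเพราะมีประชุม"):
        print(edu)
```

Keeping the marker at the start of the following unit is a design choice made here for simplicity; whether markers belong to the preceding or the following EDU would depend on the annotation scheme adopted.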
Related to #73 (Sentence tokenizer for Thai)