A monolingual parallel corpus for sentence simplification

tmu-nlp/sscorpus

sscorpus: A monolingual parallel corpus for sentence simplification

This corpus contains 492,993 aligned sentences extracted by pairing Simple English Wikipedia with English Wikipedia. These source data were downloaded in May 2016.

Each line in the corpus consists of three tab-separated fields: original sentence <TAB> simple sentence <TAB> similarity score
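The line format above can be read with a few lines of Python. The following is a minimal sketch, not part of the corpus distribution. It assumes the file is UTF-8 text, that the similarity score parses as a floating-point number, and that sentences contain no tab characters. The names `AlignedPair`, `parse_line`, `read_corpus`, and `min_similarity` are hypothetical and introduced only for illustration.

```python
from typing import Iterable, Iterator, NamedTuple


class AlignedPair(NamedTuple):
    original: str      # first field: the original sentence
    simple: str        # second field: the simple sentence
    similarity: float  # third field: similarity score (assumed to be numeric)


def parse_line(line: str) -> AlignedPair:
    """Split one corpus line into its three tab-separated fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 3:
        raise ValueError(f"expected 3 tab-separated fields, got {len(fields)}")
    original, simple, score = fields
    return AlignedPair(original, simple, float(score))


def read_corpus(lines: Iterable[str],
                min_similarity: float = 0.0) -> Iterator[AlignedPair]:
    """Yield aligned pairs whose score is at least min_similarity."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        pair = parse_line(line)
        if pair.similarity >= min_similarity:
            yield pair
```

To use it, open the corpus file with `open(path, encoding="utf-8")` and pass the file object to `read_corpus`. Raising `min_similarity` keeps only the more confidently aligned pairs. A suitable threshold is not stated here and would need to be chosen by inspection.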

For questions, please contact Tomoyuki Kajiwara at Tokyo Metropolitan University.
