AgainsTurb/Chinese-Keywords-Extraction-Dataset

Chinese Sentence–Keyword Dataset for Keyword Generation

This dataset is designed for training and evaluating Chinese keyword generation models based on sequence-to-sequence architectures such as BART.


Dataset Format

The dataset is stored as a single JSON file containing one object. Each key is a Chinese sentence (a user comment or post), and each value is a string of space-separated semantic keywords.

{
  "这款手机性价比很高,值得购买。": "性价比 购买 值得",
  "我认为未来新能源车会成为主流。": "新能源车 主流 趋势",
  ...
}
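A minimal sketch of how a file in this format might be read into (sentence, keyword list) pairs, for example before tokenizing them for a sequence-to-sequence model. The file name `dataset.json` and the helper names `parse_entries` and `load_pairs` are assumptions made for illustration; the repository does not specify them.

```python
import json


def parse_entries(raw):
    """Convert {sentence: "kw1 kw2 ..."} into a list of (sentence, [kw1, kw2, ...]) pairs."""
    # str.split() with no arguments splits on any run of whitespace,
    # so extra spaces between keywords are tolerated.
    return [(sentence, keywords.split()) for sentence, keywords in raw.items()]


def load_pairs(path):
    """Read the JSON file at `path` (hypothetical name, e.g. "dataset.json") and parse it."""
    with open(path, encoding="utf-8") as f:
        return parse_entries(json.load(f))


# Small in-memory example using one entry in the format shown above.
sample = json.loads('{"我认为未来新能源车会成为主流。": "新能源车 主流 趋势"}')
pairs = parse_entries(sample)
for sentence, keywords in pairs:
    print(sentence, keywords)
```

Splitting the value into a list is optional: for training a model such as BART, the original space-separated string could also be used directly as the target sequence.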
