For a better user experience, refer to the official web documentation -> Language Model

Language Model

  • Recommended Model

| Model Name | Module Introduction |
| --- | --- |
| Word embedding model | Chinese word embeddings pre-trained on a massive Baidu search dataset. Fine-tuning is supported. The vocabulary of the Word2vec pre-training dataset contains 1700249 entries, and the word embedding dimension is 128. |
| Text similarity | Calculates a similarity score for the two texts entered by a user. |
| ERNIE | A self-developed model trained on Chinese corpora such as encyclopedia, information, and forum dialogue data. It can be used for tasks such as text classification, sequence labeling, and reading comprehension. |
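
To give a feel for how word embeddings can feed a text similarity score, here is a minimal sketch in plain Python. It is not the method used by the modules listed above: the averaging-plus-cosine approach, the function names, and the 3-dimensional toy vectors are all assumptions for illustration (the real embeddings are 128-dimensional and come from the pre-trained model).

```python
import math

def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def text_similarity(tokens_a, tokens_b, table):
    """Score two tokenized texts by comparing their averaged embeddings."""
    vec_a = mean_vector([table[t] for t in tokens_a])
    vec_b = mean_vector([table[t] for t in tokens_b])
    return cosine_similarity(vec_a, vec_b)

# Hypothetical toy embedding table (3 dimensions instead of 128).
toy_embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "kitten": [0.9, 0.1, 0.0],
    "car": [0.0, 0.0, 1.0],
}

score_close = text_similarity(["cat"], ["kitten"], toy_embeddings)
score_far = text_similarity(["cat"], ["car"], toy_embeddings)
print(score_close > score_far)
```

Averaging is the simplest way to turn per-word vectors into a per-text vector; a fine-tuned model such as ERNIE would typically replace both the lookup table and the averaging step.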