For a better user experience, refer to the official web documentation -> Language Model
- Recommended Models
Model Name | Module Introduction |
---|---|
Word embedding model | Chinese word embeddings pre-trained with Word2vec on a massive Baidu search dataset. Fine-tuning is supported. The vocabulary of the pre-training dataset contains 1,700,249 entries, and the embedding dimension is 128. |
Text similarity | Computes a similarity score for two texts entered by the user. |
ERNIE | A self-developed model trained on Chinese corpora such as encyclopedia, news, and forum dialogue data. It can be used for tasks such as text classification, sequence labeling, and reading comprehension. |
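
The table does not say how the text similarity score is computed. As a rough illustration only, one common approach is to average the word vectors of each text and take the cosine of the two averages. The sketch below assumes that approach; the function names and the 3-dimensional toy vectors are hypothetical and are not part of any module listed above (the real word embedding model uses 128 dimensions).

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def average_vector(tokens, embeddings, dim):
    """Mean of the vectors of the tokens found in `embeddings`; zeros if none are found."""
    found = [embeddings[t] for t in tokens if t in embeddings]
    if not found:
        return [0.0] * dim
    return [sum(column) / len(found) for column in zip(*found)]


def text_similarity(tokens_a, tokens_b, embeddings, dim):
    """Similarity of two tokenized texts as the cosine of their averaged word vectors."""
    vec_a = average_vector(tokens_a, embeddings, dim)
    vec_b = average_vector(tokens_b, embeddings, dim)
    return cosine_similarity(vec_a, vec_b)


# Hypothetical toy embeddings, made up for illustration (not real model output).
TOY_EMBEDDINGS = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.9, 0.1, 0.0],
    "car": [0.0, 0.0, 1.0],
}

if __name__ == "__main__":
    print(text_similarity(["cat"], ["dog"], TOY_EMBEDDINGS, dim=3))
    print(text_similarity(["cat"], ["car"], TOY_EMBEDDINGS, dim=3))
```

With these toy vectors, "cat" and "dog" point in nearly the same direction and score close to 1, while "cat" and "car" are orthogonal and score 0. A production module may use a different method entirely, so treat this only as a way to read the score.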