
How do I resolve this: Token indices sequence length is longer than the specified maximum sequence length for this model (781 > 512)? #62

Open
starxuh opened this issue Jun 13, 2024 · 1 comment

Comments

@starxuh

starxuh commented Jun 13, 2024

I'm seeing the following warnings:
Token indices sequence length is longer than the specified maximum sequence length for this model (781 > 512). Running this sequence through the model will result in indexing errors
You're using a XLMRobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
How can this be resolved? Is it the model's token limit?

@shenlei1020
Collaborator

It's fine and doesn't affect the results. This is only a warning, and the BCEmbedding Python package already handles it internally.
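For reference, a minimal stdlib-only sketch of one common way embedding packages handle over-length inputs (splitting the token sequence into chunks that fit the model limit, then pooling the per-chunk embeddings). This is an illustration, not the actual BCEmbedding internals; the helper name `chunk_token_ids` is hypothetical.

```python
def chunk_token_ids(token_ids, max_length=512, overlap=0):
    """Split a token-id sequence into chunks no longer than max_length.

    Hypothetical helper illustrating why the 781 > 512 warning is
    harmless when the package chunks inputs: each chunk respects the
    model limit, and the per-chunk embeddings can be pooled afterwards.
    """
    if max_length <= overlap:
        raise ValueError("max_length must exceed overlap")
    step = max_length - overlap
    return [token_ids[i:i + max_length] for i in range(0, len(token_ids), step)]

# Example: a 781-token sequence becomes two chunks, each <= 512 tokens.
ids = list(range(781))
chunks = chunk_token_ids(ids, max_length=512)
print([len(c) for c in chunks])  # [512, 269]
```

If you call a Hugging Face tokenizer directly, passing `truncation=True, max_length=512` to its `__call__` method also keeps the sequence within the model limit and silences the warning, at the cost of dropping tokens past position 512.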
