
Handling Token Limit Issues in Llama 3.2:3b-Instruct Model (2048 Tokens Max) #240

Open
pandiyan90 opened this issue Dec 10, 2024 · 0 comments


I'm using the Llama 3.2:3b-instruct model and I'm hitting the following error: "This model's maximum context length is 2048 tokens. However, you requested 2049 tokens (1681 in the messages, 368 in the completion)." I understand this happens because the request exceeds the token limit, but I'd like to know:

1. Why does this token limit exist, and is there a technical reason for this specific constraint?
2. Are there any best practices or techniques for reducing token usage without losing critical context in the messages or the completion?
3. Is there an official document or update from the developers regarding this token limit, or are there plans to increase it in future versions?
4. Are there alternative strategies (e.g., chunking, summarization, or other tricks) that others have used effectively with this model in similar scenarios? (A rough sketch of one such approach is included below.)

Any insights, guidance, or links to documentation would be greatly appreciated!
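Regarding the chunking question above, here is a minimal sketch of one common workaround: reserve a fixed budget for the completion, estimate the size of the prompt, and drop the oldest conversation turns until the request fits inside the context window. The token estimate and message layout below are illustrative assumptions, not code from this project; a real implementation would count tokens with the model's own tokenizer.

```python
# Minimal sketch: keep chat messages + requested completion inside a fixed
# context window. Names and numbers here are hypothetical, not taken from
# any specific library in this project.

CONTEXT_WINDOW = 2048        # the limit reported in the error message
COMPLETION_BUDGET = 368      # tokens reserved for the model's reply


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (~4 characters per token for English text).
    For accurate counts, use the model's actual tokenizer instead."""
    return max(1, len(text) // 4)


def trim_messages(messages, context_window=CONTEXT_WINDOW,
                  completion_budget=COMPLETION_BUDGET):
    """Drop the oldest non-system messages until the prompt fits within
    context_window - completion_budget tokens."""
    prompt_budget = context_window - completion_budget
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    def total(msgs):
        return sum(estimate_tokens(m["content"]) for m in msgs)

    while rest and total(system + rest) > prompt_budget:
        rest.pop(0)  # discard the oldest turn first
    return system + rest


if __name__ == "__main__":
    history = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "First question ..." * 200},
        {"role": "assistant", "content": "First answer ..." * 200},
        {"role": "user", "content": "Latest question that must be kept."},
    ]
    trimmed = trim_messages(history)
    print(f"kept {len(trimmed)} of {len(history)} messages")
```

Summarizing the dropped turns into a single short message, instead of discarding them outright, is a common refinement of the same idea when the older context still matters.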
