url loader max token #661

Open
michelson opened this issue Jun 6, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@michelson
Contributor

michelson commented Jun 6, 2024

Describe the bug
I'm getting the error `This model's maximum context length is 8191 tokens, but the given text is 56273 tokens long.`

To Reproduce
Steps to reproduce the behavior:
Load a URL, for example https://vadb.org/scenes/cordoba.
The resulting chunk is larger than the accepted size; it warns and then fails (Pinecone store):

Created a chunk of size 210763, which is longer than the specified 1000
[ActiveJob] [Goai::WebsiteProcessorJob] [dd6a9549-dd34-4430-a510-fc2d4a2208f2] This model's maximum context length is 8191 tokens, but the given text is 56273 tokens long.

Expected behavior
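The content should be split into chunks that respect the specified chunk size (1000 here), so no single chunk exceeds the model's context length.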
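
To make the expectation concrete, here is a minimal sketch in plain Ruby of the behavior I have in mind. It is not Langchain.rb's chunker: `split_with_hard_limit` and its `max_chars` parameter are hypothetical names used only for illustration. The idea is to split on whitespace where possible and fall back to a hard cut when a single run of text (for example, markup with no whitespace) is still longer than the limit.

```ruby
# Hypothetical helper, not part of Langchain.rb.
# Splits text on whitespace where possible and hard-cuts any piece that
# still exceeds max_chars, so no chunk is ever longer than the limit.
def split_with_hard_limit(text, max_chars: 1000)
  chunks = []
  current = +""

  # The capture group keeps the whitespace separators in the result,
  # so joining the chunks reproduces the original text.
  text.split(/(\s+)/).each do |piece|
    # A single piece longer than the limit is cut into fixed-size slices.
    if piece.length > max_chars
      chunks << current unless current.empty?
      current = +""
      piece.scan(/.{1,#{max_chars}}/m) { |slice| chunks << slice }
      next
    end

    # Start a new chunk when adding this piece would exceed the limit.
    if current.length + piece.length > max_chars
      chunks << current
      current = +""
    end
    current << piece
  end

  chunks << current unless current.empty?
  chunks
end

chunks = split_with_hard_limit("a" * 2500, max_chars: 1000)
puts chunks.map(&:length).inspect # => [1000, 1000, 500]
```

The limit here is in characters, which is what the warning reports; a token-based limit would need a tokenizer and is outside this sketch.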

Terminal commands & output

WARN -- : Created a chunk of size 210763, which is longer than the specified 1000
[ActiveJob] This model's maximum context length is 8191 tokens, but the given text is 56273 tokens long.
RuntimeError (An error occurred: This model's maximum context length is 8191 tokens, but the given text is 56273 tokens long.):

Desktop (please complete the following information):

  • OS: [e.g. OS X, Linux, Ubuntu, Windows]
  • Ruby version 3.2
  • Langchain.rb version 0.13.0
@michelson michelson added the bug Something isn't working label Jun 6, 2024
@andreibondarev
Collaborator

@michelson Could you please show me the code you're running?

@raulalexe

I hit the same issue when loading a text file with the code below, on Ruby 3.1 with gem version 0.14.0:
client.add_data(paths: [my_long_file])
