Maximum context length exceeded after browse_website
#796
Comments
+1 from my side - I'm facing this problem very often too
It has to do with when it's trying to create the embeddings. I went down a rabbit hole trying to chunk the embeddings, which got past that error, but then AutoGPT got into a never-ending loop: it was loading too much context and having to prune it seemingly forever. I think we need a better solution for reading, summarizing, and reviewing large documentation, but I haven't come up with a good one yet. I had already started going down the path of model training to deal with this before AutoGPT was released, and I think that will probably end up being the correct solution. Any suggestions on how best to implement it?
+1 here, and can't find a way to solve it
Same here. The prompt is too large when the AI uses `COMMAND = read_file ARGUMENTS = {'file': 'file_name.txt'}` and the file is too large; I get a `Traceback (most recent call last):` error.
Same problem. I am trying to get it to read a text file, and there are just a few lines in the file, so I know it is not too much text. The text file is in the workspace folder. It just won't read the file, even though it can create files there.
I think read_file may need to be extended, or potentially broken out into multiple sub-functions. Things as simple as appending two or more files together could be handled by a separate command that just uses the native OS and filesystem, which would be way faster and cheaper. As for reading files into memory, there are a couple of different approaches. One is to use embeddings and a vector DB like Pinecone, with a different way to chunk large files, and load them directly into memory, bypassing the context; a rough sketch of this follows below. The other option could be to have the AI generate its own commands to actually train/load a custom model with the documents. Depending on the type of data, both have their pros and cons. I'm currently thinking through this for documenting an SDK that is contained entirely within a number of PDFs. I tried some experiments with auto-chunking the data and storing it in the Pinecone DB, but still ran into token-limit issues trying to pass enough data into the model to get much useful output. Can everyone share their expected use cases, so we can start looking at which method would work best and automate the logic for deciding?
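A minimal sketch of that first approach, assuming the pre-1.0 `openai` client and the original `pinecone-client` API; the index name, environment, and chunk size are placeholders I made up:

```python
import openai
import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="us-east-1-aws")  # placeholder credentials
index = pinecone.Index("autogpt-docs")  # hypothetical index name

def embed(text: str) -> list:
    """Embed a chunk of text with OpenAI's ada-002 embedding model."""
    resp = openai.Embedding.create(input=text, model="text-embedding-ada-002")
    return resp["data"][0]["embedding"]

def store_file(path: str, chunk_size: int = 2000) -> None:
    """Split a file into fixed-size character chunks and upsert the vectors."""
    text = open(path, encoding="utf-8").read()
    chunks = [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]
    index.upsert(vectors=[
        (f"{path}-{n}", embed(chunk), {"text": chunk})
        for n, chunk in enumerate(chunks)
    ])

def retrieve(question: str, top_k: int = 3) -> list:
    """Fetch only the chunks relevant to a question, bypassing the context limit."""
    res = index.query(vector=embed(question), top_k=top_k, include_metadata=True)
    return [match["metadata"]["text"] for match in res["matches"]]
```

The retrieval step is what keeps the prompt small: only the top few relevant chunks ever get passed back into the model.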
Warren
I noticed the same thing, and not only for large text files; I got the following error in two distinct situations as well. I can understand that GPT is fed too many tokens by the outputs of those calls, but I would appreciate it if the exception were handled without shutting Auto-GPT down. Currently it stops the whole execution, and I have to start my project over, hoping that 30 minutes in it won't exit again.
I am not an expert and I just checked out OpenAI's cookbook; I think the commit below could be a good starting point for a workaround:
It could indeed. I would welcome it if you could post it as a PR and get the other devs involved!
I tried restarting AutoGPT, but the error still appears. Does anyone know how to get rid of it?
Restarting Auto-GPT won't help as this is a bug. If you carefully read the error message you'll find that Auto-GPT chokes on strings that exceed a particular length.
It seems the problem might be inside the `split_text` function:

```python
def split_text(text, max_length=3800, max_chunks=5):
    """Split text into chunks of a maximum length and limit the number of chunks"""
    paragraphs = text.split("\n")
    current_length = 0
    current_chunk = []
    chunk_count = 0
    for paragraph in paragraphs:
        if chunk_count >= max_chunks:
            break
        if current_length + len(paragraph) + 1 <= max_length:
            current_chunk.append(paragraph)
            current_length += len(paragraph) + 1
        else:
            yield "\n".join(current_chunk)
            current_chunk = [paragraph]
            current_length = len(paragraph) + 1
            chunk_count += 1
    if current_chunk:
        yield "\n".join(current_chunk)
```
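For context, a rough usage sketch, assuming the pre-1.0 `openai` Python client with `gpt-3.5-turbo` as a placeholder model: each chunk from `split_text` is summarized separately so no single request exceeds the context window.

```python
import openai

def summarize_large_text(text: str) -> str:
    """Summarize a large text chunk-by-chunk using split_text (defined above)."""
    summaries = []
    for chunk in split_text(text):
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",  # placeholder model name
            messages=[{
                "role": "user",
                "content": f"Summarize the following text:\n{chunk}",
            }],
        )
        summaries.append(resp["choices"][0]["message"]["content"])
    # Combine the per-chunk summaries into a single result.
    return "\n".join(summaries)
```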
It might be that there are different problems with a similar error. I applied your proposed change, but my error persists. To reproduce:

it will crash with a
I see, thanks for checking this. I guess it's best to see what happens with the already-opened PRs. I've also tried the option from #2088; it works for me, for now. P.S. One of the problems here is that the function and the PR above focus on splitting text into chunks with a particular character count, while the error is about a limit in tokens. According to Google, "for text in English, 1 token is approximately 4 characters or 0.75 words". Does that mean that to make this really stable, one will have to tokenize the text on the client side?
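For what it's worth, client-side tokenization is possible with OpenAI's `tiktoken` library. A minimal sketch of splitting by actual token count rather than by characters (the chunk size is a placeholder):

```python
import tiktoken

def split_by_tokens(text: str, max_tokens: int = 3800, model: str = "gpt-3.5-turbo"):
    """Split text into chunks of at most max_tokens model tokens."""
    enc = tiktoken.encoding_for_model(model)
    tokens = enc.encode(text)
    for i in range(0, len(tokens), max_tokens):
        # Decode each slice of tokens back into a text chunk.
        yield enc.decode(tokens[i : i + max_tokens])
```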
This error still happens with browse_website when the target website uses UTF-8 and contains Arabic/Persian characters.
Duplicates
Steps to reproduce 🕹
When it goes through long documentation, it seems to reach a cap. It is trying to summarize a 486,620-character text and gives this error. I'm not sure I'm reporting this correctly; please let me know what I can do to improve future bug reports.
Current behavior 😯
```
Command browse_website returned: Error: This model's maximum context length is 4097 tokens. However, your messages resulted in 5314 tokens. Please reduce the length of the messages.
```
Expected behavior 🤔
Split the text so it can be summarized within the model's 4097-token limit.
Your prompt 📝