Consistent: "No codeblocks detected in LLM response" for several files (#350)
I'm seeing this consistently with Bedrock when updating a big file. For the source-code diff to render properly in the IDE, I need the file in full, so I explicitly added to the prompt that I wanted the updated file in full, and the response never has enough room to give it to me.
@dymurray when you access via Bedrock, which model did you see issues with? I have used claude 3.5 sonnet and seen issues. To date we've done more testing with llama3 and mixtral and not much with claude 3.5 sonnet. I have two initial thoughts:
1. It's very likely our issue comes from not adapting the prompt sufficiently for Claude.
2. We can likely get more info on the context size by looking at the response metadata. Example:
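A minimal sketch of what that could look like, assuming `langchain_aws`'s `ChatBedrock` and LangChain's standard message fields (the exact metadata keys vary by provider and model, so treat these as assumptions):

```python
from langchain_aws import ChatBedrock

llm = ChatBedrock(
    model_id="anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumed model id
    model_kwargs={"max_tokens": 4096},  # Anthropic's output-token cap parameter
)

msg = llm.invoke("Return the full updated file ...")

# usage_metadata is LangChain's normalized token-usage field (when reported)
print(msg.usage_metadata)
# response_metadata carries provider-specific fields (e.g. stop_reason)
print(msg.response_metadata)
```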
I could be mistaken, but I don't think there is any intelligence in how a response is returned when the token limit is hit. The model just returns whatever it finished generating before the limit; with a streaming response it streams until it hits the limit. It would make sense if this is what is happening.
@jmontleon I agree. I had assumed there was no intelligence and the model would stream and get cut off, yet when I saw this, the model intentionally omitted code. It wasn't cut off; it made a choice to strip code out and give me a condensed output.
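One way to tell the two cases apart is the provider's stop reason: a hard cutoff reports hitting the token limit, while a deliberately condensed answer finishes normally. A hedged sketch, assuming Anthropic-on-Bedrock key names on a LangChain message:

```python
def hit_token_limit(msg) -> bool:
    # "stop_reason"/"max_tokens" are Anthropic-on-Bedrock conventions;
    # other models report e.g. "finish_reason" == "length" instead.
    stop_reason = msg.response_metadata.get("stop_reason")
    return stop_reason in ("max_tokens", "length")
```

If this returns False for a response with missing code, the model stopped on its own and chose to condense the output.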
I've seen the above behavior, and also the response just stopping midstream and cutting off. I have been using
We found that modifying the config with the following increased the output length with Bedrock. I believe @dymurray finally had success with smaller files using this, although results for larger files were still cut off.
Unfortunately this is the maximum max_gen_len allowed for Llama models on Bedrock.
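As an illustration of the kind of change involved (a hedged sketch, not the exact config used): the output cap is set through model kwargs, and the parameter name is model-specific, `max_gen_len` for Llama on Bedrock (documented ceiling of 2048 at the time) versus `max_tokens` for Anthropic models.

```python
from langchain_aws import ChatBedrock

llama = ChatBedrock(
    model_id="meta.llama3-70b-instruct-v1:0",  # assumed model id
    model_kwargs={"max_gen_len": 2048},  # Bedrock's documented ceiling for Llama
)
```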
Related to #391
I have been working on resolving the issue and have identified a solution for the primary problems. Here is the detailed documentation regarding this issue.
Thank you @devjpt23 for the extensive deep dive into this problem and sharing what you learned!
@jwmatthews based on the discussion, can we close this?
I am seeing consistent and repeatable issues with several files in Coolstore when I run against claude 3.5 sonnet.
It looks like the output stops suddenly midway through generating an update.
Config:
Error snippet:
Attempting to convert:
prompt:
llm_result (all failures, stops prematurely)
Note: on a subsequent retry it failed once more and then succeeded, but the contents of what it generated are incomplete/truncated.
1 more failure: https://gist.github.com/jwmatthews/7d7aac70a6b69291e2ff0ed2b467debb
Partial Success but Incomplete: https://gist.github.com/jwmatthews/0b366ffa4ff8fe2ed89638552e9972e9
It truncates the response and adds a comment:
// Rest of the class remains unchanged
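Responses like this can at least be flagged mechanically. A hedged sketch (the helper and patterns are hypothetical, not Kai's actual check) that looks for a fenced code block and for tell-tale elision comments:

```python
import re

# `{3} matches a run of three backticks, i.e. a markdown code fence
CODE_BLOCK_RE = re.compile(r"`{3}[\w+-]*\n.*?`{3}", re.DOTALL)
ELISION_RE = re.compile(r"//\s*rest of the .+? remains unchanged", re.IGNORECASE)

def looks_incomplete(llm_output: str) -> bool:
    # No fenced code block at all -> the "No codeblocks detected" case
    if not CODE_BLOCK_RE.search(llm_output):
        return True
    # A code block is present, but the model elided part of the file
    return bool(ELISION_RE.search(llm_output))
```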