
Consistent: "No codeblocks detected in LLM response" for several files with #350

Open
jwmatthews opened this issue Sep 4, 2024 · 11 comments

Comments

@jwmatthews
Member

jwmatthews commented Sep 4, 2024

I am seeing consistent and repeatable issues with several files in Coolstore when I run against claude 3.5 sonnet.
It looks like the output stops suddenly midway through generating an update.

Config:

[models]
provider = "ChatBedrock"

[models.args]
model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0"

Error snippet:

WARNING - 2024-09-04 06:50:33,902 - kai.models.file_solution - [    file_solution.py:95   - parse_file_solution_content()] - No codeblocks detected in LLM response
WARNING - 2024-09-04 06:50:33,907 - kai.service.kai_application.kai_application - [  kai_application.py:202  - get_incident_solutions_for_file()] - Request to model failed for batch 1/1 for src/main/java/com/redhat/coolstore/model/ShoppingCart.java with exception, retrying in 10.0s
Error in LLM Response: The LLM did not provide an updated file for src/main/java/com/redhat/coolstore/model/ShoppingCart.java

Attempting to convert:

prompt:

llm_result (all failures, stops prematurely)

Note: on a subsequent retry it failed once more and then succeeded, but the contents it generated were incomplete/truncated.

1 more failure: https://gist.github.com/jwmatthews/7d7aac70a6b69291e2ff0ed2b467debb

Partial Success but Incomplete: https://gist.github.com/jwmatthews/0b366ffa4ff8fe2ed89638552e9972e9
It truncates the response and adds a comment: // Rest of the class remains unchanged

package com.redhat.coolstore.service;

import java.util.Hashtable;
import java.util.logging.Logger;

import jakarta.ejb.Stateful;
import jakarta.inject.Inject;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

import jakarta.enterprise.context.SessionScoped;
import java.io.Serializable;

import com.redhat.coolstore.model.Product;
import com.redhat.coolstore.model.ShoppingCart;
import com.redhat.coolstore.model.ShoppingCartItem;

@SessionScoped
public class ShoppingCartService implements Serializable {

    private static final long serialVersionUID = 1L;

    @Inject
    Logger log;

    @Inject
    ProductService productServices;

    @Inject
    PromoService ps;

    @Inject
    ShoppingCartOrderProcessor shoppingCartOrderProcessor;

    private ShoppingCart cart  = new ShoppingCart(); //Each user can have multiple shopping carts (tabbed browsing)

    public ShoppingCartService() {
    }

    // Rest of the class remains unchanged

    private static ShippingServiceRemote lookupShippingServiceRemote() {
        try {
            final Hashtable<String, String> jndiProperties = new Hashtable<>();
            jndiProperties.put(Context.INITIAL_CONTEXT_FACTORY, "org.wildfly.naming.client.WildFlyInitialContextFactory");

            final Context context = new InitialContext(jndiProperties);

            return (ShippingServiceRemote) context.lookup("ejb:/ROOT/ShippingService!" + ShippingServiceRemote.class.getName());
        } catch (NamingException e) {
            throw new RuntimeException(e);
        }
    }
}
@dymurray
Contributor

I'm seeing this consistently with Bedrock when updating a big file. In order for the source code diff to actually render appropriately in the IDE, I need the file in full. So I explicitly added to the prompt that I wanted the updated file in full, and it never has enough room in the response to give it to me.
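
For illustration, the kind of instruction meant here might look like the following (a hypothetical sketch; base_prompt is a placeholder, not the actual kai prompt template):

# Hypothetical sketch of appending an explicit "full file" instruction to the prompt;
# base_prompt stands in for whatever prompt kai builds and is not the real template.
full_file_instruction = (
    "Return the ENTIRE updated file in a single code block. "
    "Do not omit unchanged methods or replace them with comments such as "
    "'// Rest of the class remains unchanged'."
)
prompt = base_prompt + "\n\n" + full_file_instruction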

@jwmatthews
Member Author

@dymurray when you access via Bedrock what model did you see issues with? I have used claude 3.5 sonnet and seen issues. To date we've done more testing with llama3 and mixtral and not much with claude 3.5 sonnet.

I have two initial thoughts:

  • It's very likely our issue comes from not modifying the prompt sufficiently for Claude.
  • We can likely get more info on the context size by looking at the response metadata.

I have been working with @devjpt23 and he shared the snippet below.

sample code from @devjpt23

ai_msg = llm.invoke(messages)
ai_msg.response_metadata['token_usage']['completion_tokens']

Example:
response_metadata={'token_usage': {'completion_tokens': 738, 'prompt_tokens': 1122, 'total_tokens': 1860, 'completion_time': 1.192732671, 'prompt_time': 0.056392911, 'queue_time': 0.0009406290000000053, 'total_time': 1.249125582}, 'model_name': 'mixtral-8x7b-32768', 'system_fingerprint': 'fp_c5f20b5bb1', 'finish_reason': 'stop', 'logprobs': None}
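
A slightly fuller sketch along the same lines, assuming a langchain chat model such as ChatBedrock (metadata keys vary by provider, so the lookups below are defensive):

# Sketch: check whether the reply was cut off by inspecting the response metadata.
ai_msg = llm.invoke(messages)
meta = ai_msg.response_metadata
usage = meta.get("token_usage") or meta.get("usage") or {}
print("prompt tokens:    ", usage.get("prompt_tokens"))
print("completion tokens:", usage.get("completion_tokens"))
# A finish/stop reason other than "stop" or "end_turn" (e.g. "length" or
# "max_tokens") suggests the model hit its output limit mid-file.
print("finish reason:", meta.get("finish_reason") or meta.get("stop_reason"))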

@jmontleon
Member

jmontleon commented Sep 12, 2024

I could be mistaken, but I don't think there is any intelligence in how a response is returned when the token limit is hit. The model just returns whatever it finished generating before hitting the limit; with a streaming response it will simply stream until it hits it. It would make sense if this is what is happening.

@jwmatthews
Member Author

I could be mistaken, but I don't think there is any intelligence in how a response is returned when the token limit is hit. The model just returns whatever it finished generating before hitting the limit; with a streaming response it will simply stream until it hits it. It would make sense if this is what is happening.

@jmontleon I agree; I had assumed there was no intelligence and the model would stream and get cut off. Yet in the case I saw, the model intentionally omitted code, so it wasn't cut off: it made a choice to strip code out and give me a condensed output.

    public ShoppingCartService() {
    }

    // Rest of the class remains unchanged

    private static ShippingServiceRemote lookupShippingServiceRemote() {
        try {
            final Hashtable<String, String> jndiProperties = new Hashtable<>();
            jndiProperties.put(Context.INITIAL_CONTEXT_FACTORY, "org.wildfly.naming.client.WildFlyInitialContextFactory");

            final Context context = new InitialContext(jndiProperties);

            return (ShippingServiceRemote) context.lookup("ejb:/ROOT/ShippingService!" + ShippingServiceRemote.class.getName());
        } catch (NamingException e) {
            throw new RuntimeException(e);
        }
    }
}
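
One cheap guard for this case, purely as a sketch (not something kai does today), would be to reject a generated file that contains an obvious elision marker and retry:

# Hypothetical sanity check: treat elision markers in the generated file as a
# failure, since the IDE diff needs the file in full.
ELISION_MARKERS = (
    "rest of the class remains unchanged",
    "rest of the file remains unchanged",
    "remaining methods unchanged",
)

def looks_elided(updated_file: str) -> bool:
    lowered = updated_file.lower()
    return any(marker in lowered for marker in ELISION_MARKERS)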

@dymurray
Contributor

I've seen the above behavior, and also responses that just stop midstream and cut off.

I have been using

model_id = "meta.llama3-70b-instruct-v1:0"

@jmontleon
Member

We found that modifying the config with the following increased the output length with Bedrock. I believe @dymurray finally had success with smaller files using this, although results for larger files were still cut off.

[models.args]
model_id = "meta.llama3-70b-instruct-v1:0"
model_kwargs.max_gen_len = 2048

Unfortunately 2048 is the maximum max_gen_len for the llama models on Bedrock:
https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html
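
For the Claude models the analogous knob should be max_tokens rather than max_gen_len; assuming entries under [models.args] are forwarded to ChatBedrock in the same way, something like the following ought to raise the output ceiling (parameter name taken from the Bedrock/Anthropic docs, not verified against kai):

[models.args]
model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0"
model_kwargs.max_tokens = 4096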

@jwmatthews
Member Author

Related to #391

@devjpt23
Contributor

I have been working on resolving this issue and have identified an optimal solution for the two primary problems:

  • The "No codeblocks" error.
  • Unnecessary comments

Here is the detailed documentation regarding this issue.

@jwmatthews
Member Author

Thank you @devjpt23 for the extensive deep dive into this problem and sharing what you learned!

@shawn-hurley
Contributor

@jwmatthews based on the discussion can we close this?
