
Consistent: "No codeblocks detected in LLM response" for several files with #350

Open
jwmatthews opened this issue Sep 4, 2024 · 11 comments

Comments

@jwmatthews
Member

jwmatthews commented Sep 4, 2024

I am seeing consistent and repeatable issues with several files in Coolstore when I run against claude 3.5 sonnet.
It looks like the output stops suddenly midway through generating an update.

Config:

[models]
provider = "ChatBedrock"

[models.args]
model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0"

Error snippet:

WARNING - 2024-09-04 06:50:33,902 - kai.models.file_solution - [    file_solution.py:95   - parse_file_solution_content()] - No codeblocks detected in LLM response
WARNING - 2024-09-04 06:50:33,907 - kai.service.kai_application.kai_application - [  kai_application.py:202  - get_incident_solutions_for_file()] - Request to model failed for batch 1/1 for src/main/java/com/redhat/coolstore/model/ShoppingCart.java with exception, retrying in 10.0s
Error in LLM Response: The LLM did not provide an updated file for src/main/java/com/redhat/coolstore/model/ShoppingCart.java

Attempting to convert:

prompt:

llm_result (all failures, stops prematurely)

Note: on a subsequent retry it failed once more and then succeeded, but the contents it generated were incomplete/truncated.

1 more failure: https://gist.github.com/jwmatthews/7d7aac70a6b69291e2ff0ed2b467debb

Partial Success but Incomplete: https://gist.github.com/jwmatthews/0b366ffa4ff8fe2ed89638552e9972e9
It truncates the response and adds a comment: // Rest of the class remains unchanged

package com.redhat.coolstore.service;

import java.util.Hashtable;
import java.util.logging.Logger;

import jakarta.ejb.Stateful;
import jakarta.inject.Inject;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

import jakarta.enterprise.context.SessionScoped;
import java.io.Serializable;

import com.redhat.coolstore.model.Product;
import com.redhat.coolstore.model.ShoppingCart;
import com.redhat.coolstore.model.ShoppingCartItem;

@SessionScoped
public class ShoppingCartService implements Serializable {

    private static final long serialVersionUID = 1L;

    @Inject
    Logger log;

    @Inject
    ProductService productServices;

    @Inject
    PromoService ps;

    @Inject
    ShoppingCartOrderProcessor shoppingCartOrderProcessor;

    private ShoppingCart cart  = new ShoppingCart(); //Each user can have multiple shopping carts (tabbed browsing)

    public ShoppingCartService() {
    }

    // Rest of the class remains unchanged

    private static ShippingServiceRemote lookupShippingServiceRemote() {
        try {
            final Hashtable<String, String> jndiProperties = new Hashtable<>();
            jndiProperties.put(Context.INITIAL_CONTEXT_FACTORY, "org.wildfly.naming.client.WildFlyInitialContextFactory");

            final Context context = new InitialContext(jndiProperties);

            return (ShippingServiceRemote) context.lookup("ejb:/ROOT/ShippingService!" + ShippingServiceRemote.class.getName());
        } catch (NamingException e) {
            throw new RuntimeException(e);
        }
    }
}
@dymurray
Contributor

I'm seeing this consistently with Bedrock when updating a big file. In order for the source code diff to actually render appropriately in the IDE, I need the file in full. So I explicitly added to the prompt that I wanted the updated file in full, and it never has enough room in the response to give it to me.
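
For illustration, the kind of instruction meant here might look like the following (a hypothetical sketch; base_prompt is a placeholder, not the actual kai prompt template):

# Hypothetical sketch of appending an explicit "full file" instruction to the prompt;
# base_prompt stands in for whatever prompt kai builds and is not the real template.
full_file_instruction = (
    "Return the ENTIRE updated file in a single code block. "
    "Do not omit unchanged methods or replace them with comments such as "
    "'// Rest of the class remains unchanged'."
)
prompt = base_prompt + "\n\n" + full_file_instruction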

@jwmatthews
Member Author

@dymurray when you access via Bedrock what model did you see issues with? I have used claude 3.5 sonnet and seen issues. To date we've done more testing with llama3 and mixtral and not much with claude 3.5 sonnet.

I have two initial thoughts:

  • It's very likely our issue comes from not modifying the prompt sufficiently for Claude.
  • We can likely get more info on the context size by looking at the response metadata.

I have been working with @devjpt23 and he shared the snippet below.

sample code from @devjpt23

ai_msg = llm.invoke(messages)
ai_msg.response_metadata['token_usage']['completion_tokens']

Example:
response_metadata={'token_usage': {'completion_tokens': 738, 'prompt_tokens': 1122, 'total_tokens': 1860, 'completion_time': 1.192732671, 'prompt_time': 0.056392911, 'queue_time': 0.0009406290000000053, 'total_time': 1.249125582}, 'model_name': 'mixtral-8x7b-32768', 'system_fingerprint': 'fp_c5f20b5bb1', 'finish_reason': 'stop', 'logprobs': None}
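
A slightly fuller sketch along the same lines, assuming a langchain chat model such as ChatBedrock (metadata keys vary by provider, so the lookups below are defensive):

# Sketch: check whether the reply was cut off by inspecting the response metadata.
ai_msg = llm.invoke(messages)
meta = ai_msg.response_metadata
usage = meta.get("token_usage") or meta.get("usage") or {}
print("prompt tokens:    ", usage.get("prompt_tokens"))
print("completion tokens:", usage.get("completion_tokens"))
# A finish/stop reason other than "stop" or "end_turn" (e.g. "length" or
# "max_tokens") suggests the model hit its output limit mid-file.
print("finish reason:", meta.get("finish_reason") or meta.get("stop_reason"))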

@jmontleon
Member

jmontleon commented Sep 12, 2024

I could be mistaken, but I don't think there is any intelligence in how a response is returned when the token limit is hit. The model just returns whatever it finished generating before hitting the limit; with a streaming response it will simply stream until it hits it. It would make sense if this is what is happening.

@jwmatthews
Member Author

I could be mistaken, but I don't think there is any intelligence in how a response is returned when the token limit is hit. The model just returns whatever it finished generating before hitting the limit; with a streaming response it will simply stream until it hits it. It would make sense if this is what is happening.

@jmontleon I agree; I had assumed there was no intelligence and the model would stream and get cut off. Yet in the case I saw, the model intentionally omitted code, so it wasn't cut off: it made a choice to strip code out and give me a condensed output.

    public ShoppingCartService() {
    }

    // Rest of the class remains unchanged

    private static ShippingServiceRemote lookupShippingServiceRemote() {
        try {
            final Hashtable<String, String> jndiProperties = new Hashtable<>();
            jndiProperties.put(Context.INITIAL_CONTEXT_FACTORY, "org.wildfly.naming.client.WildFlyInitialContextFactory");

            final Context context = new InitialContext(jndiProperties);

            return (ShippingServiceRemote) context.lookup("ejb:/ROOT/ShippingService!" + ShippingServiceRemote.class.getName());
        } catch (NamingException e) {
            throw new RuntimeException(e);
        }
    }
}
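
One cheap guard for this case, purely as a sketch (not something kai does today), would be to reject a generated file that contains an obvious elision marker and retry:

# Hypothetical sanity check: treat elision markers in the generated file as a
# failure, since the IDE diff needs the file in full.
ELISION_MARKERS = (
    "rest of the class remains unchanged",
    "rest of the file remains unchanged",
    "remaining methods unchanged",
)

def looks_elided(updated_file: str) -> bool:
    lowered = updated_file.lower()
    return any(marker in lowered for marker in ELISION_MARKERS)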

@dymurray
Contributor

I've seen the above behavior, and also responses that just stop midstream and cut off.

I have been using

model_id = "meta.llama3-70b-instruct-v1:0"

@jmontleon
Member

We found that modifying the config with the following increased the output length with Bedrock. I believe @dymurray finally had success with smaller files using this, although results for larger files were still cut off.

[models.args]
model_id = "meta.llama3-70b-instruct-v1:0"
model_kwargs.max_gen_len = 2048

Unfortunately 2048 is the maximum max_gen_len for the llama models on Bedrock:
https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html
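
For the Claude models the analogous knob should be max_tokens rather than max_gen_len; assuming entries under [models.args] are forwarded to ChatBedrock in the same way, something like the following ought to raise the output ceiling (parameter name taken from the Bedrock/Anthropic docs, not verified against kai):

[models.args]
model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0"
model_kwargs.max_tokens = 4096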

@jwmatthews
Member Author

Related to #391

@devjpt23
Contributor

I have been working on resolving this issue and have identified an optimal solution for the two primary problems:

  • The "No codeblocks" error.
  • Unnecessary comments

Here is the detailed documentation regarding this issue.

@jwmatthews
Member Author

Thank you @devjpt23 for the extensive deep dive into this problem and sharing what you learned!

@shawn-hurley
Contributor

@jwmatthews based on the discussion can we close this?
