[Bug]: Dalle-Critic not working #2510
Comments
Looks like the message output from the critic is not complete. @WaelKarkoub, do you know a possible cause of this?
@ekzhu this is new to me; maybe the API provider is limiting the number of output tokens. @nazkhan-8451 I ran your code and it works perfectly fine for me. I'm not sure how you set up yours. Here is my version:

```python
import os
from typing import List

from PIL.Image import Image

import autogen
from autogen.agentchat.contrib import img_utils
from autogen.agentchat.contrib.capabilities import generate_images

CRITIC_SYSTEM_MESSAGE = """You need to improve the prompt of the figures you saw.
How to create an image that is better in terms of color, shape, text (clarity), and other things.
Reply with the following format:
CRITICS: the image needs to improve...
PROMPT: here is the updated prompt!
If you have no critique or a prompt, just say TERMINATE
"""

config_list_gpt4 = [
    {
        "model": "gpt-4-turbo-2024-04-09",
        "api_key": os.environ["OPENAI_API_KEY"],
    }
]
config_list_gpt4_vision = config_list_gpt4
config_list_dalle = [
    {
        "model": "dall-e-3",
        "api_key": os.environ["OPENAI_API_KEY"],
    }
]

gpt_config = {
    "cache_seed": None,  # change the cache_seed for different trials
    "temperature": 0.7,
    "config_list": config_list_gpt4,
    "timeout": 300,
}
gpt_vision_config = {
    "cache_seed": None,  # change the cache_seed for different trials
    "temperature": 0.7,
    "config_list": config_list_gpt4_vision,
    "timeout": 300,
}
dalle_config = {
    "cache_seed": None,  # change the cache_seed for different trials
    "temperature": 0.7,
    "config_list": config_list_dalle,
    "timeout": 300,
}


def _is_termination_message(msg) -> bool:
    # Detects if we should terminate the conversation
    if isinstance(msg.get("content"), str):
        return msg["content"].rstrip().endswith("TERMINATE")
    elif isinstance(msg.get("content"), list):
        for content in msg["content"]:
            if isinstance(content, dict) and "text" in content:
                return content["text"].rstrip().endswith("TERMINATE")
    return False


def critic_agent() -> autogen.ConversableAgent:
    return autogen.ConversableAgent(
        name="critic",
        llm_config=gpt_vision_config,
        system_message=CRITIC_SYSTEM_MESSAGE,
        max_consecutive_auto_reply=3,
        human_input_mode="NEVER",
        is_termination_msg=lambda msg: _is_termination_message(msg),
    )


def image_generator_agent() -> autogen.ConversableAgent:
    # Create the agent
    agent = autogen.ConversableAgent(
        name="dalle",
        llm_config=gpt_vision_config,
        max_consecutive_auto_reply=3,
        human_input_mode="NEVER",
        is_termination_msg=lambda msg: _is_termination_message(msg),
    )
    # Add image generation ability to the agent
    dalle_gen = generate_images.DalleImageGenerator(llm_config=dalle_config)
    image_gen_capability = generate_images.ImageGeneration(
        image_generator=dalle_gen, text_analyzer_llm_config=gpt_config
    )
    image_gen_capability.add_to_agent(agent)
    return agent


def extract_images(sender: autogen.ConversableAgent, recipient: autogen.ConversableAgent) -> List[Image]:
    images = []
    all_messages = sender.chat_messages[recipient]
    for message in reversed(all_messages):
        # The GPT-4V format, where the content is an array of data
        contents = message.get("content", [])
        for content in contents:
            if isinstance(content, str):
                continue
            if content.get("type", "") == "image_url":
                img_data = content["image_url"]["url"]
                images.append(img_utils.get_pil_image(img_data))
    if not images:
        raise ValueError("No image data found in messages.")
    return images


###################################################
dalle = image_generator_agent()
critic = critic_agent()

img_prompt = "robot"
result = dalle.initiate_chat(critic, message=img_prompt)
```
@WaelKarkoub
@nazkhan-8451 try updating to the latest autogen version; I'm not certain that would change anything. In your …
@WaelKarkoub here is my file. I have checked the models individually; the api-key and url are correct:

```json
[
  {
    "model": "wag-gpt4-128k",
    "api_key": "api-key",
    "api_type": "azure",
    "base_url": "url",
    "api_version": "2024-02-15-preview",
    "tags": ["wag-gpt4-128k"]
  },
  {
    "model": "gpt-35-turbo-16k",
    "api_key": "api-key",
    "api_type": "azure",
    "base_url": "url",
    "api_version": "2024-02-15-preview",
    "tags": ["gpt-35"]
  },
  {
    "model": "gpt4-vision",
    "api_key": "api-key",
    "api_type": "azure",
    "base_url": "url",
    "api_version": "2023-12-01-preview",
    "tags": ["gpt-vision"]
  },
  {
    "model": "dall-e-3",
    "api_key": "api-key",
    "api_type": "azure",
    "base_url": "url/",
    "api_version": "2023-12-01-preview",
    "tags": ["dalle"]
  }
]
```
@WaelKarkoub I changed the code to `cache=None` and upgraded to the latest pyautogen. There are two problems I am seeing:
@nazkhan-8451 your config looks correct. Your …
@nazkhan-8451 I couldn't reproduce this bug; does it still happen for you?
@WaelKarkoub It does. Not sure what I am doing wrong or how to get around it.
@nazkhan-8451 check if you set hard limits in Azure; I'm not sure what that would look like. And if possible, check whether this happens with OpenAI.
@WaelKarkoub The dall-e deployment works fine because I can generate an image with this.

I don't have OpenAI dall-e to test it.
@nazkhan-8451 my concern is not the image generation part but the chat completion side of things (i.e., using the GPT models). See if you can still generate large texts with GPT.
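A minimal way to run that check is to call the chat deployment directly and look at `finish_reason`, which the API sets to `"length"` when the reply was cut off by a token cap. A sketch using the `openai` Python client; the env-var names and deployment name below are placeholders for your Azure setup:

```python
import os


def output_truncated(finish_reason: str) -> bool:
    # The chat completions API reports finish_reason == "length" when the
    # reply was cut off by a token limit rather than finishing naturally.
    return finish_reason == "length"


if __name__ == "__main__":
    from openai import AzureOpenAI  # requires openai>=1.0

    client = AzureOpenAI(
        api_key=os.environ["AZURE_OPENAI_API_KEY"],          # placeholder env var
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # placeholder env var
        api_version="2024-02-15-preview",
    )
    response = client.chat.completions.create(
        model="wag-gpt4-128k",  # your Azure deployment name
        messages=[{"role": "user", "content": "Write a 500-word story about a robot."}],
    )
    choice = response.choices[0]
    print(len(choice.message.content or ""), choice.finish_reason)
    if output_truncated(choice.finish_reason):
        print("The deployment is cutting replies off at a token limit.")
```

If the printed length is small and `finish_reason` is `"length"`, the truncation is happening at the deployment level, not inside autogen.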
@nazkhan-8451 https://github.com/microsoft/autogen/blob/main/notebook/agentchat_image_generation_capability.ipynb Does this work for you? Just change the model name, API key, etc. accordingly.
@WaelKarkoub this is giving the error:

```
dalle (to critic): robot
critic (to dalle): CRITICS: the image needs to improve the depiction of the robot to make
dalle (to critic): I'm sorry for any confusion, but as an AI text-based model, I
critic (to dalle): TERMINATE
```
@nazkhan-8451 Disable the cache again by adjusting the configs; the output is the same because it's reading from your cache.
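For reference, caching shows up in two places in this setup; a sketch of both (the config values below are placeholders mirroring the script earlier in the thread):

```python
# Agent level: cache_seed=None in llm_config disables autogen's on-disk
# completion cache for that agent, so every trial hits the API fresh.
config_list_gpt4_vision = [{"model": "gpt4-vision", "api_key": "api-key"}]  # placeholder

gpt_vision_config = {
    "cache_seed": None,  # None disables the cache; an int selects a cache bucket
    "temperature": 0.7,
    "config_list": config_list_gpt4_vision,
    "timeout": 300,
}

# Chat level: caching can also be disabled per call, as was done earlier in
# this thread:
#   result = dalle.initiate_chat(critic, message=img_prompt, cache=None)
print(gpt_vision_config["cache_seed"])  # None
```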
@nazkhan-8451 just making sure: the prompt in the notebook is different from the console output you pasted in your comment. Can you run the notebook as is and see what the output looks like? Make sure you disable the cache seed as well.
@WaelKarkoub I ran the notebook as is.
@nazkhan-8451 yeah, I'm stumped; would you mind posting it on Discord? https://aka.ms/autogen-dc. If not, I can post the issue myself as well.
@WaelKarkoub I don't have Discord. If you could post it, we can continue to collaborate here. Thank you for all the help.
@nazkhan-8451 can you try using …
@WaelKarkoub Converted both of them to …
Needed to fix an error in …
Still got: …
@WaelKarkoub I figured out where the bug is. It's the code which creates … This works (https://github.com/microsoft/autogen/blob/main/notebook/agentchat_dalle_and_gpt4v.ipynb):
@nazkhan-8451 great catch! It's interesting how this bug affected the text output for other agents; I'll have to take a look at it. Do you want to submit a PR for a fix? I don't mind doing that as well.
Please go ahead and do that. I will close this issue. Thank you.
@WaelKarkoub @nazkhan-8451 I've faced the same text output cut-off issue when testing image generation capabilities. I'm also using an AzureOpenAI deployment, and finally found it may be a limitation of the AzureOpenAI GPT-4 Turbo with Vision deployment. From the documentation, it looks like we have to set a `max_tokens` value explicitly. After adding a `max_tokens` setting:

```python
def critic_agent() -> autogen.ConversableAgent:
    return autogen.ConversableAgent(
        name="critic",
        llm_config={"config_list": config_list_gpt4v, "temperature": 0.7, "max_tokens": 400},
        system_message=CRITIC_SYSTEM_MESSAGE,
        max_consecutive_auto_reply=3,
        human_input_mode="NEVER",
    )
```
@whiskyboy
@nazkhan-8451 No, I'm not using Azure Dalle. Instead, I'm testing with HuggingFace text-to-image models (see #2599). I will try Azure Dalle later.
Describe the bug
Followed the notebook https://github.com/microsoft/autogen/blob/main/notebook/agentchat_image_generation_capability.ipynb, but I'm getting the following response:
(screenshot of the response; the original image link has expired)
Code:
Steps to reproduce
No response
Model Used
No response
Expected Behavior
No response
Screenshots and logs
No response
Additional Information