(google-generativeai: 0.8.1) Sending a transparent PNG, but it looks like "gemini-pro" converts it to JPG. #567

Open
Pjumpod opened this issue Sep 23, 2024 · 13 comments
Labels: component:python sdk, status:triaged, type:bug


Pjumpod commented Sep 23, 2024

Description of the bug:

My code is

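> # Fragment of a larger function: imageext, img_path, save_path, rez_img, model,
> # system_prompt and safety_settings come from earlier code that is not shown.
> # pilimg is presumably an alias for PIL.Image.
> if imageext.upper() == ".PNG":
>     print("Make blank")
>     rez_img = rez_img.convert("RGBA")
>     print(rez_img.mode)
>     resize_img_path = os.path.join(save_path, "rez_" + os.path.basename(img_path))
>     rez_img.save(resize_img_path)
>     # Reopen the original file from disk.
>     rez_img = pilimg.open(img_path)
>     # The method is not called here, so this prints the bound method itself.
>     print(getattr(rez_img, "get_format_mimetype", None))
> model_use = genai.GenerativeModel(model_name=model)
> try:
>     response = model_use.generate_content([system_prompt, rez_img], safety_settings=safety_settings)
>     response_text = str(response.text)
> except Exception as e:
>     response_text = str(f"{e}")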


And here is the output:

> Make blank
> RGBA
> <bound method ImageFile.get_format_mimetype of <PIL.PngImagePlugin.PngImageFile image mode=RGBA size=3375x1894 at 0x1A0F83C6E80>>

According to this link, generate_content should upload the image as a PNG with its transparency preserved, as shown in #523.
But when I asked the model to "describe the image", the output contained the words "on black background", which means the RGBA PNG was converted to RGB.

Actual vs expected behavior:

Expected: the image is uploaded as a PNG in RGBA mode.
Actual: the image is still treated as RGB.


Any other information you'd like to share?

google-generativeai 0.8.1

BytesIO may not be able to preserve the alpha channel of an image.
I attached a screenshot of this from the code review, and here is a related answer from Stack Overflow.

manojssmk self-assigned this Sep 24, 2024
manojssmk added status:triaged, type:help, component:other, component:python sdk and removed component:other labels Sep 24, 2024

@manojssmk

Hi @Pjumpod

I've tested the code and images you mentioned, and it works correctly, producing a light blue background. You can check it out in this gist. I don't believe the issue lies with BytesIO. The format parameter used when saving the PIL image ensures that transparency is preserved.

Thanks
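
One way to check this kind of claim without relying on PIL is to look at the PNG header of the bytes that are about to be sent. The sketch below is only an illustration, not part of the SDK: `png_color_type` and `fake_header` are hypothetical helper names, and the header is hand-built so the example runs with the standard library alone. Per the PNG format, color type 6 in the IHDR chunk means RGBA and color type 2 means RGB.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Color types defined by the PNG format (IHDR chunk, byte 10 of its data).
COLOR_TYPES = {
    0: "grayscale",
    2: "RGB",
    3: "palette",  # may still carry transparency through a separate tRNS chunk
    4: "grayscale+alpha",
    6: "RGBA",
}


def png_color_type(data: bytes) -> str:
    """Return the color type named in the IHDR chunk of a PNG byte stream."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    if data[12:16] != b"IHDR":
        raise ValueError("IHDR chunk missing")
    # IHDR data starts at offset 16: width, height, bit depth, color type.
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return COLOR_TYPES.get(color_type, f"unknown ({color_type})")


def fake_header(color_type: int) -> bytes:
    """Hypothetical helper: signature plus IHDR only, enough for the check above."""
    ihdr = struct.pack(">IIBBBBB", 3375, 1894, 8, color_type, 0, 0, 0)
    return PNG_SIGNATURE + struct.pack(">I", len(ihdr)) + b"IHDR" + ihdr


print(png_color_type(fake_header(6)))  # RGBA
print(png_color_type(fake_header(2)))  # RGB
```

With Pillow, the bytes to inspect would typically come from saving the image into a `BytesIO` buffer with `format="PNG"` and reading `buffer.getvalue()`. A result of "RGBA" only shows that the client-side bytes still carry an alpha channel; it says nothing about what the backend does with it.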

Author

Pjumpod commented Sep 24, 2024

@manojssmk the light blue shown in this issue is the background color of my app, not part of the actual picture.

The picture should not have any background (it is transparent).

Author

Pjumpod commented Sep 24, 2024

@manojssmk could you try your gist again with my test pictures?
(Attached images: btc, A, F, R, T)

Author

Pjumpod commented Sep 25, 2024

@manojssmk you have to use a picture in RGBA mode with a blank background (the background should not have any color).

If you get a light blue background, that also shows the result is still wrong.

@manojssmk

Hi @Pjumpod

Yes, you're correct. The image with a blank background that was passed to the model is producing an incorrect output, showing the background as black. You can find the code in this gist.

Thanks

@manojssmk manojssmk assigned MarkDaoust and unassigned manojssmk Sep 25, 2024
manojssmk added type:bug and removed type:help labels Sep 25, 2024
Author

Pjumpod commented Sep 25, 2024

> Hi @Pjumpod
>
> Yes, you're correct. The image with a blank background that was passed to the model is producing an incorrect output, showing the background as black. You can find the code in this gist.
>
> Thanks

@manojssmk @MarkDaoust this is great, now we are in sync.
I think this could be fixed on the server side by handling the picture in RGBA mode according to its MIME type.
Or is there anything that can be fixed in the API on the client side?

Collaborator

MarkDaoust commented Sep 25, 2024

I haven't looked into this. But the behavior will be affected by: #570, that PR ensures that we don't process the images before sending them.

Try installing from main:

pip install git+https://github.com/google-gemini/generative-ai-python

But it's possible that the PR doesn't change anything: the API may handle the alpha channel by showing the model the picture over a black background. If the API isn't passing an actual alpha channel, there's not much I can do in the SDK.
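
If the backend does flatten the alpha channel over black, one client-side workaround is to flatten the image over a background of your own choosing before uploading it. The following is a minimal pure-Python sketch of standard "over" compositing on a list of RGBA pixels. The names `flatten_pixel` and `flatten` are hypothetical, and the white default background is an assumption; none of this is part of the SDK.

```python
from typing import List, Tuple

RGBA = Tuple[int, int, int, int]
RGB = Tuple[int, int, int]


def flatten_pixel(pixel: RGBA, background: RGB) -> RGB:
    """Composite one 8-bit RGBA pixel over an opaque background color."""
    r, g, b, a = pixel
    # Integer form of fg * alpha + bg * (1 - alpha), rounded to nearest.
    return tuple(
        (fg * a + bg * (255 - a) + 127) // 255
        for fg, bg in zip((r, g, b), background)
    )


def flatten(pixels: List[RGBA], background: RGB = (255, 255, 255)) -> List[RGB]:
    """Flatten a sequence of RGBA pixels over a single background color."""
    return [flatten_pixel(p, background) for p in pixels]


pixels = [
    (0, 0, 0, 0),      # fully transparent, stored color is black
    (255, 0, 0, 255),  # fully opaque red
    (0, 0, 255, 128),  # roughly half-transparent blue
]
print(flatten(pixels))  # [(255, 255, 255), (255, 0, 0), (127, 127, 255)]
```

With Pillow, the same effect is usually achieved with `Image.alpha_composite` on two RGBA images followed by `convert("RGB")`. Flattening gives up the transparency, but it makes the background the model sees an explicit choice instead of whatever the backend picks.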

Collaborator

MarkDaoust commented Sep 25, 2024

Testing a bit, I'm just not convinced that the model uses the alpha channel at all.

  • If I make an image totally transparent the model still describes it.
  • If I ask it why I can't see anything in the image pro says "I can see it, maybe there's something wrong with your display"
  • If I set different colors of transparent sections, the model reports the "correct" background color.

b/369593779

Author

Pjumpod commented Sep 25, 2024

> Testing a bit, I'm just not convinced that the model uses the alpha channel at all.
>
>   • If I make an image totally transparent the model still describes it.
>   • If I ask it why I can't see anything in the image pro says "I can see it, maybe there's something wrong with your display"
>   • If I set different colors of transparent sections, the model reports the "correct" background color.

Do you have any ideas, or can Google help?

@MarkDaoust
Collaborator

I think this is happening in the API backend. I think there's nothing we can do from out here.

Author

Pjumpod commented Sep 26, 2024

Do you have any idea how to report this bug to the backend team?

@MarkDaoust
Collaborator

@Pjumpod, I did this morning. The b/369593779 in my previous message was an internal bug reference.

Author

Pjumpod commented Oct 16, 2024

@MarkDaoust long time no see. Do you have any update from the API backend?

3 participants