How to use the API to call a multimodal model with a local image? #89

Open · HarryZhou-618 opened this issue Apr 12, 2024 · 12 comments
@HarryZhou-618

Hi, I'm using the Poe API to call a multimodal model, such as GPT-4V or Claude-3-Opus. I'm following the example shown in the screenshot below, but I can't find code showing how to load a local image into the request. How can I implement this? I noticed that the new documentation mentions `attachment.parsed_content`; should I use this? What is the format of `parsed_content`? Should I encode the image as base64 or read it as raw binary?
Looking forward to your reply.
[Screenshot: Snipaste_2024-04-12_18-18-12]
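For reference, the two loading styles the question asks about (raw binary read vs. base64 encoding) can be produced with the standard library alone. This is only a sketch of the two encodings; the thread never confirms which format, if any, `parsed_content` actually accepts:

```python
import base64

def load_image_binary(path: str) -> bytes:
    # Raw binary read: the image file's bytes, unmodified.
    with open(path, "rb") as f:
        return f.read()

def load_image_base64(path: str) -> str:
    # Base64 encoding: the same bytes as an ASCII-safe string,
    # which is what many JSON-based APIs expect for embedded images.
    return base64.b64encode(load_image_binary(path)).decode("ascii")
```

Decoding the base64 string with `base64.b64decode` recovers the original bytes exactly, so the two forms carry the same data and differ only in transport-friendliness.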

@Arbow commented Apr 13, 2024

I had a similar problem. I wrote code like this:

import asyncio

import fastapi_poe as fp
from fastapi_poe.types import Attachment, ProtocolMessage

api_key = 'KEY'
prompt = "Describe the attached image in detail."

# Attachment pointing at an image already hosted on the Poe CDN.
attachment = Attachment(
    url="https://pfst.cf2.poecdn.net/base/image/xxxxxxxxxxxxxxxxxxxxxxx?w=1024&h=1024",
    content_type="image/png",
    name="image.png",
)
message = ProtocolMessage(role="user", content=prompt, attachments=[attachment])

async def print_bot_response(messages: list[ProtocolMessage], bot_name: str, api_key: str) -> None:
    # Stream the bot's partial responses and print the concatenated text.
    chunks = []
    async for partial in fp.get_bot_response(messages=messages, bot_name=bot_name, api_key=api_key):
        chunks.append(partial.text)
    print(''.join(chunks))

asyncio.run(print_bot_response([message], 'Claude-3-Sonnet', api_key))

I expected the Claude model to read the attached image, but it obviously did not, and returned the following information:
"Unfortunately, you have not actually attached or uploaded any images to our conversation yet. If you do upload an image, I will be happy to describe it in detail for you. Please let me know once you have attached an image."

I wonder if it is possible to invoke a multimodal model via the API. Thanks.

@HarryZhou-618 (Author)

> I had a similar problem. I wrote code like this: […] I wonder if it is possible to invoke a multimodal model via the API.

Yes, I got the same response when using the Claude model.
While checking the latest documentation and API code, I found that Poe has added a new `parsed_content` field for attachments. I wonder if this could be the way to do it; maybe we can pass the image as `parsed_content`. I'm trying it out, and you can try it too!

@Arbow commented Apr 17, 2024

> I had a similar problem. I wrote code like this: […]

> Yes, I got the same response when using the Claude model. […] I'm trying it out, and you can try it too!

Did you solve this problem? I tried adding the `parsed_content` field, but it didn't help.

@17Reset commented May 20, 2024

+1

@Michalai0

+1

@qingyanbaby

+1

@qingyanbaby

@JohntheLi

@JohntheLi (Collaborator)

Sorry for the delayed response. The API is designed such that only attachments uploaded through the UI (the Poe client) are sent to the LLM. Files are processed and linked to the message when the user uploads them from the client, so attaching arbitrary files from the bot server does not work.

To recap, with the current API, we only support:

  1. attachments in user message (request) that are attached via the Poe client
  2. attachments in bot message (response) that are attached via post_message_attachment as seen here: https://creator.poe.com/docs/server-bots-functional-guides#sending-files-with-your-response

I can already see how this could be a limitation for bot creators, but I am still curious what use cases you all are working on that could benefit from attaching files to the user message via the API?

@ZihaoZhou commented Sep 12, 2024

> I can already see how this could be a limitation for bot creators, but I am still curious what use cases you all are working on that could benefit from attaching files to the user message via the API?

@JohntheLi A typical case is parsing PDFs. When a user uploads a long PDF, we need to preprocess it into image pages and text pages, running different tasks on the different parts that the intermediate layer extracts on its own. Directly sending the full document to a bot is pointless.

@JohntheLi (Collaborator)

I agree with your example; this would be useful to have. I will bring it up with the team.

Keep in mind that there are some complexities here: this would essentially be linking new attachments to the user message, and we need to see how that might break existing product expectations. So I don't think it's a small task, but it'll be on our radar. Thanks for reporting it!

@ZihaoZhou

Many thanks. I couldn't be more excited to work on some new multimodal LLM applications.

@alfred-liu96

I’d love to see this feature too!
