
Docs should clarify how Claude Desktop treats resources #45

Open
jackdeansmith opened this issue Nov 26, 2024 · 6 comments

Comments

@jackdeansmith
Contributor

The docs state:

Resources are designed to be application-controlled, meaning that the client application can decide how and when they should be used.
For example, one application may require users to explicitly select resources, while another could automatically select them based on heuristics or even at the discretion of the AI model itself.

I don't think it's immediately clear how Claude Desktop treats resources. From playing around with it, it seems to require explicit user selection, at least for now. I think it would be useful to make this explicit in the documentation so server authors know what to expect when debugging.

@dsp-ant
Member

dsp-ant commented Nov 28, 2024

That sounds reasonable. We should make sure the docs reflect that Claude Desktop is purely an example; there are other MCP clients. If you have a chance, would you give the change a shot and send a PR? I'd much appreciate that.

@evalstate

evalstate commented Nov 28, 2024

@dsp-ant. My testing so far shows:

  • Prompts and Resources are only available in the first message of a chat, and are added to the message as attachments.
  • Resource Templates aren't exposed/understood by the Claude Desktop application.

Is that right? I'm referring to https://modelcontextprotocol.io/clients as well to set my expectations :)

(edit: Tool Results returned as Resources don't seem to pass the URI back to the Model Context; I assume this is a client implementation choice?)

(edit: all messages can have attachments; I was confused by the UX moving the icon from the left to the right and the slide-out behaviour.)

@burningion

Just want to comment here and say that Tool Results should be able to pass URIs back to the Model Context for inclusion, maybe via another prompt for permission?

@evalstate

In the spec, Resource URIs are mandatory in Tool Responses.

https://github.com/modelcontextprotocol/specification/blob/bb5fdd282a4d0793822a569f573ebc36804d38f8/schema/schema.ts#L482-L509

The Client would receive that but it's not specified exactly how the Resource would be presented to the LLM.

The Server can suggest an Audience [ User | Assistant ] and a Priority for each Resource. At the moment I'm interpreting "User" as "Display to User" and "Assistant" as "Include in Context", with no audience specified meaning "Both".
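
For illustration, a tool result embedding a resource with those annotations would look roughly like this (just a sketch based on my reading of the schema linked above; the URI and values are invented):

```typescript
// Rough sketch only -- shape based on my reading of schema.ts at the commit
// linked above; the URI, text, and annotation values are invented.
const callToolResult = {
  content: [
    {
      type: "resource",
      resource: {
        uri: "file:///reports/summary.md", // URI is mandatory on embedded resources
        mimeType: "text/markdown",
        text: "# Summary\n...",
      },
      annotations: {
        audience: ["assistant"], // my reading: "include in context" rather than "display to user"
        priority: 0.8,
      },
    },
  ],
};
```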

I think a common case may be the Server wanting to return a set of Resource Templates to the Client, so that the User or Assistant can get them on demand if wanted in the next turn. That's obviously not prohibited in the spec.
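
For example, a Server could list something like this as a template (purely illustrative, the template itself is made up) and the Client could offer to expand it on the next turn:

```typescript
// Rough sketch of a ResourceTemplate entry as a Server might list it --
// the URI template, name, and description are invented for illustration.
const forecastTemplate = {
  uriTemplate: "weather://forecast/{city}/{date}",
  name: "Weather forecast",
  description: "Forecast for a given city and date, fetchable on demand",
  mimeType: "application/json",
};
```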

There's a similar case where the User may wish to upload a binary to send to a tool, with the result going into context (e.g. Image -> OCR Tool -> Markdown). That's purely a Client implementation question. I'm tempted to tabulate these scenarios/options. There's also the question of what Claude Desktop actually does vs. other Client implementations.

There are a couple of other discussions that seem relevant, so I'll paste them here:
https://github.com/orgs/modelcontextprotocol/discussions/54
modelcontextprotocol/specification#90

@burningion

Thanks, this is helpful context too.

Now that I've got Resources being returned from my API, I'd ideally like multi-modal instances of my Resource responses.

For example, right now I'm returning Video Files from a search, which are then queryable via a separate analysis pipeline I've built in my API.

Ideally I'd be able to pass in a set of embedded images in responses like this:

[Screenshot 2024-12-04 at 10 43 06 AM]

And show thumbnails inline for each of these detected scenes.

Maybe I'm thinking of a Resource in the wrong way? Or maybe I should include a hierarchy of Resources in my response, i.e. imply that a set of Resources belongs to a single video?

[Screenshot 2024-12-04 at 10 45 05 AM]

It seems the protocol specifies a single mimetype for each Resource:

[Screenshot 2024-12-04 at 10 45 05 AM]
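
Roughly, this is the kind of result I'm imagining returning (a sketch only; the URIs and values are invented, not my actual API): a text summary plus each thumbnail as its own Resource with its own mimeType, grouped under the parent video's URI.

```typescript
// Sketch of a multi-part tool result: a text summary plus one image resource
// per detected scene. URIs and values are invented; each resource carries its
// own mimeType, so mixing types happens at the content-array level, and the
// shared URI prefix implies the scenes belong to one video.
const searchResult = {
  content: [
    { type: "text", text: "Found 2 scenes in video 42 matching the query." },
    {
      type: "resource",
      resource: {
        uri: "myapi://videos/42/scenes/1/thumbnail",
        mimeType: "image/png",
        blob: "<base64-encoded PNG bytes>",
      },
    },
    {
      type: "resource",
      resource: {
        uri: "myapi://videos/42/scenes/2/thumbnail",
        mimeType: "image/png",
        blob: "<base64-encoded PNG bytes>",
      },
    },
  ],
};
```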

@evalstate

Claude Desktop as a client has limited support for resource types beyond images. I'm doing quite a lot of testing here at the moment, so I'll report back with what I learn... but I'm planning to integrate with a different Client soon.
