Modeling code and nearby utilities should work with "raw" bytes not URLs #244
Merged
Modeling code or code close to it (chat_format.py specifically) should not be thinking of downloading URLs, etc. Especially not doing it randomly on-demand.
ashwinb requested review from yanxi0830, hardikjshah, dltn and raghotham as code owners December 16, 2024 00:43
facebook-github-bot added the CLA Signed label Dec 16, 2024
ashwinb commented Dec 16, 2024
    content: InterleavedTextMedia
    stop_reason: StopReason
    tool_calls: List[ToolCall] = Field(default_factory=list)

class RawMessage(BaseModel):
These `RawMessage` and `RawContent` types are the most important bits in this PR. Everything is downstream of these changes. See `RawContentItem` as well above.
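For readers without the diff open, here is a minimal sketch of how such "raw" types could fit together. It is not the code from this PR: it uses plain dataclasses instead of pydantic's `BaseModel` so it runs without dependencies, and the item classes (`RawTextItem`, `RawMediaItem`), their field names, and the `iter_items` helper are assumptions made for illustration. The one point it is meant to show is that media reaches the modeling code as bytes, never as a URL.

```python
from dataclasses import dataclass
from typing import List, Literal, Union


@dataclass
class RawTextItem:
    # Hypothetical text item; field names are illustrative.
    text: str
    type: Literal["text"] = "text"


@dataclass
class RawMediaItem:
    # Hypothetical media item: holds already-downloaded bytes, not a URL.
    data: bytes
    type: Literal["image"] = "image"


RawContentItem = Union[RawTextItem, RawMediaItem]
# Content may be a bare string, a single item, or a list of items.
RawContent = Union[str, RawContentItem, List[RawContentItem]]


@dataclass
class RawMessage:
    role: str
    content: RawContent


def iter_items(content: RawContent) -> List[RawContentItem]:
    """Normalize any RawContent shape into a flat list of items."""
    if isinstance(content, str):
        return [RawTextItem(text=content)]
    if isinstance(content, list):
        return list(content)
    return [content]
```

With this shape, code such as a chat formatter can walk `iter_items(message.content)` and branch on `item.type` without ever needing network access.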
ashwinb force-pushed the no_urls_in_models branch from 12d6b8e to a4c98d4 December 16, 2024 02:44
yanxi0830 approved these changes Dec 17, 2024
ashwinb added a commit to meta-llama/llama-stack that referenced this pull request Dec 17, 2024
## What does this PR do?

This is a long-pending change and particularly important to get done now.

Specifically:

- we cannot "localize" (aka download) any URLs from media attachments anywhere near our modeling code. it must be done within llama-stack.
- `PIL.Image` is infesting all our APIs via `ImageMedia -> InterleavedTextMedia` and that cannot be right at all. Anything in the API surface must be "naturally serializable". We need a standard `{ type: "image", image_url: "<...>" }` which is more extensible
- `UserMessage`, `SystemMessage`, etc. are moved completely to llama-stack from the llama-models repository.

See meta-llama/llama-models#244 for the corresponding PR in llama-models.

## Test Plan

```bash
cd llama_stack/providers/tests

pytest -s -v -k "fireworks or ollama or together" inference/test_vision_inference.py
pytest -s -v -k "(fireworks or ollama or together) and llama_3b" inference/test_text_inference.py
pytest -s -v -k chroma memory/test_memory.py \
  --env EMBEDDING_DIMENSION=384 --env CHROMA_DB_PATH=/tmp/foobar

pytest -s -v -k fireworks agents/test_agents.py \
  --safety-shield=meta-llama/Llama-Guard-3-8B \
  --inference-model=meta-llama/Llama-3.1-8B-Instruct
```

Updated the client sdk (see PR ...), installed the SDK in the same environment and then ran the SDK tests:

```bash
cd tests/client-sdk

LLAMA_STACK_CONFIG=together pytest -s -v agents/test_agents.py
LLAMA_STACK_CONFIG=ollama pytest -s -v memory/test_memory.py

# this one needed a bit of hacking in the run.yaml to ensure I could register the vision model correctly
INFERENCE_MODEL=llama3.2-vision:latest LLAMA_STACK_CONFIG=ollama pytest -s -v inference/test_inference.py
```
What this PR does
This is a long-pending change and particularly important to get done now.
Specifically:
- `{ type: "image", image_url: "<...>" }` which is more extensible
- `UserMessage`, etc. are moved completely to `llama-stack`.

This PR will have a substantial accompanying PR in llama-stack as well.
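
To make the intended split concrete, here is a rough sketch of how a serializable `{ type: "image", image_url: "<...>" }` item could be resolved to bytes in the calling layer before anything reaches the modeling code. The function name `localize_image` and the injected `fetch` callable are hypothetical and do not come from either repository; the sketch only decodes `data:` URLs itself and delegates every other scheme to whatever downloader the caller supplies.

```python
import base64
import urllib.parse
from typing import Callable, Dict


def localize_image(item: Dict[str, str], fetch: Callable[[str], bytes]) -> bytes:
    """Turn a serializable image item into raw bytes (hypothetical helper).

    `fetch` is supplied by the caller, so this function never performs
    network I/O on its own.
    """
    if item.get("type") != "image":
        raise ValueError("expected an item with type 'image'")
    url = item["image_url"]
    if url.startswith("data:"):
        # data:[<mediatype>][;base64],<payload>
        header, _, payload = url.partition(",")
        if header.endswith(";base64"):
            return base64.b64decode(payload)
        return urllib.parse.unquote_to_bytes(payload)
    # Any other scheme is downloaded by the injected fetcher.
    return fetch(url)
```

The resulting bytes are what a raw-message type would carry, so the modeling code stays free of download logic and can be tested offline.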
Test Plan
Ran the sole pytest test for model running:
Ran the example scripts: