
Modeling code and nearby utilities should work with "raw" bytes not URLs #244

Merged (4 commits into main, Dec 17, 2024)

Conversation

ashwinb (Contributor) commented Dec 16, 2024

What this PR does

This is a long-pending change and particularly important to get done now.

Specifically:

  • We cannot "localize" (i.e., download) any URLs from media attachments anywhere near our modeling code; that must be done higher up in the stack or in other utilities.
  • `PIL.Image` is infesting all our APIs, and that cannot be right. We need a standard `{ type: "image", image_url: "<...>" }` content item, which is more extensible.
  • This essentially argues for separating model-facing Message types (which should not carry much structure) from the API-level Message types. As a result of this PR, `UserMessage`, etc. are moved entirely to llama-stack.
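As an illustrative sketch of the argument above (these names are assumptions, not the PR's exact types), a serializable image content item could look like this:

```python
# Hypothetical sketch of serializable content items replacing raw PIL.Image
# objects in the API surface; the class names here are illustrative.
from typing import List, Literal, Union

from pydantic import BaseModel


class TextContentItem(BaseModel):
    type: Literal["text"] = "text"
    text: str


class ImageContentItem(BaseModel):
    type: Literal["image"] = "image"
    image_url: str  # a URL or data: URI; downloading happens higher in the stack


# Content can be a bare string, one item, or an interleaved list of items.
InterleavedContent = Union[
    str,
    TextContentItem,
    ImageContentItem,
    List[Union[TextContentItem, ImageContentItem]],
]
```

Because every field is a plain string, these items serialize naturally to JSON, unlike a `PIL.Image` held in memory.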

This PR will have a substantial accompanying PR in llama-stack as well.

Test Plan

Ran the sole pytest test for model running:

```bash
TEXT_MODEL_CHECKPOINT_DIR=~/.llama/checkpoints/Llama3.2-3B-Instruct \
  PYTHONPATH=. \
  pytest models/llama3/tests/api/test_generation.py
```

Ran the example scripts:

```bash
PYTHONPATH=. \
  torchrun \
  llama_models/scripts/multimodal_example_chat_completion.py ~/.llama/checkpoints/Llama3.2-11B-Vision-Instruct
```

Modeling code, or code close to it (chat_format.py specifically), should not be downloading
URLs, and especially not doing so on demand at arbitrary points.
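One way such "upwards in the stack" localization could look (an illustrative helper, not code from this PR): the caller resolves a URL or data: URI to raw bytes before the modeling code ever sees it.

```python
# Illustrative sketch only: localization performed by a caller above the
# modeling code, which then receives raw bytes and never touches the network.
import base64
from urllib.request import urlopen


def localize_url(url: str) -> bytes:
    """Resolve a URL or data: URI to raw bytes."""
    if url.startswith("data:"):
        # data:[<mediatype>][;base64],<payload>
        header, _, payload = url.partition(",")
        if header.endswith(";base64"):
            return base64.b64decode(payload)
        return payload.encode("utf-8")
    with urlopen(url) as resp:  # http(s) and file URLs
        return resp.read()
```

The modeling code then only ever handles `bytes`, keeping any network I/O out of its hot path.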
Diff excerpt under review:

```python
    content: InterleavedTextMedia
    stop_reason: StopReason
    tool_calls: List[ToolCall] = Field(default_factory=list)


class RawMessage(BaseModel):
```
ashwinb (Contributor, Author) commented:

These RawMessage and RawContent types are the most important bits of this PR; everything else is downstream of these changes. See RawContentItem above as well.
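A self-contained sketch of how such model-facing types might fit together, assuming (purely for illustration, not from the PR's actual definitions) that media content arrives as already-localized raw bytes:

```python
# Hypothetical sketch: a model-facing message whose media content is raw bytes,
# localized by a caller higher in the stack. All names here are illustrative.
from typing import List, Literal, Union

from pydantic import BaseModel


class RawTextItem(BaseModel):
    type: Literal["text"] = "text"
    text: str


class RawMediaItem(BaseModel):
    type: Literal["image"] = "image"
    data: bytes  # raw image bytes; no URLs, no PIL.Image, no downloading here


RawContentItem = Union[RawTextItem, RawMediaItem]
RawContent = Union[str, RawContentItem, List[RawContentItem]]


class RawMessage(BaseModel):
    role: str
    content: RawContent


msg = RawMessage(
    role="user",
    content=[RawTextItem(text="Describe this image:"), RawMediaItem(data=b"\x89PNG")],
)
```

The key property is that nothing in these types can trigger I/O: the modeling code consumes bytes it is handed, nothing more.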

ashwinb added a commit to meta-llama/llama-stack that referenced this pull request Dec 17, 2024
## What does this PR do?

This is a long-pending change and particularly important to get done
now.

Specifically:
- We cannot "localize" (i.e., download) any URLs from media attachments
anywhere near our modeling code; that must be done within llama-stack.
- `PIL.Image` is infesting all our APIs via `ImageMedia ->
InterleavedTextMedia`, and that cannot be right. Anything in the
API surface must be "naturally serializable". We need a standard `{
type: "image", image_url: "<...>" }`, which is more extensible.
- `UserMessage`, `SystemMessage`, etc. are moved completely from the
llama-models repository to llama-stack.

See meta-llama/llama-models#244 for the
corresponding PR in llama-models.

## Test Plan

```bash
cd llama_stack/providers/tests

pytest -s -v -k "fireworks or ollama or together" inference/test_vision_inference.py
pytest -s -v -k "(fireworks or ollama or together) and llama_3b" inference/test_text_inference.py
pytest -s -v -k chroma memory/test_memory.py \
  --env EMBEDDING_DIMENSION=384 --env CHROMA_DB_PATH=/tmp/foobar

pytest -s -v -k fireworks agents/test_agents.py  \
   --safety-shield=meta-llama/Llama-Guard-3-8B \
   --inference-model=meta-llama/Llama-3.1-8B-Instruct
```

Updated the client SDK (see PR ...), installed the SDK in the same
environment, and then ran the SDK tests:

```bash
cd tests/client-sdk
LLAMA_STACK_CONFIG=together pytest -s -v agents/test_agents.py
LLAMA_STACK_CONFIG=ollama pytest -s -v memory/test_memory.py

# this one needed a bit of hacking in the run.yaml to ensure I could register the vision model correctly
INFERENCE_MODEL=llama3.2-vision:latest LLAMA_STACK_CONFIG=ollama pytest -s -v inference/test_inference.py
```
@ashwinb ashwinb merged commit bf5b0c4 into main Dec 17, 2024
1 check passed
@ashwinb ashwinb deleted the no_urls_in_models branch December 17, 2024 19:18