v0.5.1 Release Tracker #5806

Closed
simon-mo opened this issue Jun 25, 2024 · 15 comments

simon-mo added the misc label Jun 25, 2024
simon-mo self-assigned this Jun 25, 2024
simon-mo added the release (Related to new version release) label and removed the misc label Jun 25, 2024
@NiuBlibing
Contributor

Could you release nightly versions for easier testing?

@DarkLight1337
Member

DarkLight1337 commented Jun 26, 2024

For multi-modal support, we plan to only include new VLMs (#4986 is user-facing while #5591 is intended to be a component of other VLMs) and #5214 (which involves dev-facing changes) in this release. The other upcoming PRs such as #5276 introduce a sequence of breaking changes to users, so we will try to bundle them within a single major release (e.g. v0.6) to avoid continuously interrupting users.

@WangErXiao

Can Deepseek-V2 be merged in v0.5.1?

@simon-mo
Collaborator Author

Nightly is in the Q3 roadmap for CI/CD.

@sasha0552
Contributor

Hi. I haven't seen this release tracker as it's not pinned, but could #4409 be included in the release? At the moment, at least ~10 users want Pascal support in vLLM.

#4409 (comment)
#5224 (comment)
https://github.com/sasha0552/vllm-ci/stargazers

@DarkLight1337
Member

DarkLight1337 commented Jul 2, 2024

For multi-modal support, we plan to only include new VLMs (#4986 is user-facing while #5591 is intended to be a component of other VLMs) and #5214 (which involves dev-facing changes) in this release. The other upcoming PRs such as #5276 introduce a sequence of breaking changes to users, so we will try to bundle them within a single major release (e.g. v0.6) to avoid continuously interrupting users.

Since the release has been delayed, we have included those PRs in the release anyway to avoid soft-blocking other PRs from getting merged. The expected user-facing breaking changes are:

  • Simplified engine args: Image-specific arguments have been removed from all entrypoints as we found them unnecessary.
  • Simplified interface for multimodal inputs: This affects usage of the LLM API class. On the other hand, the OpenAI-compatible server handles the conversion internally so end users remain unaffected.
    • No more repeating <image> tokens in the prompt - please follow the format documented on the HuggingFace repo ([Core] Dynamic image size support for VLMs #5276)
       # e.g. LLaVA-1.5 (llava-hf/llava-1.5-7b-hf)
       llm.generate({
      -    "prompt": "<image>" * 576 + "\nUSER: What is the content of this image?\nASSISTANT:",
      +    "prompt": "USER: <image>\nWhat is the content of this image?\nASSISTANT:",
           "multi_modal_data": multi_modal_data,
       })
    • Instead of passing ImagePixelData(pil_image), you should pass {"image": pil_image} to multimodal prompts ([VLM] Remove image_input_type from VLM config #5852)
       llm.generate({
           "prompt": prompt,
      -    "multi_modal_data": ImagePixelData(pil_image),
      +    "multi_modal_data": {"image": pil_image},
       })
    • ImagePixelData(tensor) and ImageFeatureData are no longer supported ([VLM] Remove image_input_type from VLM config #5852)
      • If you are currently using ImageFeatureData to represent multi-image inputs, please refrain from upgrading since we are going to replace it with embeddings soon (see below).
    • We will support multi-modal embeddings in an upcoming PR to be included in the next release. Expect the format to be along the lines of:
       llm.generate({
           "prompt": prompt,
      -    "multi_modal_data": ImageFeatureData(feature_tensor),
      +    "multi_modal_data": {"image": {"embeds": model.multi_modal_projector(feature_tensor)}},  # Or just pass the embeddings directly
       })
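As a rough illustration of the two changes above that are already in this release (a single `<image>` placeholder, and `{"image": pil_image}` instead of `ImagePixelData(pil_image)`), here is a minimal migration sketch. The helper names `has_repeated_image_tokens` and `build_image_request` are hypothetical and are not part of vLLM; the only thing taken from the comment above is the shape of the dictionary passed to `llm.generate`.

```python
import re
from typing import Any, Dict

# Placeholder token as it appears in the LLaVA-1.5 example above.
IMAGE_TOKEN = "<image>"


def has_repeated_image_tokens(prompt: str) -> bool:
    """Return True if the prompt contains two or more consecutive image tokens.

    Hypothetical helper: flags prompts written in the old style, where
    the placeholder was repeated once per image feature.
    """
    pattern = f"(?:{re.escape(IMAGE_TOKEN)}){{2,}}"
    return re.search(pattern, prompt) is not None


def build_image_request(prompt: str, image: Any) -> Dict[str, Any]:
    """Build a prompt dict in the new format described in this thread.

    Hypothetical helper: `image` is expected to be a PIL image, but it is
    typed as Any so that this sketch does not depend on Pillow.
    """
    if has_repeated_image_tokens(prompt):
        raise ValueError(
            "Prompt repeats the image token; use the format documented "
            "on the model's HuggingFace repo instead."
        )
    # New format: a plain dict keyed by modality replaces ImagePixelData.
    return {"prompt": prompt, "multi_modal_data": {"image": image}}
```

The returned dictionary is meant to be passed to `llm.generate(...)` in the same way as the snippets above. Note that the check only detects old-style prompts; it does not rewrite them, since the correct placement of `<image>` depends on the model's documented prompt format.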

@DarkLight1337
Member

@simon-mo btw this thread is not pinned

@WangErXiao

#5358 can this be merged in v0.5.1?

@DarkLight1337
Member

DarkLight1337 commented Jul 2, 2024

#5358 can this be merged in v0.5.1?

This is very unlikely, since the author of the PR has not resolved the merge conflicts yet. On top of that, #5852 and #5276 (scheduled to merge before v0.5.1) will introduce further merge conflicts.

@WoosukKwon
Collaborator

Please add #6051 for Gemma 2

@njhill
Member

njhill commented Jul 2, 2024

Small fix #6079 is ready and would be good to include if possible.

@ywang96
Member

ywang96 commented Jul 3, 2024

Please also add #6089. I plan to merge it by noon, as it is the final piece needed for this cycle's milestone of refactoring multi-modality support, and it is a user-facing change that we need to include in this release.

Update: #6089 is merged!

@DarkLight1337
Member

It would be nice if we could get #5979 into the release; otherwise we won't see its effects until the next release after this one...

@huangchen007

Will this release definitely be cut today?

@simon-mo
Collaborator Author

simon-mo commented Jul 5, 2024

Cutting now, ETA today.

A GitHub incident (https://www.githubstatus.com/incidents/5yx1d67vq9hg) is preventing CI from being triggered :(. Will wait and retry.

simon-mo closed this as completed Jul 6, 2024