@uyzhang uyzhang commented Oct 16, 2025

Purpose

Add support for the Bee-8B-RL and Bee-8B-SFT models.

Example Serving Command

vllm serve \
    Open-Bee/Bee-8B-RL \
    --served-model-name bee-8b-rl \
    --tensor-parallel-size 8 \
    --gpu-memory-utilization 0.8 \
    --host 0.0.0.0 \
    --port 8000 \
    --trust-remote-code
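
Once the server above is running, it can be queried through vLLM's OpenAI-compatible chat completions route. The following is a minimal sketch rather than part of this PR. It assumes the server is reachable at `localhost:8000`. The helper names `build_chat_request` and `send_chat_request` and the `max_tokens` value are illustrative. Only the standard library is used, so no extra client package is needed.

```python
import json
import urllib.request

# Assumption: the server started by the command above listens on localhost:8000.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(image_url: str, text: str, model: str = "bee-8b-rl") -> dict:
    """Build an OpenAI-style chat payload with one image part and one text part."""
    return {
        "model": model,  # must match the value passed to --served-model-name
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": text},
                ],
            }
        ],
        "max_tokens": 512,  # illustrative value, not taken from the PR
    }


def send_chat_request(payload: dict, base_url: str = BASE_URL) -> str:
    """POST the payload to the chat completions route and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a running server, `send_chat_request(build_chat_request(image_url, "Describe this image."))` should return the generated description as a string.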

Example Offline Inference

from io import BytesIO

import requests
from PIL import Image
from transformers import AutoProcessor
from vllm import LLM, SamplingParams


def load_image(image_path):
    """Load image from URL or local path"""
    if image_path.startswith(('http://', 'https://')):
        response = requests.get(image_path, timeout=10)
        response.raise_for_status()
        image = Image.open(BytesIO(response.content))
    else:
        image = Image.open(image_path)

    # Convert RGBA to RGB if needed
    if image.mode == "RGBA":
        background = Image.new('RGB', image.size, (255, 255, 255))
        background.paste(image, mask=image.split()[-1])
        image = background

    return image.convert("RGB")


def main():

    model_path = "Open-Bee/Bee-8B-RL"

    llm = LLM(
        model=model_path,
        limit_mm_per_prompt={"image": 5},
        trust_remote_code=True,
        tensor_parallel_size=1,
        gpu_memory_utilization=0.8,
    )

    sampling_params = SamplingParams(
        temperature=0.8,
        max_tokens=16384,
    )

    image_url = "http://images.cocodataset.org/val2017/000000039769.jpg"
    image = load_image(image_url)
    text = "Describe this image."

    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": text},
            ],
        },
    ]

    processor = AutoProcessor.from_pretrained(model_path,
                                              trust_remote_code=True)
    prompt = processor.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )

    mm_data = {"image": image}
    llm_inputs = {
        "prompt": prompt,
        "multi_modal_data": mm_data,
    }

    outputs = llm.generate([llm_inputs], sampling_params=sampling_params)
    generated_text = outputs[0].outputs[0].text

    print(generated_text)


if __name__ == '__main__':
    main()
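
The example above sets `limit_mm_per_prompt={"image": 5}` but passes only one image. A multi-image prompt could be assembled as sketched below. This is an assumption-laden illustration, not code from this PR: the helper names `build_multi_image_messages` and `build_llm_inputs` are hypothetical, and whether the Bee-8B chat template handles several images well has not been verified here. It relies on vLLM accepting a list of images under the `"image"` key of `multi_modal_data`.

```python
from typing import Any

# Mirrors limit_mm_per_prompt={"image": 5} from the example above.
MAX_IMAGES = 5


def build_multi_image_messages(images: list[Any], text: str) -> list[dict]:
    """Build a single user turn with one image entry per image, then the text."""
    if not images:
        raise ValueError("at least one image is required")
    if len(images) > MAX_IMAGES:
        raise ValueError(f"got {len(images)} images, limit is {MAX_IMAGES}")
    content = [{"type": "image", "image": image} for image in images]
    content.append({"type": "text", "text": text})
    return [{"role": "user", "content": content}]


def build_llm_inputs(prompt: str, images: list[Any]) -> dict:
    """Pair a templated prompt with its images in the dict passed to llm.generate."""
    return {"prompt": prompt, "multi_modal_data": {"image": list(images)}}
```

The messages would go through `processor.apply_chat_template(...)` exactly as in `main()`, and the resulting prompt string would be passed to `build_llm_inputs` together with the same list of PIL images.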

uyzhang and others added 6 commits October 16, 2025 18:17
Signed-off-by: uyzhang <yi.zhang.4096@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Yi Zhang <zhangyi970819@gmail.com>

mergify bot commented Oct 16, 2025

Documentation preview: https://vllm--27012.org.readthedocs.build/en/27012/

@mergify mergify bot added the labels documentation (Improvements or additions to documentation), multi-modality (Related to multi-modality (#4194)), and new-model (Requests to new models) Oct 16, 2025

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for the Bee-8B model. The implementation correctly integrates the model into vLLM by inheriting from existing LLaVA-like model classes and providing a model-specific multimodal projector and processing logic. The changes also include documentation updates, examples, and tests. I've found one critical compatibility issue in the implementation that needs to be addressed.

uyzhang and others added 2 commits October 16, 2025 18:24
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Yi Zhang <zhangyi970819@gmail.com>

@ywang96 ywang96 left a comment

Thanks for your contribution! LGTM

@ywang96 ywang96 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Oct 17, 2025
@ywang96 ywang96 enabled auto-merge (squash) October 17, 2025 07:20
@DarkLight1337 DarkLight1337 added this to the v0.11.1 milestone Oct 17, 2025
@ywang96 ywang96 disabled auto-merge October 19, 2025 21:55
@ywang96 ywang96 enabled auto-merge (squash) October 19, 2025 21:55
@ywang96 ywang96 merged commit f32bf75 into vllm-project:main Oct 20, 2025
55 checks passed
Pradyun92 pushed a commit to Pradyun92/vllm that referenced this pull request Oct 20, 2025
Merged 8 commits from origin/main including:
- PR vllm-project#26586: Eagle rejection sampler fix (previously cherry-picked)
- LoRA CUDA graph specialization (vllm-project#25914)
- Bee-8B VLM model support (vllm-project#27012)
- Utilities reorganization (network_utils, async_utils, etc.)
- Multiple bug fixes and improvements

In-Tree Modifications:
- Removed Eagle rejection sampler cherry-pick (now in upstream)
- Kept Qwen3 tool parser fix (still needed, line 523)
- Only 1 active in-tree modification remaining

Plugin Compatibility:
- All 10 plugin patches load successfully
- No target class changes required
- Clean merge with no conflicts

Documentation Updates:
- Updated IN_TREE_MODIFICATIONS.md (moved Eagle fix to Removed/Obsolete)
- Updated CLAUDE.md merge history
- Verified clean diff with origin/main (3 files, all documented)

Signed-off-by: Pradyun Ramadorai <pradyunr@amazon.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
Signed-off-by: uyzhang <yi.zhang.4096@gmail.com>
Signed-off-by: Yi Zhang <zhangyi970819@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
adabeyta pushed a commit to adabeyta/vllm that referenced this pull request Oct 20, 2025
albertoperdomo2 pushed a commit to albertoperdomo2/vllm that referenced this pull request Oct 23, 2025
Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
ilmarkov pushed a commit to neuralmagic/vllm that referenced this pull request Nov 7, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025