
Conversation

@Bryce1010 (Contributor) commented Jan 2, 2025

Adds support for passing prompt_embeds to LLM.generate, as in

llm.generate({"prompt_embeds": input_embeds}, sampling_params)

or

llm.generate(
    [{"prompt_embeds": input_embeds} for input_embeds in inputs_embeds], sampling_params
)

This enables use cases where only the embedding layer is fine-tuned, so a single model backend can serve multiple custom-tuned embedding layers (see the sketch below).

FIX #416
FIX #8323
FIX #14621
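
For illustration, here is a hedged sketch of the multi-tenant use case described above: one shared vLLM backend serving requests whose embeddings come from differently fine-tuned embedding layers. The model name, checkpoint paths, and prompts are assumptions for illustration, not part of this PR.

import torch
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# Illustrative model choice; any causal LM supported by vLLM would do.
MODEL = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
llm = LLM(model=MODEL)
sampling_params = SamplingParams(max_tokens=64)

# Hypothetical per-tenant fine-tuned embedding weights, each a
# [vocab_size, hidden_size] tensor saved separately from the base model.
embed_a = torch.nn.Embedding.from_pretrained(torch.load("tenant_a_embedding.pt"))
embed_b = torch.nn.Embedding.from_pretrained(torch.load("tenant_b_embedding.pt"))

def embeds_for(prompt: str, embed_layer: torch.nn.Embedding) -> torch.Tensor:
    # Tokenize with the shared tokenizer, then embed the token IDs with the
    # tenant-specific layer; the result is [seq_len, hidden_size].
    token_ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        return embed_layer(token_ids).squeeze(0)

inputs_embeds = [
    embeds_for("Summarize our Q3 report.", embed_a),
    embeds_for("Summarize our Q3 report.", embed_b),
]
outputs = llm.generate(
    [{"prompt_embeds": input_embeds} for input_embeds in inputs_embeds],
    sampling_params,
)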

@github-actions (bot) commented Jan 2, 2025

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of CI tests to quickly catch errors. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to the Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add the ready label to the PR
  • Enable auto-merge.

🚀

@mergify mergify bot added the frontend label Jan 2, 2025
@Bryce1010 Bryce1010 force-pushed the feature/vllm/add-input-embedding branch 5 times, most recently from 0429875 to 8c7751d Compare January 2, 2025 11:17
@DarkLight1337 (Member)

Thanks for opening this PR. Can you explain what it is about and how it relates to #11375?

@Bryce1010 (Contributor, Author) commented Jan 2, 2025

> Thanks for opening this PR. Can you explain what it is about and how it relates to #11375?

@DarkLight1337 Sorry for referencing the incorrect issue number. Please refer to the following issues instead:
#416
#8323

@DarkLight1337 (Member) commented Jan 2, 2025

There is a related PR #6869, but your solution can sidestep the problem of missing input token IDs (at the cost of being less efficient).

Can you update this PR with the unit tests in #6869 to see whether this solution works correctly?
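
For readers following along, a hedged sketch of the kind of consistency check such a test might perform (this is not the actual test code from #6869; the model name, prompt, and dtype cast are illustrative assumptions): with greedy sampling, generating from prompt_embeds produced by the model's own embedding layer should reproduce generation from the raw token IDs.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from vllm import LLM, SamplingParams

MODEL = "facebook/opt-125m"  # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(MODEL)
hf_model = AutoModelForCausalLM.from_pretrained(MODEL)

llm = LLM(model=MODEL)
params = SamplingParams(temperature=0.0, max_tokens=16)  # greedy, deterministic

prompt = "Hello, my name is"
token_ids = tokenizer(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    # Embed the same tokens with the model's own input embedding layer.
    embeds = hf_model.get_input_embeddings()(token_ids).squeeze(0)
embeds = embeds.to(torch.float16)  # match the engine's compute dtype (assumption)

text_from_ids = llm.generate(prompt, params)[0].outputs[0].text
text_from_embeds = llm.generate({"prompt_embeds": embeds}, params)[0].outputs[0].text
assert text_from_embeds == text_from_ids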

@Bryce1010 Bryce1010 force-pushed the feature/vllm/add-input-embedding branch 2 times, most recently from 5963444 to f70bbb3 Compare January 6, 2025 11:06
@Bryce1010 Bryce1010 force-pushed the feature/vllm/add-input-embedding branch 4 times, most recently from 5c82b7a to 868a730 Compare January 8, 2025 06:29
@mergify (bot) commented Feb 8, 2025

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @Bryce1010.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Feb 8, 2025
@OswaldoBornemann

@Bryce1010 Hi, when I used your branch, I ran the following case

import torch

test_prompts = [
    "Who are you",
]

# `llm` and `device` are assumed to be defined earlier (an LLM instance and a CUDA device).
test_embeds = torch.randn(len(test_prompts), 10, 892).to(dtype=torch.float32)  # [batch_size, seq_len, hidden_size]
llm.generate(
    {"prompt_embeds": test_embeds.to(device)}
)

But I got the following error: RuntimeError: Error in model execution (input dumped to /tmp/err_execute_model_input_20250211-170512.pkl): expected scalar type Float but found BFloat16

@OswaldoBornemann

@Bryce1010 I changed test_embeds to test_embeds = torch.randn(len(test_prompts), 10, 896).to(dtype=torch.bfloat16)  # [batch_size, seq_len, hidden_size].

Now the error is AssertionError: Error in model execution (input dumped to /tmp/err_execute_model_input_20250211-172100.pkl):
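
A hedged note on the two errors above: the embeddings must match both the model's hidden size and the engine's compute dtype, and, judging from the per-prompt dict form in the PR description, each prompt_embeds tensor is likely expected to be a 2-D [seq_len, hidden_size] slice rather than a batched 3-D tensor. A minimal sketch of reading both values from the model config instead of hard-coding them (the model name is an illustrative guess at the model used above):

import torch
from transformers import AutoConfig

MODEL = "Qwen/Qwen2-0.5B-Instruct"  # illustrative; a model with hidden_size 896
config = AutoConfig.from_pretrained(MODEL)
print(config.hidden_size, config.torch_dtype)  # 896 and bfloat16 for this model

seq_len = 10
# Per-prompt tensor: 2-D [seq_len, hidden_size], in the engine's compute dtype.
test_embeds = torch.randn(seq_len, config.hidden_size, dtype=torch.bfloat16)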

@qthequartermasterman (Contributor)

@Bryce1010 @DarkLight1337 Is there an expected time for when this will be updated/merged?

@sidhartha-roy

I would love to see this feature merged. Is there a timeline for when you expect this to be reviewed and merged, @njhill and @alexm-redhat?

@mergify mergify bot removed the needs-rebase label Feb 20, 2025
@Bryce1010 Bryce1010 force-pushed the feature/vllm/add-input-embedding branch from 209e8ae to 868a730 Compare February 20, 2025 12:15
@Bryce1010 Bryce1010 force-pushed the feature/vllm/add-input-embedding branch from 868a730 to cd8fcf8 Compare March 6, 2025 02:35
@mergify mergify bot removed the needs-rebase label Mar 6, 2025
@Bryce1010 (Contributor, Author)

> @Bryce1010 do tell me when it's ready for another round of review. It seems that many people are interested in this.

@DarkLight1337 Thanks for your response! The code has been rebased. Could you please review it and let me know if any further changes are needed before merging into the main branch?

@Bryce1010 Bryce1010 force-pushed the feature/vllm/add-input-embedding branch 3 times, most recently from 3fdc9b5 to 3451391 Compare March 6, 2025 03:24
@DarkLight1337 (Member)

Can you fix the pre-commit errors?

@Bryce1010 Bryce1010 force-pushed the feature/vllm/add-input-embedding branch 3 times, most recently from 6f480cb to c51e4a1 Compare March 6, 2025 07:09
@Bryce1010 Bryce1010 force-pushed the feature/vllm/add-input-embedding branch from c51e4a1 to fa8caec Compare March 6, 2025 07:14
@DarkLight1337 (Member)

There are still errors; please fix them.

@mergify (bot) commented Mar 11, 2025

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @Bryce1010.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Mar 11, 2025
@Reek12138

I would like to ask when this feature will be merged into the main branch. Once merged, will it support Python 3.8?

@DarkLight1337 (Member) commented Mar 14, 2025

Support for Python 3.8 was dropped a while ago, as it has reached end of life.

@liangwythu

Would love to see this one merged!

@qthequartermasterman (Contributor)

@Bryce1010 @DarkLight1337 Since it has been a few weeks without an update, and I would love to have this feature, I took the initiative to rebase this PR onto main and fix the pre-commit errors as previously asked. I can't commit to this branch, since it's owned by @Bryce1010, so these updates are in #15428.

@DarkLight1337 (Member)

> @Bryce1010 @DarkLight1337 Since it has been a few weeks without an update, and I would love to have this feature, I took the initiative to rebase this PR onto main and fix the pre-commit errors as previously asked. I can't commit to this branch, since it's owned by @Bryce1010, so these updates are in #15428.

That sounds good, thanks for picking this up!

@leonardtang

> @Bryce1010 @DarkLight1337 Since it has been a few weeks without an update, and I would love to have this feature, I took the initiative to rebase this PR onto main and fix the pre-commit errors as previously asked. I can't commit to this branch, since it's owned by @Bryce1010, so these updates are in #15428.

My goat

@DarkLight1337 (Member)

Closing in favor of #15428

