
Conversation


@noooop noooop commented May 27, 2025

  1. improve embed testing
  2. add BAAI embed tests
  3. fix test_initialization
  4. rel=MTEB_EMBED_TOL is too strict for testing; suggest using abs=MTEB_EMBED_TOL instead

e.g.

Snowflake/snowflake-arctic-embed-l

In [1]: import pytest

In [2]: MTEB_EMBED_TOL = 1e-4

In [3]: vllm_main_score = 0.6363612797829713

In [4]: st_main_score = 0.6362746289758823

In [5]: st_main_score == pytest.approx(vllm_main_score, abs=MTEB_EMBED_TOL)
Out[5]: True

In [6]: st_main_score == pytest.approx(vllm_main_score, rel=MTEB_EMBED_TOL)
Out[6]: False

Alibaba-NLP/gte-Qwen2-1.5B-instruct

In [1]: import pytest

In [2]: MTEB_EMBED_TOL = 1e-4

In [3]: vllm_main_score = 0.7583957427443586

In [5]: st_main_score = 0.758473459018872

In [6]: st_main_score == pytest.approx(vllm_main_score, abs=MTEB_EMBED_TOL)
Out[6]: True

In [7]: st_main_score == pytest.approx(vllm_main_score, rel=MTEB_EMBED_TOL)
Out[7]: False
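To make the failure mode concrete, here is a minimal pure-Python sketch of the two tolerance modes (scores copied from the first example above): `abs` compares the raw difference against the tolerance directly, while `rel` scales the tolerance by the magnitude of the expected value, which shrinks the effective bound below the observed difference.

```python
MTEB_EMBED_TOL = 1e-4
vllm_main_score = 0.6363612797829713
st_main_score = 0.6362746289758823

diff = abs(st_main_score - vllm_main_score)  # ~8.67e-5

# abs=TOL accepts any raw difference up to 1e-4:
abs_ok = diff <= MTEB_EMBED_TOL

# rel=TOL scales the tolerance by the expected value (~0.64),
# so the effective bound shrinks to ~6.4e-5 and the check fails:
rel_ok = diff <= MTEB_EMBED_TOL * abs(vllm_main_score)

print(abs_ok, rel_ok)  # True False
```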

Tested all of the following offline, including the skipped models:

pytest tests/entrypoints/openai/correctness/test_mteb.py

pytest tests/entrypoints/openai/test_embedding.py
pytest tests/entrypoints/openai/test_embedding_dimensions.py

pytest tests/models/language/pooling/test_nomic.py
pytest tests/models/language/pooling/test_snowflake_arctic_embed.py
pytest tests/models/language/pooling/test_jina.py
pytest tests/models/language/pooling/test_gte.py
pytest tests/models/language/pooling/test_baai.py

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they run only the fastcheck CI, a small and essential subset of tests that quickly catches errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀


noooop commented May 27, 2025

@DarkLight1337

Found a serious bug related to the nomic context extension:

we did not use the nomic context extension method.

For nomic-ai/nomic-embed-text-v1, inputs longer than 2048 tokens produce NaN.
For nomic-embed-text-v2-moe, the max length is capped at the default of 512.

We need to fix this.
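For background (a minimal illustration, not vLLM's or Nomic's actual implementation): NTK-style context extension works by enlarging the rotary base (`rope_theta`), which stretches the wavelength of every RoPE frequency band so that positions beyond the trained length stay closer to the training distribution. The function below is a hypothetical sketch of that effect:

```python
# Hypothetical illustration of NTK-style context extension via rope_theta:
# enlarging the rotary base lowers every inverse frequency, stretching the
# wavelength of each band so longer positions remain in-distribution.
def rope_inv_freq(dim: int, base: float) -> list[float]:
    return [1.0 / (base ** (2 * i / dim)) for i in range(dim // 2)]

orig = rope_inv_freq(64, base=10000.0)
scaled = rope_inv_freq(64, base=10000.0 * 4)  # larger base -> longer wavelengths

# every scaled frequency is lower than (or equal to) the original
assert all(s <= o for s, o in zip(scaled, orig))
```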


noooop commented May 27, 2025

INFO 05-27 15:16:02 [llm_engine.py:230] Initializing a V0 LLM engine (v0.1.dev6754+g27bebcd.d20250527) with config: model='/home/noooop/.cache/huggingface/hub/nomic-embed-text-v2-moe/', speculative_config=None, tokenizer='/home/noooop/.cache/huggingface/hub/nomic-embed-text-v2-moe/'

....

max_seq_len=512

....

WARNING 05-27 15:16:03 [bert_with_rope.py:541] We did not use the nomic context extension method, current max_model_len is 2048. The context extension uses vllm style rope_theta and rope_scaling.

Because the config is initialized before config_verify runs, the logs may be confusing.

@noooop noooop marked this pull request as draft May 28, 2025 04:10
@noooop noooop marked this pull request as ready for review May 28, 2025 04:17

noooop commented May 28, 2025

@DarkLight1337

QvQ

This PR is needed to fix models/test_initialization.py::test_can_initialize[NomicBertModel]

https://buildkite.com/vllm/ci/builds/20918#0197149b-bcec-4890-960e-4699d62c1daf

Here the test sets max_model_len to the model's original maximum length, but with the Nomic context extension disabled that length is not supported:

        LLM(
            model_info.default,
            tokenizer=model_info.tokenizer,
            tokenizer_mode=model_info.tokenizer_mode,
            speculative_config={
                "model": model_info.speculative_model,
                "num_speculative_tokens": 1,
            } if model_info.speculative_model else None,
            trust_remote_code=model_info.trust_remote_code,
            max_model_len=model_info.max_model_len,  # <- sets max_model_len
            load_format="dummy",
            hf_overrides=hf_overrides,
        )

With this change, nomic-ai/nomic-embed-text-v2-moe passes the test.
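A hypothetical sketch of the verification logic implied above (the function name and limits are illustrative, not vLLM's actual API): when the context-extension rope settings are absent, a requested max_model_len above the trained length should be rejected rather than silently accepted.

```python
# Hypothetical sketch: if the nomic context-extension rope settings are
# absent, the usable context must be clamped to the model's trained length
# rather than the (extended) value requested by the caller.
TRAINED_MAX_LEN = 2048    # illustrative trained context length
EXTENDED_MAX_LEN = 8192   # illustrative extended context length

def effective_max_model_len(requested: int, context_extension: bool) -> int:
    limit = EXTENDED_MAX_LEN if context_extension else TRAINED_MAX_LEN
    if requested > limit:
        raise ValueError(
            f"max_model_len={requested} exceeds supported length {limit}; "
            "enable the nomic context extension to go beyond the trained length"
        )
    return requested
```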

@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label May 28, 2025
@DarkLight1337

Merging this first to fix basic models test. We can fix models/language/pooling/test_embedding.py::test_models[half-ssmits/Qwen2-7B-Instruct-embed-base] in another PR such as #18720

cc @Isotr0py

@vllm-bot vllm-bot merged commit de65fc8 into vllm-project:main May 28, 2025
57 of 63 checks passed
amitm02 pushed a commit to amitm02/vllm that referenced this pull request Jun 1, 2025
Signed-off-by: amit <amit.man@gmail.com>
@noooop noooop deleted the embed branch July 10, 2025 04:46
