
Conversation


@noooop noooop commented May 27, 2025

  1. improve embed testing
  2. add BAAI embed tests
  3. fix test_initialization
  4. rel=MTEB_EMBED_TOL is too strict for testing; suggest using abs=MTEB_EMBED_TOL instead

e.g.

Snowflake/snowflake-arctic-embed-l

In [1]: import pytest

In [2]: MTEB_EMBED_TOL = 1e-4

In [3]: vllm_main_score = 0.6363612797829713

In [4]: st_main_score = 0.6362746289758823

In [5]: st_main_score == pytest.approx(vllm_main_score, abs=MTEB_EMBED_TOL)
Out[5]: True

In [6]: st_main_score == pytest.approx(vllm_main_score, rel=MTEB_EMBED_TOL)
Out[6]: False

Alibaba-NLP/gte-Qwen2-1.5B-instruct

In [1]: import pytest

In [2]: MTEB_EMBED_TOL = 1e-4

In [3]: vllm_main_score = 0.7583957427443586

In [5]: st_main_score = 0.758473459018872

In [6]: st_main_score == pytest.approx(vllm_main_score, abs=MTEB_EMBED_TOL)
Out[6]: True

In [7]: st_main_score == pytest.approx(vllm_main_score, rel=MTEB_EMBED_TOL)
Out[7]: False
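To make the failure mode concrete, here is a minimal pure-Python sketch of the two tolerance modes (scores copied from the first example above): `abs` compares the raw difference against the tolerance directly, while `rel` scales the tolerance by the magnitude of the expected value, which shrinks the effective bound below the observed difference.

```python
MTEB_EMBED_TOL = 1e-4
vllm_main_score = 0.6363612797829713
st_main_score = 0.6362746289758823

diff = abs(st_main_score - vllm_main_score)  # ~8.67e-5

# abs=TOL accepts any raw difference up to 1e-4:
abs_ok = diff <= MTEB_EMBED_TOL

# rel=TOL scales the tolerance by the expected value (~0.64),
# so the effective bound shrinks to ~6.4e-5 and the check fails:
rel_ok = diff <= MTEB_EMBED_TOL * abs(vllm_main_score)

print(abs_ok, rel_ok)  # True False
```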

Tested all of the following offline, including the skipped models:

pytest tests/entrypoints/openai/correctness/test_mteb.py

pytest tests/entrypoints/openai/test_embedding.py
pytest tests/entrypoints/openai/test_embedding_dimensions.py

pytest tests/models/language/pooling/test_nomic.py
pytest tests/models/language/pooling/test_snowflake_arctic_embed.py
pytest tests/models/language/pooling/test_jina.py
pytest tests/models/language/pooling/test_gte.py
pytest tests/models/language/pooling/test_baai.py

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they run only the fastcheck CI, a small and essential subset of tests that quickly catches errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀


noooop commented May 27, 2025

@DarkLight1337

Found a serious bug related to the nomic context extension:

we did not use the nomic context extension method.

For nomic-ai/nomic-embed-text-v1, inputs longer than 2048 tokens produce NaN.
For nomic-embed-text-v2-moe, the max length is capped at the default of 512.

We need to fix this.
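For background (a minimal illustration, not vLLM's or Nomic's actual implementation): NTK-style context extension works by enlarging the rotary base (`rope_theta`), which stretches the wavelength of every RoPE frequency band so that positions beyond the trained length stay closer to the training distribution. The function below is a hypothetical sketch of that effect:

```python
# Hypothetical illustration of NTK-style context extension via rope_theta:
# enlarging the rotary base lowers every inverse frequency, stretching the
# wavelength of each band so longer positions remain in-distribution.
def rope_inv_freq(dim: int, base: float) -> list[float]:
    return [1.0 / (base ** (2 * i / dim)) for i in range(dim // 2)]

orig = rope_inv_freq(64, base=10000.0)
scaled = rope_inv_freq(64, base=10000.0 * 4)  # larger base -> longer wavelengths

# every scaled frequency is lower than (or equal to) the original
assert all(s <= o for s, o in zip(scaled, orig))
```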


noooop commented May 27, 2025

INFO 05-27 15:16:02 [llm_engine.py:230] Initializing a V0 LLM engine (v0.1.dev6754+g27bebcd.d20250527) with config: model='/home/noooop/.cache/huggingface/hub/nomic-embed-text-v2-moe/', speculative_config=None, tokenizer='/home/noooop/.cache/huggingface/hub/nomic-embed-text-v2-moe/'

....

max_seq_len=512

....

WARNING 05-27 15:16:03 [bert_with_rope.py:541] We did not use the nomic context extension method, current max_model_len is 2048. The context extension uses vllm style rope_theta and rope_scaling.

Because the config is initialized before config_verify runs, the logs may be confusing.

@noooop noooop marked this pull request as draft May 28, 2025 04:10
@noooop noooop marked this pull request as ready for review May 28, 2025 04:17

noooop commented May 28, 2025

@DarkLight1337

QvQ

This PR is needed to fix models/test_initialization.py::test_can_initialize[NomicBertModel]

https://buildkite.com/vllm/ci/builds/20918#0197149b-bcec-4890-960e-4699d62c1daf

Here the test sets max_model_len to the model's original maximum length, but with the Nomic context extension disabled that length is not supported:

        LLM(
            model_info.default,
            tokenizer=model_info.tokenizer,
            tokenizer_mode=model_info.tokenizer_mode,
            speculative_config={
                "model": model_info.speculative_model,
                "num_speculative_tokens": 1,
            } if model_info.speculative_model else None,
            trust_remote_code=model_info.trust_remote_code,
            max_model_len=model_info.max_model_len,  # <- sets max_model_len
            load_format="dummy",
            hf_overrides=hf_overrides,
        )

With this change, nomic-ai/nomic-embed-text-v2-moe passes the test.
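A hypothetical sketch of the verification logic implied above (the function name and limits are illustrative, not vLLM's actual API): when the context-extension rope settings are absent, a requested max_model_len above the trained length should be rejected rather than silently accepted.

```python
# Hypothetical sketch: if the nomic context-extension rope settings are
# absent, the usable context must be clamped to the model's trained length
# rather than the (extended) value requested by the caller.
TRAINED_MAX_LEN = 2048    # illustrative trained context length
EXTENDED_MAX_LEN = 8192   # illustrative extended context length

def effective_max_model_len(requested: int, context_extension: bool) -> int:
    limit = EXTENDED_MAX_LEN if context_extension else TRAINED_MAX_LEN
    if requested > limit:
        raise ValueError(
            f"max_model_len={requested} exceeds supported length {limit}; "
            "enable the nomic context extension to go beyond the trained length"
        )
    return requested
```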

@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label May 28, 2025
@DarkLight1337

Merging this first to fix basic models test. We can fix models/language/pooling/test_embedding.py::test_models[half-ssmits/Qwen2-7B-Instruct-embed-base] in another PR such as #18720

cc @Isotr0py

@vllm-bot vllm-bot merged commit de65fc8 into vllm-project:main May 28, 2025
57 of 63 checks passed
amitm02 pushed a commit to amitm02/vllm that referenced this pull request Jun 1, 2025
Signed-off-by: amit <amit.man@gmail.com>
@noooop noooop deleted the embed branch July 10, 2025 04:46
