[CI/Build] Fix multimodal tests #22491
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Force-pushed from 46d55ee to 52ac13f.
Code Review
This pull request aims to fix an issue in multimodal tests where different cache configurations were causing problems. The introduction of a globally cached _get_processor_cache function is a good approach. However, the implementation within MultiModalRegistry._get_processor_cache still incorrectly caches the processor cache at the instance level. This can lead to a stale cache being used when the model configuration changes. I've provided a critical comment with a suggested fix to ensure the correct cache is always retrieved.
vllm/multimodal/registry.py
Outdated
```diff
 def _get_processor_cache(self, model_config: "ModelConfig"):
     capacity_gb = model_config.mm_processor_cache_gb
     if capacity_gb is None:
         return None  # Overrides `disable_cache` argument

-    if self._processor_cache is None:
-        self._processor_cache = ProcessingCache(capacity_gb)
+    model_id = model_config.model
+    capacity_gb = model_config.mm_processor_cache_gb
+    self._processor_cache = _get_processor_cache(model_id, capacity_gb)
```
The current implementation of _get_processor_cache has a flaw. The check if self._processor_cache is None: causes the cache to be created only once and then reused for all subsequent calls, regardless of whether the model_config (and thus the required cache capacity) has changed. This can lead to using a cache with an incorrect size or using a cache when it should be disabled (e.g., when capacity_gb is 0), which can cause difficult-to-debug errors.
The new global _get_processor_cache function is correctly memoized with lru_cache, so it's safe and efficient to call it on every invocation of this method. This ensures the correct cache is used based on the current model_config.
```diff
 def _get_processor_cache(self, model_config: "ModelConfig"):
     capacity_gb = model_config.mm_processor_cache_gb
     if capacity_gb is None:
         return None  # Overrides `disable_cache` argument
-    if self._processor_cache is None:
-        self._processor_cache = ProcessingCache(capacity_gb)
-    model_id = model_config.model
-    capacity_gb = model_config.mm_processor_cache_gb
-    self._processor_cache = _get_processor_cache(model_id, capacity_gb)
+    capacity_gb = model_config.mm_processor_cache_gb
+    self._processor_cache = _get_processor_cache(capacity_gb)
     return self._processor_cache
```
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
This reverts commit cf8aab7.
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.
Purpose
Fix failing tests caused by #22441
The reason is that the tests use different model configs (some with the cache size set to 0 and others with the normal cache size), which require a different processor cache to be created for each setup.
Test Plan
Test Result
(Optional) Documentation Update