[MODEL ADDITION] Ovis2 Model Addition #15826
| 👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run  Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add  🚀 | 
5859227 to 96f0c10
96f0c10 to 1976b0d
| @Isotr0py I have a couple of questions. There are some files like 
 | 
| I think you can create a  | 
| OK, cool. Also, since they use the Qwen2 image preprocessor and I can't modify it to accept new kwargs, how would you expose the mm_kwargs to the image_processor call? @JumpingRain, how have you implemented it? | 
| I've added the config and the processor classes to the modeling file since they are not present in transformers. | 
| It seems that some of these changes have already landed in main. Can you rebase the PR on the main branch? | 
| @mlinmg @Isotr0py Hello, I'm currently using the following method to set max_partition as an initialization parameter for OvisProcessor:
import PIL.Image
from transformers.processing_utils import ProcessorMixin

class OvisProcessor(ProcessorMixin):
    attributes = ["image_processor", "tokenizer"]
    valid_kwargs = ["chat_template"]
    image_processor_class = "AutoImageProcessor"
    tokenizer_class = ("Qwen2Tokenizer", "Qwen2TokenizerFast")

    def __init__(self, image_processor=None, tokenizer=None, chat_template=None, **kwargs):
        # Fall back to the default placeholder tokens when the tokenizer does not define them.
        self.image_token = "<|image_pad|>" if not hasattr(tokenizer, "image_token") else tokenizer.image_token
        self.video_token = "<|video_pad|>" if not hasattr(tokenizer, "video_token") else tokenizer.video_token
        # Processor-level defaults; each can be overridden per call in preprocess_image.
        self.max_partition = kwargs.get('max_partition', 9)
        self.covering_threshold = kwargs.get('covering_threshold', 0.9)
        self.convert_to_rgb = kwargs.get('convert_to_rgb', True)
        self.return_tensors = kwargs.get('return_tensors', 'pt')
        super().__init__(image_processor, tokenizer, chat_template=chat_template, **kwargs)

    def preprocess_image(self, image: PIL.Image.Image, max_partition=None, covering_threshold=None, convert_to_rgb=None, return_tensors=None):
        # A per-call argument takes precedence; None means "use the processor default".
        max_partition = max_partition if max_partition is not None else self.max_partition
        covering_threshold = covering_threshold if covering_threshold is not None else self.covering_threshold
        convert_to_rgb = convert_to_rgb if convert_to_rgb is not None else self.convert_to_rgb
        return_tensors = return_tensors if return_tensors is not None else self.return_tensors
        # other code | 
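The fallback pattern in the snippet above (a per-call argument wins, and `None` means "use the value stored at initialization") can be sketched without any dependency on transformers. This is only an illustration: `PartitionDefaults` and `resolve` are hypothetical names that do not appear in the PR, and the default values are copied from the comment above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PartitionDefaults:
    """Hypothetical stand-in for the defaults OvisProcessor stores in __init__."""

    max_partition: int = 9
    covering_threshold: float = 0.9
    convert_to_rgb: bool = True
    return_tensors: str = "pt"

    def resolve(self, **overrides: Optional[Any]) -> dict:
        """Return per-call settings: an override wins unless it is None."""
        resolved = {}
        for name, default in vars(self).items():
            value = overrides.get(name)
            # Compare against None explicitly so falsy overrides such as
            # convert_to_rgb=False or max_partition=0 are still respected.
            resolved[name] = default if value is None else value
        unknown = set(overrides) - set(resolved)
        if unknown:
            raise TypeError(f"unexpected keyword(s): {sorted(unknown)}")
        return resolved


defaults = PartitionDefaults()
settings = defaults.resolve(max_partition=4, convert_to_rgb=False, covering_threshold=None)
print(settings)
```

The explicit `is None` check is the important design choice here: writing `value or default` instead would silently discard an override of `False` or `0`.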
| Thanks for the valuable contributions to Ovis. I kindly suggest using 'ovis2' or 'Ovis2' for the model_type and in the code, to help differentiate it from previous versions like Ovis, Ovis1.5, and Ovis1.6. This approach would also facilitate future versioning, such as Ovis2.X or Ovis3. | 
| 
 I've modified it, but you'll also need to modify the configuration/processing/auto config section file to use the corrected naming (Ovis2). | 
| This pull request has merge conflicts that must be resolved before it can be | 
| You should create a different HF repo with modified files, following the instructions on my pull request in the Ovis repo.
Marco | 
Signed-off-by: Marco <121761685+mlinmg@users.noreply.github.com>
23476d8 to 64985c1
| @mlinmg @Isotr0py Thank you for your outstanding work; it seems that Ovis is on track to integrate smoothly with vLLM! Is there anything I can assist with on my end? Additionally, you previously mentioned the need to modify the Ovis Hugging Face files. Since we want the released weights to remain compatible with historical code, we hope to keep the changes to the Ovis HF files minimal. In my local tests, I found that this can be accomplished by modifying the config.json and the tokenizer config. Could you please specify which parts of the Ovis HF code need to be modified to support vLLM usage? I can make the necessary adjustments quickly. | 
| 
 Currently, this PR requires using a modified tokenizer on HF ( | 
Signed-off-by: Isotr0py <2037008807@qq.com>
| Eagerly waiting for this support, guys @Isotr0py @DarkLight1337. Let me know if I can help in any way. | 
Tests can still pass on my side locally, let's put this forward!
Signed-off-by: Marco <121761685+mlinmg@users.noreply.github.com> Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Marco <121761685+mlinmg@users.noreply.github.com> Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
Signed-off-by: Marco <121761685+mlinmg@users.noreply.github.com> Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Yuqi Zhang <yuqizhang@google.com>
FIX #13251
FIX #13317
FIX #13441
FIX #14346
With this PR I want to add the Ovis architecture to vLLM, continuing the discussion at AIDC-AI/Ovis#70.