Enable multimodal support + qwen2.5-vl #92
Conversation
attafosu commented on Aug 20, 2025
- Enables v1 multimodal support
- Enables qwen2.5-vl: support for MRoPE
Signed-off-by: attafosu <thomas.atta-fosu@intel.com>
* Style formatting
* Extra mops
* appease yapf and ruff

Signed-off-by: attafosu <thomas.atta-fosu@intel.com>

Force-pushed from 5de633f to d8a23f5
/run-gaudi-tests

Only codeowners can request to run Gaudi tests. Contact list: kzawora-intel, xuechendi, mswiniarsk, adobrzyn
Signed-off-by: attafosu <thomas.atta-fosu@intel.com>
```diff
 token_ids = _async_h2d_tensor(token_ids, torch.int32)
-token_positions = _async_h2d_tensor(token_positions, torch.int32)
+if not self.uses_mrope:
+    token_positions = _async_h2d_tensor(token_positions, torch.int32)
```
Can we avoid hard-coding the tensor device as HPU in `mrope_token_positions = self._align_and_pad_mrope_positions()`? Then we wouldn't need the condition here deciding whether to do the H2D copy for `token_positions`.
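For illustration, here is a minimal runnable sketch of what the reviewer is suggesting. The helper name `_async_h2d_tensor` and the wrapper `positions_to_device` are illustrative stand-ins (the wrapper is not in the PR), and `device='cpu'` substitutes for `'hpu'` so the snippet runs anywhere: if the MRoPE path produced a CPU tensor, both branches would end in the same H2D transfer and the call-site special-casing would collapse.

```python
import torch

def _async_h2d_tensor(data, dtype, device='cpu'):
    # Sketch of the PR's helper; 'cpu' stands in for 'hpu' here.
    # Builds a tensor from host data and moves it to the target device
    # with a non-blocking copy.
    return torch.tensor(data, dtype=dtype).to(device=device, non_blocking=True)

def positions_to_device(token_positions, uses_mrope, device='cpu'):
    # Hypothetical wrapper: if _align_and_pad_mrope_positions returned a
    # CPU tensor instead of an HPU one, both branches would perform the
    # same unconditional host-to-device copy.
    if uses_mrope:
        # token_positions is assumed to already be a CPU tensor here.
        return token_positions.to(device=device, dtype=torch.int32,
                                  non_blocking=True)
    return _async_h2d_tensor(token_positions, torch.int32, device)

pos = positions_to_device([0, 1, 2, 3], uses_mrope=False)
mrope = positions_to_device(torch.zeros((3, 4), dtype=torch.int64),
                            uses_mrope=True)
```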
```diff
 token_ids_device = _async_h2d_tensor_copy(token_ids, self.device)
-positions_device = _async_h2d_tensor_copy(positions, self.device)
+positions_device = input_mrope_positions if self.uses_mrope \
+    else _async_h2d_tensor_copy(positions, self.device)
```
Same suggestion for `input_mrope_positions`: let's keep the original logic, where the tensor is first built on `'cpu'` and then moved with `_async_h2d_tensor_copy`.
Signed-off-by: attafosu <thomas.atta-fosu@intel.com>
```python
mrope_position_tensor = torch.full(out_shape,
                                   padding_gen,
                                   dtype=torch.int32,
                                   device='hpu')
```
Is this right? I assume we would initialize this as a CPU tensor first and then use the async H2D function to copy it to the device.
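A sketch of what the reviewer is asking for, under stated assumptions: `_async_h2d_tensor_copy` mirrors the PR's helper, the shape and padding value are made up for illustration, and `'cpu'` substitutes for `'hpu'` so the snippet runs anywhere. The point is to allocate the padded tensor on the host and then do one explicit async copy.

```python
import torch

def _async_h2d_tensor_copy(t, device='cpu'):
    # Stand-in for the PR's copy helper; 'cpu' substitutes for 'hpu'.
    return t.to(device=device, non_blocking=True)

out_shape = (3, 8)   # illustrative: (mrope dims, padded length)
padding_gen = -1     # illustrative padding value

# Initialize on the CPU first ...
mrope_position_tensor = torch.full(out_shape, padding_gen, dtype=torch.int32)
# ... then perform a single host-to-device transfer.
mrope_position_tensor = _async_h2d_tensor_copy(mrope_position_tensor)
```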
Good catch.
```python
def _async_h2d_tensor(data, dtype, device='hpu'):
    if isinstance(data, torch.Tensor):
        return data.to(device=device, dtype=dtype, non_blocking=True)
```
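For context, a runnable sketch of the helper without the tensor fast-path, roughly as it presumably looked before this change (`'cpu'` stands in for `'hpu'` so it runs anywhere). Since the call sites pass plain Python lists, no `isinstance` branch for tensor inputs is needed:

```python
import torch

def _async_h2d_tensor(data, dtype, device='cpu'):
    # Original form of the helper: build a tensor from host data, then
    # move it to the device with a non-blocking copy.
    # 'cpu' stands in for 'hpu' so the sketch runs anywhere.
    return torch.tensor(data, dtype=dtype).to(device=device, non_blocking=True)

# Call sites pass Python lists, e.g. token ids:
t = _async_h2d_tensor([1, 2, 3], torch.int32)
```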
Why was this line added?
It was unnecessary, and has been removed.
Signed-off-by: attafosu <thomas.atta-fosu@intel.com>
- Enables v1 multimodal support
- Enables qwen2.5-vl: support for MRoPE

Signed-off-by: attafosu <thomas.atta-fosu@intel.com>
Signed-off-by: Marcin Swiniarski <marcin.swiniarski@intel.com>