Skip to content

Conversation

quic-dhirajku
Copy link
Contributor

Updated the run_vlm_kv_model_on_pytorch and run_vlm_kv_model_on_ort methods to run for the latest dual QPC setup. Along with the required changes to be made in the Input Handler of VLMs.

Also updated the way head_dim is calculated for past_key_value creation as certain models now provide specific head_dim. We fallback to previous method if the parameter isn't found in the config.

…ethods to run for the latest dual QPC setup. Along with the required changes to be made in the Input Handler of VLMs.

Also updated the way head_dim is calculated for past_key_value creation as certain models now provide specific head_dim. We fallback to previous method if the parameter isn't found in the config.

Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant