modeling_llama_fastv_kvcache.py #21

cool-xiang · 2024-07-15T09:51:12Z

Thank you for your wonderful work! I noticed the file modeling_llama_fastv_kvcache.py, which is meant for saving the full KV cache of the visual tokens. Can this code be run as it stands? I see a few TODOs in it.

Thank you very much!

chenllliang · 2024-07-26T14:23:02Z

Hi, this part of the code has been discarded and was never tested; we have moved to Hugging Face's LLaVA implementation to support the KV cache. For details on the KV cache, please refer to this comment.

chenllliang self-assigned this Jul 16, 2024

cool-xiang closed this as completed Jul 27, 2024
