This repository was archived by the owner on Jun 3, 2025. It is now read-only.

[WIP] Fixing KV cache injection for LLaMA and Mistral #2244

Closed
wants to merge 4 commits from the feature/damian/fixing_injection branch

Conversation

dbogunowicz
Contributor

No description provided.

@dbogunowicz
Contributor Author

@abhinavnmagic could I get a review and some testing on this?

@abhinavnmagic
Contributor

Does this PR fix ONNX export for quantized models only, pruned models only, or both? I will test accordingly.

@dbogunowicz
Contributor Author

@abhinavnmagic it fixes export for all of the LLaMA models, both quantized and non-quantized.
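
For anyone picking up the testing, here is a minimal sketch of what exercising the fix might look like, assuming SparseML's KeyValueCacheInjector exporter transform; the module path, method names, and file layout below are assumptions drawn from the SparseML exporters API of the time, not details confirmed in this PR:

```python
# Hedged sketch: applying SparseML's KV cache injection to an exported
# ONNX text-generation model. The module path, class name, and directory
# layout here are assumptions, not confirmed by this PR.
import onnx

from sparseml.exporters.kv_cache_injector import KeyValueCacheInjector

# Load the ONNX graph produced by the text-generation export
# (path is hypothetical).
model = onnx.load("deployment/model.onnx")

# model_path should point at the directory containing the model's
# config.json, which the injector reads to select the LLaMA/Mistral
# cache configuration.
injector = KeyValueCacheInjector(model_path="deployment")

# Apply the graph transforms and write the cache-enabled model.
injector.export(model, "deployment/model_kvcache.onnx")
```

The same sketch should cover the quantized and non-quantized checkpoints, since the injection operates on the exported graph rather than the weights; check the branch for the exact signatures before relying on it.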

@jeanniefinks
Member

Per the main README announcement, SparseML is being deprecated by June 2, 2025. Closing the PR as work has been suspended; thank you for the input and support!

@jeanniefinks jeanniefinks deleted the feature/damian/fixing_injection branch May 29, 2025 23:39

3 participants