Motivation
Serving multiple LoRA adapters from a single inference server would allow one deployment to exhibit a variety of behaviors, potentially reducing the number of servers that need to be deployed and lowering costs. From a training perspective, because only the adapters are fine-tuned rather than the entire model, we can iterate through experimental cycles more quickly.
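To make the idea concrete, here is a minimal NumPy sketch (not vLLM code) of what multi-adapter serving boils down to: one frozen base weight `W` is shared, and each request selects a low-rank pair `(A, B)` so the effective weight becomes `W + B @ A`. The adapter names and dimensions below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 8, 8, 2  # hidden dims and LoRA rank (illustrative sizes)

# Frozen base weight, shared by every adapter.
W = rng.normal(size=(d, k))

# Two independent LoRA adapters; each is a low-rank (A, B) pair.
# Names are hypothetical examples of per-task behaviors.
adapters = {
    "chat": (rng.normal(size=(r, k)), rng.normal(size=(d, r))),
    "summarize": (rng.normal(size=(r, k)), rng.normal(size=(d, r))),
}

def forward(x, adapter=None):
    """Base projection plus the selected adapter's low-rank update."""
    y = W @ x
    if adapter is not None:
        A, B = adapters[adapter]
        y = y + B @ (A @ x)  # effective weight: W + B @ A
    return y

x = rng.normal(size=k)
# One base model, different behavior per request depending on the adapter:
y_chat = forward(x, "chat")
y_summ = forward(x, "summarize")
```

Because the base weight is never modified, switching behaviors per request is just a matter of which `(A, B)` pair is applied, which is what lets a single server host many adapters at once.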
Related resources
vLLM