
Comparison of Different Fine-Tuning Techniques for Conversational AI #2310

Open
ImamaDev opened this issue Jan 7, 2025 · 3 comments
Labels
contributions-welcome, good first issue, help wanted

Comments


ImamaDev commented Jan 7, 2025

Feature request

It would be incredibly helpful to have a clear comparison or support for various fine-tuning techniques specifically for conversational AI. This feature could include insights into their strengths, limitations, and ideal use cases, helping practitioners choose the right approach for their needs.

Here’s a list of techniques to consider (a minimal configuration sketch follows the list):

LoRA
AdaLoRA
Bone
VeRA
X-LoRA
LN Tuning
VB-LoRA
HRA (Householder Reflection Adaptation)
IA3 (Infused Adapter by Inhibiting and Amplifying Inner Activations)
Llama-Adapter
CPT (Context-aware Prompt Tuning), etc.
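
For context, most of these methods share the same PEFT workflow: create a method-specific config and wrap the base model with get_peft_model. Here is a minimal sketch using LoRA; the model name and target_modules are placeholder assumptions and depend on the base architecture:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base model; any causal LM from the Hub works the same way.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# LoRA configuration; target_modules must match the projection names of the
# chosen architecture (q_proj/v_proj are typical for Llama-style models).
config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # shows how few parameters LoRA actually trains
```

Swapping in another method is mostly a matter of replacing LoraConfig with that method's config class, which is part of why a side-by-side comparison of strengths and trade-offs would be so useful.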

Motivation

With the growing number of fine-tuning techniques for conversational AI, it can be challenging to identify the most suitable approach for specific use cases. A comprehensive comparison of these techniques—highlighting their strengths, limitations, and ideal scenarios—would save time, reduce trial-and-error, and empower users to make informed decisions. This feature would bridge the gap between research and practical application, enabling more effective model customization and deployment.

Your contribution

I’d be happy to collaborate on this! While I might not have a complete solution right now, I’m willing to contribute by gathering resources, reviewing papers, or helping organize comparisons. If others are interested in teaming up, we could work together on a PR to make this feature happen. Let’s connect and brainstorm how we can tackle this effectively!

BenjaminBossan (Member) commented:

Thanks for coming up with this proposal. Indeed, this has been on our backlog for a long time. As you can imagine, providing objective and useful information on this is a huge undertaking, since relying on the results reported in the papers can often be problematic.

As a long-term project, we plan to provide some kind of benchmark that compares all these methods in terms of runtime, memory usage, performance, etc., but I can't give a concrete date yet.

In the meantime, we have started to be more rigorous when new methods are added, requiring a clear description of their best use cases. There is still a lot of room for improvement, especially for methods that were added some time ago.

If you (and others) want to contribute, I think a good place to start would be to go through the individual methods in the PEFT docs and help improve the descriptions. If we can make them more uniform, with more details on the best use cases, pros, and cons, that would already be a nice improvement.

There are other places that could benefit from such a cleanup, e.g. the description of all the LoRA initialization methods.
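
For reference, the LoRA initialization method is selected through the init_lora_weights argument of LoraConfig. A rough sketch follows; the exact set of accepted string values depends on the PEFT version, and "gaussian" and "pissa" are given here as assumed examples:

```python
from peft import LoraConfig

# Default initialization (init_lora_weights=True): A is initialized with
# Kaiming-uniform, B with zeros, so the adapter starts as a no-op.
default_cfg = LoraConfig(r=8, target_modules=["q_proj", "v_proj"])

# Gaussian initialization of the A matrix instead of the default.
gaussian_cfg = LoraConfig(
    r=8,
    target_modules=["q_proj", "v_proj"],
    init_lora_weights="gaussian",
)

# PiSSA-style initialization derived from the principal singular vectors
# of the base weight (assumed string value; check the PEFT docs).
pissa_cfg = LoraConfig(
    r=8,
    target_modules=["q_proj", "v_proj"],
    init_lora_weights="pissa",
)
```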

BenjaminBossan added the good first issue, help wanted, and contributions-welcome labels on Jan 7, 2025
sparsh2 (Contributor) commented Jan 7, 2025

I would be interested in contributing as well.

BenjaminBossan (Member) commented:

> I would be interested in contributing as well.

Thanks for the offer. As mentioned, as a first step, we could use some help with updating the "blurbs" of the PEFT methods. For this, it's often sufficient to read a couple of sections from the paper. If anyone wants to work on one such method, please announce it here so that there is no duplicate work.
