
Implement NEFTune Augmentation for Improved Language Model Fine-tuning #718

Closed
5 tasks done
kostum123 opened this issue Oct 12, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@kostum123

kostum123 commented Oct 12, 2023

⚠️ Please check that this feature request hasn't been suggested before.

  • I searched previous Ideas in Discussions and didn't find any similar feature requests.
  • I searched previous Issues and didn't find any similar feature requests.

🔖 Feature description

I would like to propose adding a new method, NEFTune, to the Axolotl repository. NEFTune is a recent and highly effective augmentation technique for language model fine-tuning that has shown notable performance improvements across a variety of tasks without significantly increasing training time.

Motivation:

Axolotl currently supports QLoRA, LoRA, and full fine-tuning. While these methods are effective, integrating NEFTune can further enhance the resulting models. NEFTune works by adding random noise to the embedding vectors during training, which leads to substantial performance gains.
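The core operation is simple enough to sketch directly. The function below is a minimal illustration of the noise rule described in the paper (uniform noise scaled by alpha / sqrt(L * d)); `add_neftune_noise` and its default `noise_alpha` are illustrative names here, not existing Axolotl API:

```python
import math
import torch

def add_neftune_noise(embeddings: torch.Tensor, noise_alpha: float = 5.0) -> torch.Tensor:
    """Add NEFTune-style uniform noise to token embeddings.

    embeddings: (batch, seq_len, hidden_dim) tensor of input embeddings.
    noise_alpha: the paper's scaling hyperparameter (commonly around 5-15).
    """
    seq_len, hidden_dim = embeddings.shape[1], embeddings.shape[2]
    # Per the paper, noise is sampled uniformly and scaled by
    # alpha / sqrt(L * d), so longer sequences get smaller per-element noise.
    mag_norm = noise_alpha / math.sqrt(seq_len * hidden_dim)
    noise = torch.zeros_like(embeddings).uniform_(-mag_norm, mag_norm)
    return embeddings + noise
```

Note that this noise is applied only during training; at inference the embeddings are used unmodified.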

Benefits:

  1. Performance Boost: Incorporating NEFTune can significantly improve the performance of the models in various evaluations, such as AlpacaEval.

  2. Compatibility: NEFTune is designed to be compatible with existing fine-tuning methods, including qlora, lora, and full fine-tuning, ensuring ease of implementation.

  3. Generalization: NEFTune has demonstrated the ability to improve model performance on a range of modern instruction datasets, including Evol-Instruct, ShareGPT, and OpenPlatypus.

  4. Versatility: Even powerful models that undergo further refinement, such as those using RLHF like LLaMA-2-Chat, can benefit from the addition of NEFTune during training.

Paper Reference:

NEFTune: Noisy Embeddings Improve Instruction Finetuning (Jain et al., 2023)

✔️ Solution

Official Repo of the Method:
https://github.com/neelsjain/NEFTune

Request Implementation Steps:

  1. Evaluate the feasibility of integrating NEFTune into the Axolotl repository.

  2. Implement NEFTune as an optional augmentation method in the training pipeline.

  3. Provide clear documentation and guidelines for users on how to utilize NEFTune within the Axolotl framework.

  4. Conduct thorough testing and validation to ensure that the integration does not negatively impact existing functionalities.

  5. Monitor and maintain the NEFTune feature to keep it up-to-date with any changes in the Axolotl repository.

Additional Information:

I believe that incorporating NEFTune into the Axolotl repository can elevate the capabilities of the models and make them more competitive in various natural language understanding tasks. It aligns with the goal of continuous improvement and innovation in language model training.

I appreciate your consideration of this feature request and look forward to the potential enhancement of the Axolotl repository's capabilities.

❓ Alternatives

No response

📝 Additional Context

No response

Acknowledgements

  • My issue title is concise, descriptive, and in title casing.
  • I have searched the existing issues to make sure this feature has not been requested yet.
  • I have provided enough information for the maintainers to understand and evaluate this request.
@kostum123 added the enhancement label on Oct 12, 2023
@winglian
Collaborator

Probably just needs to monkeypatch it onto this: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L787
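A minimal sketch of that monkeypatch approach, using a PyTorch forward hook on the model's input embedding layer rather than editing `modeling_llama.py` directly. `attach_neftune` is a hypothetical helper, not an existing transformers or Axolotl function, and it assumes the model exposes `get_input_embeddings()` as Hugging Face models do:

```python
import math
import torch

def attach_neftune(model, noise_alpha: float = 5.0):
    """Register a forward hook that adds NEFTune noise to the input
    embeddings during training only, with no changes to the model code.

    Returns the hook handle; call .remove() on it to undo the patch.
    """
    embeddings = model.get_input_embeddings()

    def neftune_hook(module, inputs, output):
        if module.training:
            seq_len, hidden_dim = output.shape[1], output.shape[2]
            mag_norm = noise_alpha / math.sqrt(seq_len * hidden_dim)
            # Returning a tensor from a forward hook replaces the module output.
            output = output + torch.zeros_like(output).uniform_(-mag_norm, mag_norm)
        return output

    return embeddings.register_forward_hook(neftune_hook)
```

At inference time the hook is a no-op because `module.training` is False, matching the paper's train-only noise.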

@philpax
Contributor

philpax commented Nov 13, 2023

Was this closed by #721?

@enn-nafnlaus

And if it's implemented, how do we use it?
