From e8f66b8a425eced6c592089d40b8d33d82c2b2f0 Mon Sep 17 00:00:00 2001 From: Maria Khalusova Date: Wed, 26 Apr 2023 09:03:40 -0400 Subject: [PATCH] [docs] Supported models tables (#364) * supported models moved to index * minor fix * Update docs/source/index.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * apply feedback * Update docs/source/index.mdx Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com> --- docs/source/index.mdx | 95 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 93 insertions(+), 2 deletions(-) diff --git a/docs/source/index.mdx b/docs/source/index.mdx index 008be12f0b..6c7a994d3b 100644 --- a/docs/source/index.mdx +++ b/docs/source/index.mdx @@ -16,11 +16,102 @@ specific language governing permissions and limitations under the License. PEFT methods only fine-tune a small number of (extra) model parameters, significantly decreasing computational and storage costs because fine-tuning large-scale PLMs is prohibitively costly. Recent state-of-the-art PEFT techniques achieve performance comparable to that of full fine-tuning. -PEFT is seamlessly integrated with 🤗 Accelerate for large-scale models leveraging DeepSpeed and [Big Model Inference](https://huggingface.co/docs/accelerate/usage_guides/big_modeling). +PEFT is seamlessly integrated with 🤗 Accelerate for large-scale models leveraging DeepSpeed and [Big Model Inference](https://huggingface.co/docs/accelerate/usage_guides/big_modeling). -Supported methods include: +If you are new to PEFT, get started by reading the [Quicktour](quicktour) guide and conceptual guides for [LoRA](/conceptual_guides/lora) and [Prompting](/conceptual_guides/prompting) methods. + +## Supported methods 1. LoRA: [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/pdf/2106.09685.pdf) 2. Prefix Tuning: [Prefix-Tuning: Optimizing Continuous Prompts for Generation](https://aclanthology.org/2021.acl-long.353/), [P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks](https://arxiv.org/pdf/2110.07602.pdf) 3. P-Tuning: [GPT Understands, Too](https://arxiv.org/pdf/2103.10385.pdf) 4. Prompt Tuning: [The Power of Scale for Parameter-Efficient Prompt Tuning](https://arxiv.org/pdf/2104.08691.pdf) +5. AdaLoRA: [Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning](https://arxiv.org/abs/2303.10512) +6. [LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention](https://github.com/ZrrSkywalker/LLaMA-Adapter) +## Supported models + +The tables provided below list the PEFT methods and models supported for each task. To apply a particular PEFT method for +a task, please refer to the corresponding Task guides. + +### Causal Language Modeling + +| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | +|--------------| ---- | ---- | ---- | ---- | +| GPT-2 | ✅ | ✅ | ✅ | ✅ | +| Bloom | ✅ | ✅ | ✅ | ✅ | +| OPT | ✅ | ✅ | ✅ | ✅ | +| GPT-Neo | ✅ | ✅ | ✅ | ✅ | +| GPT-J | ✅ | ✅ | ✅ | ✅ | +| GPT-NeoX-20B | ✅ | ✅ | ✅ | ✅ | +| LLaMA | ✅ | ✅ | ✅ | ✅ | +| ChatGLM | ✅ | ✅ | ✅ | ✅ | + +### Conditional Generation + +| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | +| --------- | ---- | ---- | ---- | ---- | +| T5 | ✅ | ✅ | ✅ | ✅ | +| BART | ✅ | ✅ | ✅ | ✅ | + +### Sequence Classification + +| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | +| --------- | ---- | ---- | ---- | ---- | +| BERT | ✅ | ✅ | ✅ | ✅ | +| RoBERTa | ✅ | ✅ | ✅ | ✅ | +| GPT-2 | ✅ | ✅ | ✅ | ✅ | +| Bloom | ✅ | ✅ | ✅ | ✅ | +| OPT | ✅ | ✅ | ✅ | ✅ | +| GPT-Neo | ✅ | ✅ | ✅ | ✅ | +| GPT-J | ✅ | ✅ | ✅ | ✅ | +| Deberta | ✅ | | ✅ | ✅ | +| Deberta-v2 | ✅ | | ✅ | ✅ | + +### Token Classification + +| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | +| --------- | ---- | ---- | ---- | ---- | +| BERT | ✅ | ✅ | | | +| RoBERTa | ✅ | ✅ | | | +| GPT-2 | ✅ | ✅ | | | +| Bloom | ✅ | ✅ | | | +| OPT | ✅ | ✅ | | | +| GPT-Neo | ✅ | ✅ | | | +| GPT-J | ✅ | ✅ | | | +| Deberta | ✅ | | | | +| Deberta-v2 | ✅ | | | | + +### Text-to-Image Generation + +| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | +| --------- | ---- | ---- | ---- | ---- | +| Stable Diffusion | ✅ | | | | + + +### Image Classification + +| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | +| --------- | ---- | ---- | ---- | ---- | +| ViT | ✅ | | | | +| Swin | ✅ | | | | + +### Image to text (Multi-modal models) + +We have tested LoRA for [ViT](https://huggingface.co/docs/transformers/model_doc/vit) and [Swin](https://huggingface.co/docs/transformers/model_doc/swin) for fine-tuning on image classification. +However, it should be possible to use LoRA for any [ViT-based model](https://huggingface.co/models?pipeline_tag=image-classification&sort=downloads&search=vit) from 🤗 Transformers. +Check out the [Image classification](/task_guides/image_classification_lora) task guide to learn more. If you run into problems, please open an issue. + +| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | +| --------- | ---- | ---- | ---- | ---- | +| Blip-2 | ✅ | | | | + + +### Semantic Segmentation + +As with image-to-text models, you should be able to apply LoRA to any of the [segmentation models](https://huggingface.co/models?pipeline_tag=image-segmentation&sort=downloads). +It's worth noting that we haven't tested this with every architecture yet. Therefore, if you come across any issues, kindly create an issue report. + +| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | +| --------- | ---- | ---- | ---- | ---- | +| SegFormer | ✅ | | | | +