
Adding a ReFT notebook to the tutorials section #741

Merged 6 commits into adapter-hub:main on Oct 13, 2024

Conversation

julian-fong (Contributor)

This PR adds a tutorial notebook that uses the LoReft adapter to fine-tune roberta-base on the MNLI dataset.

Reviews appreciated!

@calpt (Member) left a comment


Looks great overall, thanks for working on this! Left some mostly minor comments to be addressed before we merge :)

Additionally, could you add this new notebook to the list in the README of the notebooks folder, thanks!

"\n",
"In this tutorial, we will be demonstrating how to fine-tune a language model using [Representation Finetuning for Language Models](https://arxiv.org/abs/2404.03592)\n",
"\n",
"We will use a traditional large language model and focus on fine tuning via ReFT adapters rather than the traditional full model fine tuning.\n",
Member:

I'm not sure "traditional large language model" is a good description here. Maybe something like "lightweight encoder model" is a better fit for RoBERTa nowadays.

"source": [
"### Model and Adapter initialization\n",
"\n",
"We load the `roberta-base` model along with the `LoReftConfig`. We can initalize a `reft` config with only one line of code, and can add it to our base model using the `add_adapter` function. On top of that, we can add a classification head to our adapter specifying 3 labels.\n",
Member:

Could be nice to link our docs on ReFT here: https://docs.adapterhub.ml/methods.html#reft and mention that they explain the supported config parameters.

" learning_rate=6e-4,\n",
" per_device_train_batch_size=32,\n",
" per_device_eval_batch_size=32,\n",
" num_train_epochs=2,\n",
Member:

Please make a note that usually you'd train longer and that this is only for demo purposes.
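As a hedged sketch of how the quoted arguments might be wired up (the output directory, the dataset variables, and the use of `AdapterTrainer` are assumptions, not taken from the notebook):

```python
from transformers import TrainingArguments
from adapters import AdapterTrainer

training_args = TrainingArguments(
    output_dir="./loreft_mnli",        # illustrative path
    learning_rate=6e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=2,                # kept small for demo purposes; train longer in practice
)

trainer = AdapterTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,       # assumed: tokenized MNLI train split
    eval_dataset=eval_dataset,         # assumed: tokenized MNLI validation split
)
trainer.train()
```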

Comment on lines 382 to 384
" input_ids = tokenizer(text, truncation=True, padding='max_length')\n",
" input_ids[\"input_ids\"] = torch.tensor(input_ids[\"input_ids\"])\n",
" input_ids[\"attention_mask\"] = torch.tensor(input_ids[\"attention_mask\"])\n",
Member:

You can add `return_tensors="pt"` (same as in `preprocess_function`) to the tokenizer call so you don't need to convert to tensors afterwards.
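Concretely, the suggested simplification would look something like this (a sketch against the quoted lines, not the final notebook code):

```python
# Let the tokenizer return PyTorch tensors directly instead of converting afterwards.
input_ids = tokenizer(text, truncation=True, padding="max_length", return_tensors="pt")
# input_ids["input_ids"] and input_ids["attention_mask"] are already torch.Tensor objects.
```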

julian-fong (Contributor, Author):

Ah, good catch!

@julian-fong (Contributor, Author)

> Looks great overall, thanks for working on this! Left some mostly minor comments to be addressed before we merge :)
>
> Additionally, could you add this new notebook to the list in the README of the notebooks folder, thanks!

Sounds good, should the Whisper notebook be added as well? I don't see it in the README.

@TimoImhof (Contributor)

> Looks great overall, thanks for working on this! Left some mostly minor comments to be addressed before we merge :)
> Additionally, could you add this new notebook to the list in the README of the notebooks folder, thanks!
>
> Sounds good, should the Whisper notebook be added as well? I don't see it in the README.

Yes please, we forgot that 👍

@calpt (Member) left a comment

Thanks for including the feedback!

@calpt calpt merged commit 6fefc9a into adapter-hub:main Oct 13, 2024