Examples: add options to save or push model (huggingface#1159)
callanwu authored and BenjaminBossan committed Nov 30, 2023
1 parent 570e992 commit 00dc6f4
Showing 6 changed files with 125 additions and 20 deletions.
@@ -210,6 +210,23 @@
"print(next(iter(test_dataloader)))"
]
},
{
"cell_type": "markdown",
"id": "42b14a11",
"metadata": {},
"source": [
"You can load the model from the Hub or from a local directory.\n",
"\n",
"- Load the model from the Hugging Face Hub (replace with your own model id):\n",
"```python\n",
"peft_model_id = \"username/twitter_complaints_bigscience_bloomz-7b1_LORA_CAUSAL_LM\"\n",
"```\n",
"- Or load the model from a local directory:\n",
"```python\n",
"peft_model_id = \"twitter_complaints_bigscience_bloomz-7b1_LORA_CAUSAL_LM\"\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 5,
@@ -244,7 +261,6 @@
"\n",
"max_memory = {0: \"1GIB\", 1: \"1GIB\", 2: \"2GIB\", 3: \"10GIB\", \"cpu\": \"30GB\"}\n",
"peft_model_id = \"smangrul/twitter_complaints_bigscience_bloomz-7b1_LORA_CAUSAL_LM\"\n",
"\n",
"config = PeftConfig.from_pretrained(peft_model_id)\n",
"model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, device_map=\"auto\", max_memory=max_memory)\n",
"model = PeftModel.from_pretrained(model, peft_model_id, device_map=\"auto\", max_memory=max_memory)"
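The loading pattern the notebook cells above use can be condensed into a small helper. This is a sketch, assuming `peft` and `transformers` are installed and the adapter repo or directory exists; `load_peft_model` is an illustrative name, not part of the examples.

```python
# Sketch of the Hub-or-local loading pattern: the adapter config records which
# base model it was trained on, so the same call works for both sources.
def load_peft_model(peft_model_id, device_map="auto"):
    from peft import PeftConfig, PeftModel
    from transformers import AutoModelForCausalLM

    # Read the adapter config to find the base model it was trained on.
    config = PeftConfig.from_pretrained(peft_model_id)
    base = AutoModelForCausalLM.from_pretrained(
        config.base_model_name_or_path, device_map=device_map
    )
    # Attach the adapter weights on top of the base model.
    return PeftModel.from_pretrained(base, peft_model_id, device_map=device_map)

# A Hub id ("user/repo") and a local directory name both work as peft_model_id:
hub_id = "smangrul/twitter_complaints_bigscience_bloomz-7b1_LORA_CAUSAL_LM"
local_id = "twitter_complaints_bigscience_bloomz-7b1_LORA_CAUSAL_LM"
```

The only difference between the two sources is whether the id contains a `user/` prefix; `from_pretrained` resolves Hub ids and local paths the same way.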
@@ -349,12 +349,21 @@ def test_preprocess_function(examples):
pred_df.to_csv(f"data/{dataset_name}/predictions.csv", index=False)

accelerator.wait_for_everyone()
model.push_to_hub(
"smangrul/"
+ f"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}".replace("/", "_"),
state_dict=accelerator.get_state_dict(model),
use_auth_token=True,
# Option 1: push the model to the Hugging Face Hub
# model.push_to_hub(
#     f"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}".replace("/", "_"),
#     token="hf_...",
# )
# token (`bool` or `str`, *optional*):
#     Token used for HTTP Bearer authorization when accessing remote files. If `True`, the token generated
#     by `huggingface-cli login` (stored in `~/.huggingface`) is used. Defaults to `True` if `repo_url`
#     is not specified. You can also get a token from https://huggingface.co/settings/token
# Option 2: save the model locally
peft_model_id = f"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}".replace(
"/", "_"
)
model.save_pretrained(peft_model_id)
accelerator.wait_for_everyone()
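The two options the script comments describe can be sketched as one small helper. `save_or_push` and its arguments are illustrative names, not part of the example scripts; `model` is assumed to be a trained `PeftModel`.

```python
# Condensed sketch of the two save options: push the adapter to the Hub, or
# write it to a local directory. Illustrative helper, not part of the scripts.
def save_or_push(model, peft_model_id, push=False, token=None):
    if push:
        # Option 1: upload to the Hub. `token` may be an "hf_..." string, or
        # None to fall back to the cached `huggingface-cli login` token.
        model.push_to_hub(peft_model_id, token=token)
    else:
        # Option 2: write the adapter weights and config to a local directory.
        model.save_pretrained(peft_model_id)
```

Saving locally writes only the small adapter files (weights plus `adapter_config.json`), which is why both options reuse the same `peft_model_id` string.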


35 changes: 33 additions & 2 deletions examples/causal_language_modeling/peft_prefix_tuning_clm.ipynb
@@ -1228,6 +1228,33 @@
" print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True))"
]
},
{
"cell_type": "markdown",
"id": "0e21c49b",
"metadata": {},
"source": [
"You can push the model to the Hub or save it locally.\n",
"\n",
"- Option 1: push the model to the Hugging Face Hub\n",
"```python\n",
"model.push_to_hub(\n",
"    f\"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\".replace(\"/\", \"_\"),\n",
"    token=\"hf_...\",\n",
")\n",
"```\n",
"  token (`bool` or `str`, *optional*): token used for HTTP Bearer authorization when accessing remote files. If `True`, the token generated by `huggingface-cli login` (stored in `~/.huggingface`) is used. Defaults to `True` if `repo_url` is not specified. You can also get a token from https://huggingface.co/settings/token\n",
"- Option 2: save the model locally\n",
"```python\n",
"peft_model_id = f\"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\".replace(\"/\", \"_\")\n",
"model.save_pretrained(peft_model_id)\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 16,
@@ -1236,7 +1263,9 @@
"outputs": [],
"source": [
"# saving model\n",
"peft_model_id = f\"{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\"\n",
"peft_model_id = f\"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\".replace(\n",
" \"/\", \"_\"\n",
")\n",
"model.save_pretrained(peft_model_id)"
]
},
@@ -1260,7 +1289,9 @@
"source": [
"from peft import PeftModel, PeftConfig\n",
"\n",
"peft_model_id = f\"{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\"\n",
"peft_model_id = f\"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\".replace(\n",
" \"/\", \"_\"\n",
")\n",
"\n",
"config = PeftConfig.from_pretrained(peft_model_id)\n",
"model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)\n",
35 changes: 33 additions & 2 deletions examples/causal_language_modeling/peft_prompt_tuning_clm.ipynb
@@ -1072,6 +1072,33 @@
" print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True))"
]
},
{
"cell_type": "markdown",
"id": "c8f35152",
"metadata": {},
"source": [
"You can push the model to the Hub or save it locally.\n",
"\n",
"- Option 1: push the model to the Hugging Face Hub\n",
"```python\n",
"model.push_to_hub(\n",
"    f\"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\".replace(\"/\", \"_\"),\n",
"    token=\"hf_...\",\n",
")\n",
"```\n",
"  token (`bool` or `str`, *optional*): token used for HTTP Bearer authorization when accessing remote files. If `True`, the token generated by `huggingface-cli login` (stored in `~/.huggingface`) is used. Defaults to `True` if `repo_url` is not specified. You can also get a token from https://huggingface.co/settings/token\n",
"- Option 2: save the model locally\n",
"```python\n",
"peft_model_id = f\"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\".replace(\"/\", \"_\")\n",
"model.save_pretrained(peft_model_id)\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 12,
@@ -1080,7 +1107,9 @@
"outputs": [],
"source": [
"# saving model\n",
"peft_model_id = f\"{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\"\n",
"peft_model_id = f\"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\".replace(\n",
" \"/\", \"_\"\n",
")\n",
"model.save_pretrained(peft_model_id)"
]
},
@@ -1116,7 +1145,9 @@
"source": [
"from peft import PeftModel, PeftConfig\n",
"\n",
"peft_model_id = f\"{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\"\n",
"peft_model_id = f\"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}\".replace(\n",
" \"/\", \"_\"\n",
")\n",
"\n",
"config = PeftConfig.from_pretrained(peft_model_id)\n",
"model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)\n",
@@ -298,12 +298,22 @@ def collate_fn(examples):
pred_df.to_csv(f"data/{dataset_name}/predictions.csv", index=False)

accelerator.wait_for_everyone()
model.push_to_hub(
"smangrul/"
+ f"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}".replace("/", "_"),
state_dict=accelerator.get_state_dict(model),
use_auth_token=True,
# Option 1: push the model to the Hugging Face Hub
# model.push_to_hub(
#     f"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}".replace("/", "_"),
#     token="hf_...",
# )
# token (`bool` or `str`, *optional*):
#     Token used for HTTP Bearer authorization when accessing remote files. If `True`, the token generated
#     by `huggingface-cli login` (stored in `~/.huggingface`) is used. Defaults to `True` if `repo_url`
#     is not specified. You can also get a token from https://huggingface.co/settings/token

# Option 2: save the model locally
peft_model_id = f"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}".replace(
"/", "_"
)
model.save_pretrained(peft_model_id)
accelerator.wait_for_everyone()
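The `.replace("/", "_")` in the id construction above is load-bearing: base-model names like `bigscience/bloomz-7b1` contain a slash, which would otherwise produce a nested directory (or an invalid Hub repo name). A minimal sketch, using plain strings that mirror the twitter_complaints example:

```python
# Build a filesystem- and Hub-safe PEFT model id from the run's settings.
# The names below are illustrative stand-ins for the script variables.
dataset_name = "twitter_complaints"
model_name_or_path = "bigscience/bloomz-7b1"
peft_type, task_type = "LORA", "CAUSAL_LM"

# Replace the slash from the base-model name so the id is a single path segment.
peft_model_id = f"{dataset_name}_{model_name_or_path}_{peft_type}_{task_type}".replace("/", "_")
print(peft_model_id)  # twitter_complaints_bigscience_bloomz-7b1_LORA_CAUSAL_LM
```

The result matches the adapter id used throughout these examples, so the same string works for `save_pretrained`, `push_to_hub`, and later `from_pretrained` calls.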


@@ -125,11 +125,19 @@ def preprocess_function(examples):
accelerator.print(f"{eval_preds[:10]=}")
accelerator.print(f"{dataset['validation'][label_column][:10]=}")
accelerator.wait_for_everyone()
model.push_to_hub(
"smangrul/" + f"{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}".replace("/", "_"),
state_dict=accelerator.get_state_dict(model),
use_auth_token=True,
)
# Option 1: push the model to the Hugging Face Hub
# model.push_to_hub(
#     f"{dataset_name}_{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}".replace("/", "_"),
#     token="hf_...",
# )
# token (`bool` or `str`, *optional*):
#     Token used for HTTP Bearer authorization when accessing remote files. If `True`, the token generated
#     by `huggingface-cli login` (stored in `~/.huggingface`) is used. Defaults to `True` if `repo_url`
#     is not specified. You can also get a token from https://huggingface.co/settings/token
# Option 2: save the model locally
peft_model_id = f"{model_name_or_path}_{peft_config.peft_type}_{peft_config.task_type}".replace("/", "_")
model.save_pretrained(peft_model_id)
accelerator.wait_for_everyone()

