From 856ea295eecea611f8c700c951f851273524c581 Mon Sep 17 00:00:00 2001 From: Shane Adams Date: Sun, 7 May 2023 23:19:34 -0700 Subject: [PATCH 1/8] Create RedPajama-3B.md --- docs/RedPajama-3B.md | 64 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) create mode 100644 docs/RedPajama-3B.md diff --git a/docs/RedPajama-3B.md b/docs/RedPajama-3B.md new file mode 100644 index 0000000..9a15ced --- /dev/null +++ b/docs/RedPajama-3B.md @@ -0,0 +1,64 @@ +# RedPajama-3B + +In order to fine-tune the RedPajama 3B models, please follow these steps: + +First clone the OpenChatKit repo: + +```shell +git clone git@github.com:togethercomputer/OpenChatKit.git +``` + +Next install dependencies as instructed by the OpenChatKit repo. + +# Prepare Weights + +```shell +python pretrained/RedPajama-3B/prepare.py +``` + +This script will download the weight from HuggingFace and prepare it for finetuning. The prepared weights will be saved at + +``` +pretrained/RedPajama-3B/togethercomputer_RedPajama-INCITE-Chat-3B-v1 +``` + +# Prepare Fine Tuning Data + +We now need to preapre the training data. We provide an example script that downloads a small slice of data from OIG. +To download this sample dataset, please run: + +``` +bash data/OIG-chip2/prepare.sh +```` + +The sample dataset will be saved at + +``` +data/OIG-chip2/unified_chip2.jsonl. +``` + +# Run Fine Tuning Script + +We provide an example training script. Please configure the parameters (e.g., learning_rate, batch_size, dataset_path) according to your hardware configuration. +Then to start training, simply run + +``` +bash training/finetune_RedPajama-INCITE-Chat-3B-v1.sh +``` + +# Convert to Huggingface Format + +Convert to HF format. The fine-tuned model will be saved to + +``` +model_ckpts/rp-incite-chat-3b-finetuned/checkpoint_{steps} +``` + +In order to use it for inference, you will need to convert it to the HuggingFace format. To do so, run the following script +(as an example, please change the checkpoint path, n-stages and n-layer-per-stage according to the training script): + +``` +python tools/convert_to_hf_gptneox.py --config-name togethercomputer/RedPajama-INCITE-Chat-3B-v1 --ckpt-path model_ckpts/rp-incite-chat-3b-fintuned/checkpoint_100/ --save-path model_ckpts/hf --n-stages 4 --n-layer-per-stage 8 +``` + +Then you are ready to go. From b1804e85d503ec2e7994777736d4260af00b024f Mon Sep 17 00:00:00 2001 From: Shane Adams Date: Sun, 7 May 2023 23:37:24 -0700 Subject: [PATCH 2/8] Update docs/RedPajama-3B.md Co-authored-by: Charles Srisuwananukorn <1967608+csris@users.noreply.github.com> --- docs/RedPajama-3B.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/RedPajama-3B.md b/docs/RedPajama-3B.md index 9a15ced..fdea492 100644 --- a/docs/RedPajama-3B.md +++ b/docs/RedPajama-3B.md @@ -1,4 +1,4 @@ -# RedPajama-3B +# Fine Tuning RedPajama-INCITE-Base-3B In order to fine-tune the RedPajama 3B models, please follow these steps: From 637c5f1f80736276a2dce461df35e844fcb8e206 Mon Sep 17 00:00:00 2001 From: Shane Adams Date: Sun, 7 May 2023 23:37:33 -0700 Subject: [PATCH 3/8] Update docs/RedPajama-3B.md Co-authored-by: Charles Srisuwananukorn <1967608+csris@users.noreply.github.com> --- docs/RedPajama-3B.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/RedPajama-3B.md b/docs/RedPajama-3B.md index fdea492..1f4c709 100644 --- a/docs/RedPajama-3B.md +++ b/docs/RedPajama-3B.md @@ -1,6 +1,6 @@ # Fine Tuning RedPajama-INCITE-Base-3B -In order to fine-tune the RedPajama 3B models, please follow these steps: +In order to fine-tune the Fine Tuning RedPajama-INCITE-Base-3B model, please follow these steps: First clone the OpenChatKit repo: From bd1d9668f31b1d7e0c6a8c6ea0ce210a312f8a3f Mon Sep 17 00:00:00 2001 From: Shane Adams Date: Sun, 7 May 2023 23:37:41 -0700 Subject: [PATCH 4/8] Update docs/RedPajama-3B.md Co-authored-by: Charles Srisuwananukorn <1967608+csris@users.noreply.github.com> --- docs/RedPajama-3B.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/RedPajama-3B.md b/docs/RedPajama-3B.md index 1f4c709..f80de06 100644 --- a/docs/RedPajama-3B.md +++ b/docs/RedPajama-3B.md @@ -24,7 +24,7 @@ pretrained/RedPajama-3B/togethercomputer_RedPajama-INCITE-Chat-3B-v1 # Prepare Fine Tuning Data -We now need to preapre the training data. We provide an example script that downloads a small slice of data from OIG. +We now need to prepare the training data. We provide an example script that downloads a small slice of data from OIG. To download this sample dataset, please run: ``` From a327f053143b7dd4fbe079add4328b5b53160d73 Mon Sep 17 00:00:00 2001 From: Shane Adams Date: Sun, 7 May 2023 23:38:45 -0700 Subject: [PATCH 5/8] Update docs/RedPajama-3B.md Co-authored-by: Charles Srisuwananukorn <1967608+csris@users.noreply.github.com> --- docs/RedPajama-3B.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/RedPajama-3B.md b/docs/RedPajama-3B.md index f80de06..5bed0c1 100644 --- a/docs/RedPajama-3B.md +++ b/docs/RedPajama-3B.md @@ -34,7 +34,7 @@ bash data/OIG-chip2/prepare.sh The sample dataset will be saved at ``` -data/OIG-chip2/unified_chip2.jsonl. +data/OIG-chip2/unified_chip2.jsonl ``` # Run Fine Tuning Script From 7d7e9bff777cae61d2d312c754f1b7e4020489de Mon Sep 17 00:00:00 2001 From: Shane Adams Date: Sun, 7 May 2023 23:38:58 -0700 Subject: [PATCH 6/8] Update docs/RedPajama-3B.md Co-authored-by: Charles Srisuwananukorn <1967608+csris@users.noreply.github.com> --- docs/RedPajama-3B.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/docs/RedPajama-3B.md b/docs/RedPajama-3B.md index 5bed0c1..2601e08 100644 --- a/docs/RedPajama-3B.md +++ b/docs/RedPajama-3B.md @@ -58,7 +58,12 @@ In order to use it for inference, you will need to convert it to the HuggingFace (as an example, please change the checkpoint path, n-stages and n-layer-per-stage according to the training script): ``` -python tools/convert_to_hf_gptneox.py --config-name togethercomputer/RedPajama-INCITE-Chat-3B-v1 --ckpt-path model_ckpts/rp-incite-chat-3b-fintuned/checkpoint_100/ --save-path model_ckpts/hf --n-stages 4 --n-layer-per-stage 8 +python tools/convert_to_hf_gptneox.py + --config-name togethercomputer/RedPajama-INCITE-Chat-3B-v1 + --ckpt-path model_ckpts/rp-incite-chat-3b-fintuned/checkpoint_100/ + --save-path model_ckpts/hf + --n-stages 4 + --n-layer-per-stage 8 ``` Then you are ready to go. From 937476af6a39265b3b0ce4ea9935eeb9d592f666 Mon Sep 17 00:00:00 2001 From: Shane Adams Date: Sun, 7 May 2023 23:42:10 -0700 Subject: [PATCH 7/8] various feedback --- docs/finetuning-RedPajama-3B.md | 70 +++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) create mode 100644 docs/finetuning-RedPajama-3B.md diff --git a/docs/finetuning-RedPajama-3B.md b/docs/finetuning-RedPajama-3B.md new file mode 100644 index 0000000..0114033 --- /dev/null +++ b/docs/finetuning-RedPajama-3B.md @@ -0,0 +1,70 @@ +# RedPajama-3B + +In this tutorial, you will learn how to fine-tune a base LLM on a sample of data. By the end of +the tutorial, you will have fine-tuned the RedPajama-INCITE-Base-3B model using a sample of +chat data from the OIG dataset. You can adapt this tutorial to fine-tune with your own data. + +In order to fine-tune the RedPajama 3B models, please follow these steps: + +First clone the OpenChatKit repo: + +```shell +git clone git@github.com:togethercomputer/OpenChatKit.git +``` + +Next install dependencies as instructed by the OpenChatKit repo. + +# Prepare Weights + +```shell +python pretrained/RedPajama-3B/prepare.py +``` + +This script will download the weight from HuggingFace and prepare it for finetuning. The prepared weights will be saved at + +``` +pretrained/RedPajama-3B/togethercomputer_RedPajama-INCITE-Chat-3B-v1 +``` + +# Prepare Fine Tuning Data + +We now need to preapre the training data. We provide an example script that downloads a small slice of data from OIG. +To download this sample dataset, please run: + +``` +bash data/OIG-chip2/prepare.sh +```` + +The sample dataset will be saved at + +``` +data/OIG-chip2/unified_chip2.jsonl. +``` + +# Run Fine Tuning Script + +We provide an example training script. Please configure the parameters (e.g., learning_rate, batch_size, dataset_path) according to your hardware configuration. +Then to start training, simply run + +``` +bash training/finetune_RedPajama-INCITE-Chat-3B-v1.sh +``` + +# Convert to Huggingface Format + +Convert to HF format. The fine-tuned model will be saved to + +``` +model_ckpts/rp-incite-chat-3b-finetuned/checkpoint_{steps} +``` + +In order to use it for inference, you will need to convert it to the HuggingFace format. To do so, run the following script +(as an example, please change the checkpoint path, n-stages and n-layer-per-stage according to the training script): + +The default for n-stages used in the training script is 10 and the n-layer-per-stage is 8. + +``` +python tools/convert_to_hf_gptneox.py --config-name togethercomputer/RedPajama-INCITE-Chat-3B-v1 --ckpt-path model_ckpts/rp-incite-chat-3b-fintuned/checkpoint_100/ --save-path model_ckpts/hf --n-stages 4 --n-layer-per-stage 8 +``` + +Then you are ready to go. From 86d425adf120e859ce3d7161ecfe0ea4e5a10ea4 Mon Sep 17 00:00:00 2001 From: Xiaozhe Yao Date: Mon, 8 May 2023 07:49:36 +0000 Subject: [PATCH 8/8] update finetuning document --- docs/RedPajama-3B.md | 69 --------------------------------- docs/finetuning-RedPajama-3B.md | 29 ++++++++++++-- 2 files changed, 25 insertions(+), 73 deletions(-) delete mode 100644 docs/RedPajama-3B.md diff --git a/docs/RedPajama-3B.md b/docs/RedPajama-3B.md deleted file mode 100644 index 2601e08..0000000 --- a/docs/RedPajama-3B.md +++ /dev/null @@ -1,69 +0,0 @@ -# Fine Tuning RedPajama-INCITE-Base-3B - -In order to fine-tune the Fine Tuning RedPajama-INCITE-Base-3B model, please follow these steps: - -First clone the OpenChatKit repo: - -```shell -git clone git@github.com:togethercomputer/OpenChatKit.git -``` - -Next install dependencies as instructed by the OpenChatKit repo. - -# Prepare Weights - -```shell -python pretrained/RedPajama-3B/prepare.py -``` - -This script will download the weight from HuggingFace and prepare it for finetuning. The prepared weights will be saved at - -``` -pretrained/RedPajama-3B/togethercomputer_RedPajama-INCITE-Chat-3B-v1 -``` - -# Prepare Fine Tuning Data - -We now need to prepare the training data. We provide an example script that downloads a small slice of data from OIG. -To download this sample dataset, please run: - -``` -bash data/OIG-chip2/prepare.sh -```` - -The sample dataset will be saved at - -``` -data/OIG-chip2/unified_chip2.jsonl -``` - -# Run Fine Tuning Script - -We provide an example training script. Please configure the parameters (e.g., learning_rate, batch_size, dataset_path) according to your hardware configuration. -Then to start training, simply run - -``` -bash training/finetune_RedPajama-INCITE-Chat-3B-v1.sh -``` - -# Convert to Huggingface Format - -Convert to HF format. The fine-tuned model will be saved to - -``` -model_ckpts/rp-incite-chat-3b-finetuned/checkpoint_{steps} -``` - -In order to use it for inference, you will need to convert it to the HuggingFace format. To do so, run the following script -(as an example, please change the checkpoint path, n-stages and n-layer-per-stage according to the training script): - -``` -python tools/convert_to_hf_gptneox.py - --config-name togethercomputer/RedPajama-INCITE-Chat-3B-v1 - --ckpt-path model_ckpts/rp-incite-chat-3b-fintuned/checkpoint_100/ - --save-path model_ckpts/hf - --n-stages 4 - --n-layer-per-stage 8 -``` - -Then you are ready to go. diff --git a/docs/finetuning-RedPajama-3B.md b/docs/finetuning-RedPajama-3B.md index 0114033..f91bd28 100644 --- a/docs/finetuning-RedPajama-3B.md +++ b/docs/finetuning-RedPajama-3B.md @@ -1,7 +1,7 @@ # RedPajama-3B In this tutorial, you will learn how to fine-tune a base LLM on a sample of data. By the end of -the tutorial, you will have fine-tuned the RedPajama-INCITE-Base-3B model using a sample of +the tutorial, you will have fine-tuned the RedPajama-INCITE-Chat-3B model using a sample of chat data from the OIG dataset. You can adapt this tutorial to fine-tune with your own data. In order to fine-tune the RedPajama 3B models, please follow these steps: @@ -52,7 +52,7 @@ bash training/finetune_RedPajama-INCITE-Chat-3B-v1.sh # Convert to Huggingface Format -Convert to HF format. The fine-tuned model will be saved to +The fine-tuned model will be saved to ``` model_ckpts/rp-incite-chat-3b-finetuned/checkpoint_{steps} @@ -64,7 +64,28 @@ In order to use it for inference, you will need to convert it to the HuggingFace The default for n-stages used in the training script is 10 and the n-layer-per-stage is 8. ``` -python tools/convert_to_hf_gptneox.py --config-name togethercomputer/RedPajama-INCITE-Chat-3B-v1 --ckpt-path model_ckpts/rp-incite-chat-3b-fintuned/checkpoint_100/ --save-path model_ckpts/hf --n-stages 4 --n-layer-per-stage 8 +python tools/convert_to_hf_gptneox.py --config-name togethercomputer/RedPajama-INCITE-Chat-3B-v1 --ckpt-path model_ckpts/redpajama-incite-chat-3b-sample/checkpoint_10/ --save-path model_ckpts/hf --n-stages 4 --n-layer-per-stage 8 ``` -Then you are ready to go. +Then you are ready to go! You can load the model with HuggingFace and use it for inference, for example: + +```python +import torch +import transformers +from transformers import AutoTokenizer, AutoModelForCausalLM + +tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-Chat-3B-v1") +model = AutoModelForCausalLM.from_pretrained("./model_ckpts/hf", torch_dtype=torch.float16) +model = model.to('cuda:0') + +prompt = ": Who is Alan Turing?\n:" +inputs = tokenizer(prompt, return_tensors='pt').to(model.device) +input_length = inputs.input_ids.shape[1] +outputs = model.generate( + **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True +) +token = outputs.sequences[0, input_length:] +output_str = tokenizer.decode(token) +print(output_str) + +```