Replies: 5 comments 13 replies
-
Do you mean https://huggingface.co/docs/autotrain/index ? I haven't tried it, but autotrain appears to be a hosted end-to-end solution for training/finetuning various classes of models. It is almost certainly more powerful and faster, and it is paid per hour. slowllama is a standalone tool/library which can be used locally on somewhat slower devices (e.g. a Mac mini). The original use case I was thinking about was:
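To make the contrast concrete, here is a minimal sketch of the LoRA-style idea that lets a tool like slowllama finetune locally: the large base weight stays frozen (and can live on disk/CPU), while only two small low-rank factors are trained. All names and shapes here are illustrative assumptions, not slowllama's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 4.0  # model dim, LoRA rank, scaling (illustrative values)

W = rng.standard_normal((d, d))         # frozen base weight, never updated
A = rng.standard_normal((r, d)) * 0.01  # trainable low-rank factor
B = np.zeros((d, r))                    # zero-init: the adapter starts as a no-op

def forward(x):
    # Effective weight would be W + (alpha / r) * B @ A, but we never
    # materialize the full update; the low-rank path is applied separately.
    return x @ W.T + (x @ A.T) @ B.T * (alpha / r)

x = rng.standard_normal((1, d))
# With B zero-initialized, the adapter contributes nothing at first:
assert np.allclose(forward(x), x @ W.T)
```

Only A and B (2·r·d values instead of d·d) need gradients, which is why this can run on modest hardware at the cost of speed.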
-
Oh, sorry, I forgot to update the instructions in one place: the configuration now lives in the conf* files rather than being passed as an argument. Please try running it like
-
Thank you, the command runs now. But for you the 20-iteration finetuning finishes in 20 minutes, while for me it has been running for an hour and then fails, although checkpoints for the first few iterations do appear in the out directory:

slowllama % python3 finetune.py
slowllama % ls -al out/

Also, these trained checkpoints are not giving the expected output; I trained the 7B model:

slowllama % python3 test_gen.py ./out/state_dict_10.pth

Please help.
-
I've just tried training with the following overrides:

config:

test_gen:

I ran finetune for 20 iterations, but used checkpoint 10 (where we are less likely to have overfit):

This was the output:

Cubestat reports the following metrics:
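Picking an intermediate checkpoint instead of the last one is a simple guard against overfitting. A hedged sketch of that selection, assuming you track a validation loss per saved iteration (the loss values below are invented; only the out/state_dict_N.pth naming is taken from the thread):

```python
# Map of saved iteration -> validation loss (illustrative numbers).
# In practice you would compute these by running the model on held-out data.
val_losses = {10: 0.82, 15: 0.79, 20: 0.85}

# Choose the checkpoint with the lowest validation loss, not the latest one.
best_iter = min(val_losses, key=val_losses.get)
checkpoint = f"out/state_dict_{best_iter}.pth"
assert checkpoint == "out/state_dict_15.pth"
```

With only a train loss available, an earlier checkpoint (like 10 of 20 here) is a reasonable heuristic for the same reason: later iterations fit the training set more tightly.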
-
Thank you! Will this method also work to finetune llama2 on a large database of questions and answers about a topic?
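In principle, yes, if the Q&A pairs are flattened into plain training text first. Here is a hedged sketch of that preprocessing step; the prompt template and field names are assumptions for illustration, not slowllama's required input format.

```python
# Toy Q&A database (illustrative).
qa_pairs = [
    {"q": "What does slowllama do?", "a": "It finetunes llama2 locally."},
    {"q": "Is it fast?", "a": "No, it trades speed for low memory use."},
]

def to_sample(pair):
    # One flat text sample per pair; the template is a common convention,
    # not something slowllama prescribes.
    return f"Question: {pair['q']}\nAnswer: {pair['a']}\n"

training_text = "\n".join(to_sample(p) for p in qa_pairs)
assert training_text.startswith("Question: What does slowllama do?")
```

Whether the finetuned model answers well then depends on the usual factors: dataset size, iteration count, and stopping before it overfits, as discussed above.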
-
slowllama is quite interesting. Please elaborate: what does slowllama do that autotrain can't?