-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Can I get a MT-bench evaluation code for reproduction of acceleration? #36
Comments
MT bench scripts uploaded. See applications/run_mtbench.sh for examples. |
llama2-7b-10-24.json
|
hi @peoplekillerS , I just ran the script to reproduce your problem. However, I did not observe your situation. This is an answer file I just obtained. I was wondering which version of code you are using and if you have modified the code, as this is an uncommon situation. |
Thanks to the author for the reply!!! Because I am using llama2-7b-chat-hf for testing, I am a little late in replying to you, sorry! In fact, I got the same results as you, but you can take a closer look at the json file you sent me above. If you search for 'Provide a variety of craft', you will find that basically every line has this answer. , is this a normal phenomenon? -------------------------------------------------- (Dividing line)---------------------------------------------- ---------------I just use the code of the main branch. The question of mt_bench is downloaded from the link of your run_mtbench.sh file. No code has been changed. The command is the one in the picture above. Commands, differences: 1. Use llama2-7b-chat instead of llama2-7b-chat-hf. 2. The --use-pp parameter is 1, because every time I set it to 0, An error will be reported: NotImplementedError: Cannot copy out of meta tensor; no data! (I am using RTX 3090) The above is the difference. I would like to ask the author if he would consider creating an eval_mtbench version of llama2-7b-chat?And do you know how to solve the problem that occurs when --use pp is set to 0? |
It is a normal phenonmenon to have the same line in every answer. We use fastchat to generate a conversation template and this line https://github.com/lm-sys/FastChat/blob/6ff8505ec80fc4b04d668f65d229f4f58bc449e0/fastchat/conversation.py#L365 is included in every prompt. |
Could you provide a more detailed error report you encountered when you set use-pp to 0? It may be a version mismatch. You can use the latest code and install the latest dependencies. I guess the llama2-7b-chat format is not compatible with the huggingface format. Because we use transformers lib, a model weight compatible with transformers lib (i.e., llama2-7b-chat-hf) is needed. |
Yeah, using llama2-7b-chat-hf should be correct. llama2-7b-chat model is not compatible with transformers. I guess this is the problem. |
Will you consider a version of eval_mtbench for llama2-7b-chat? |
Currently, I am not considering supporting llama2-7b-chat. From its website, we can find that we need to use https://github.com/facebookresearch/llama to support this model weights. While I plan to minimize the maintenance efforts to support most models and, supporting huggingface's transformers is the simplest way. |
Can I get a MT-bench evaluation code for reproduction of acceleration?
The text was updated successfully, but these errors were encountered: