### Reproduction

The following two methods give inconsistent scores on the same batch of data.

Method 1: serve a trained reward model locally:

```bash
API_PORT=8001 llamafactory-cli api --model_name_or_path xxx --template qwen --stage rm
```

then fetch scores like this:

```python
import json
import requests

def make_text(instruct, output):
    prompt = "You are a helpful assistant."
    messages = [
        {"role": "system", "content": prompt},
        {"role": "user", "content": instruct},
        {"role": "assistant", "content": output},
    ]
    # tokenizer is the Qwen tokenizer, loaded elsewhere
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    return text

def get_score(instruct, output):
    text = make_text(instruct, output)
    data = {
        "model": "qwen2.5_3B_style_rm_3k",
        "messages": [text],
    }
    r = requests.post(
        "http://127.0.0.1:8001/v1/score/evaluation", data=json.dumps(data)
    )
    return json.loads(r.text)["scores"][0]
```

Method 2: run `llamafactory-cli train xxx.yaml` with the following yaml:

```yaml
model_name_or_path: xxx

stage: rm
do_train: false
do_eval: false
do_predict: true

eval_dataset: xxx
template: qwen
cutoff_len: 1024
max_samples: 10000
overwrite_cache: true
preprocessing_num_workers: 16

output_dir: xxx

per_device_eval_batch_size: 1
```

### Expected behavior

Method 1 yields noticeably lower scores, and the chosen response outscores the rejected one in only about 60% of pairs. Method 2 yields higher scores, with chosen > rejected in 100% of pairs. I would like to know whether my deployment is wrong or the evaluation is wrong.

### Others

_No response_
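The 60% vs. 100% figures above are a pairwise win rate: the fraction of preference pairs where the chosen response scores higher than the rejected one. As a minimal sketch (assuming you have already collected the two score lists, e.g. via repeated `get_score` calls; the function name and sample values here are hypothetical), it can be computed like this:

```python
def chosen_win_rate(chosen_scores, rejected_scores):
    """Fraction of pairs where the chosen response outscores the rejected one."""
    assert len(chosen_scores) == len(rejected_scores) and chosen_scores
    wins = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    return wins / len(chosen_scores)

# Hypothetical scores for 5 chosen/rejected pairs
print(chosen_win_rate([2.1, 0.5, 1.7, -0.2, 3.0],
                      [1.0, 0.9, 0.3, -1.1, 2.2]))  # -> 0.8
```

Computing this rate from the raw per-sample scores of both methods on the same pairs makes the discrepancy easy to pin down sample by sample.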
### System Info

llamafactory version: 0.8.4.dev0