The critic model generates a different type of token when I use run_reward_vllm.py to generate tokens #75

Teng0828 · 2024-05-07T19:22:35Z

I want to create my own training data, and I followed the steps for creating generator training data. But when I tried to use the critic model to generate the utility (isUse) token, some preds were wrong, as shown in the picture.

Some of the preds are "retrieval" rather than "utility". I used exactly the same command as in the README file.
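
To quantify how many preds come back with the wrong token type, a quick check like the one below may help. It is only a sketch under assumptions: it assumes the preds are available as a list of strings, that utility tokens look like `[Utility:1]` through `[Utility:5]`, and that retrieval tokens contain the word "retrieval". The names `classify_pred`, `summarize_preds`, and `sample_preds` are hypothetical and not part of run_reward_vllm.py.

```python
import re
from collections import Counter

# Assumption: utility tokens are formatted as "[Utility:1]" ... "[Utility:5]".
UTILITY_PATTERN = re.compile(r"\[Utility:[1-5]\]")


def classify_pred(pred: str) -> str:
    """Label one prediction string as 'utility', 'retrieval', or 'other'."""
    if UTILITY_PATTERN.search(pred):
        return "utility"
    # Assumption: any retrieval-type token contains the word "retrieval".
    if "retrieval" in pred.lower():
        return "retrieval"
    return "other"


def summarize_preds(preds):
    """Count how many predictions fall into each token type."""
    return Counter(classify_pred(p) for p in preds)


# Hypothetical sample standing in for the critic model's outputs.
sample_preds = ["[Utility:5]", "[Retrieval]", "[Utility:2]", "[No Retrieval]"]
counts = summarize_preds(sample_preds)
print(counts)
```

Any pred not labelled "utility" could then be dropped or regenerated before it goes into the training data.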

fate-ubw · 2024-06-18T12:59:08Z

Have you figured out this problem? I ran into the same issue.
