Any plans on running evals for codellama? #11

ErikBjare · 2023-08-25T08:34:10Z

I'm keeping https://github.com/ErikBjare/are-copilots-local-yet up-to-date, and would love to see some codellama numbers given it's now SOTA :)

nicoladainese96 · 2023-09-14T12:37:10Z

I would be interested in this as well. I made some attempts of my own with the Python-7B and Instruct-7B models, but if I use the same code as for Llama-2, the performance is horrible (3% and 8%, respectively). For comparison, with the exact same code, Llama-2-chat-7b gives me 11%.

smart-lty · 2024-01-09T03:52:23Z

> I would be interested in this as well. I made some attempts of my own with the Python-7B and Instruct-7B models, but if I use the same code as for Llama-2, the performance is horrible (3% and 8%, respectively). For comparison, with the exact same code, Llama-2-chat-7b gives me 11%.

I've run into the same situation. Even when I use the instructions in "core/prompts.py", codellama-7b reaches only 22.8% pass@1, still well below the number reported in the official documentation. Have you fixed this problem?
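
For anyone comparing the numbers in this thread: pass@1 figures are only comparable when they are computed the same way. Below is a minimal sketch of the unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021). Whether this repository's evaluation code uses exactly this estimator is an assumption, and the names `pass_at_k` and `mean_pass_at_k` are hypothetical helpers, not part of the repo.

```python
import math


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: number of samples generated for the problem
    c: number of those samples that passed the tests
    k: the k in pass@k (must satisfy 1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # With fewer than k failing samples, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn samples are failures)
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def mean_pass_at_k(results, k: int) -> float:
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With a single greedy sample per problem (n = 1, k = 1), this reduces to the fraction of problems solved, so differences in sampling temperature or in the number of samples can shift the reported figure.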
