Unable to reproduce the performance on the llamaQuestions dataset as reported in the paper. #159

Open
1 task done
UltraEval opened this issue Nov 22, 2024 · 0 comments
Labels
question Further information is requested

Due diligence

  • I have done my due diligence in trying to find the answer myself.

Topic

The paper

Question

Unable to Reproduce Moshi's Performance on llamaQuestions Dataset

I attempted to replicate Moshi's performance on the llamaQuestions dataset as reported in the paper, but achieved only 13%.

Testing Methodology:

I used the inference method detailed at:
https://github.com/kyutai-labs/moshi/blob/main/moshi/README.md#api

Request for Clarification:

Could you please release the test script for LlamaQuestions?

Observation:

If the script used is https://github.com/kyutai-labs/moshi/blob/main/scripts/moshi_benchmark.py, it does raise performance to 55%. However, it seems unusual to run inference 10 times for the same question.
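To make the gap between the two numbers concrete, here is a minimal sketch of how single-sample scoring and any-of-k scoring can diverge on the same outputs. It assumes, without confirmation from the paper or from moshi_benchmark.py, that a question counts as correct when any of its sampled responses contains the reference answer. The function names, the substring matching rule, and the toy data are all hypothetical and are not taken from the Moshi repository.

```python
from typing import List


def is_correct(response: str, reference: str) -> bool:
    # Hypothetical scoring rule: case-insensitive substring match.
    return reference.strip().lower() in response.strip().lower()


def single_sample_accuracy(responses: List[List[str]], references: List[str]) -> float:
    # Score only the first sampled response for each question.
    hits = sum(is_correct(samples[0], ref) for samples, ref in zip(responses, references))
    return hits / len(references)


def any_of_k_accuracy(responses: List[List[str]], references: List[str]) -> float:
    # Count a question as correct if at least one of its samples is correct.
    hits = sum(
        any(is_correct(sample, ref) for sample in samples)
        for samples, ref in zip(responses, references)
    )
    return hits / len(references)


# Toy data for illustration only (not real Moshi outputs):
# two questions, two sampled responses each.
toy_references = ["Paris", "Jupiter"]
toy_responses = [
    ["Paris", "I think Lyon"],
    ["no idea", "It is Jupiter"],
]

single = single_sample_accuracy(toy_responses, toy_references)
any_k = any_of_k_accuracy(toy_responses, toy_references)
print(f"single-sample: {single:.2f}, any-of-k: {any_k:.2f}")
```

Any-of-k accuracy can never be lower than single-sample accuracy on the same outputs, so knowing which of the two the paper reports matters when comparing against a single-pass result.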

Attached is my testing process file.

image

moshi_llamaQ.zip

UltraEval added the question label on Nov 22, 2024.