Jeopardy Example Script #1168

Merged
merged 11 commits into from
Apr 28, 2023


Contributor

@CRD716 CRD716 commented Apr 25, 2023

Closes #1163

This is pretty much a straight port of aigoopy/llm-jeopardy/
I'm leaving it as a draft since it's still missing a lot of features, and I will keep working on it to make it more usable.

Contributor Author

CRD716 commented Apr 27, 2023

All that's left is the readme.

@CRD716 CRD716 marked this pull request as ready for review April 27, 2023 05:52
Collaborator

SlyEcho commented Apr 27, 2023

I don't think this is correct:

1,The Oscars,Who is John Williams?,Which actor Born in 1932 was the son of a percussionist in the CBS radio orchestra has been nominated for 53 Oscars?

The question should be in the answer format, like so:

1,The Oscars,Who is John Williams?,Born in 1932 & the son of a percussionist in the CBS radio orchestra, he's been nominated for 53 Oscars
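Note that the clue in this row contains a comma of its own, so splitting on every comma would yield five fields rather than four. Below is a minimal Python sketch of one way to read such a row. It assumes the four-column layout shown above (index, category, expected response, clue) and that only the last column may contain unquoted commas. The `Row` type and `parse_row` function are hypothetical names for illustration, not part of this PR.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One line of the question file (hypothetical field names)."""
    index: int
    category: str
    expected: str
    clue: str


def parse_row(line: str) -> Row:
    # Split on the first three commas only, so commas inside the
    # clue (the last column) are preserved. This is an assumption
    # about the layout, not a general CSV parser.
    index, category, expected, clue = line.rstrip("\n").split(",", 3)
    return Row(int(index), category.strip(), expected.strip(), clue.strip())


row = parse_row(
    "1,The Oscars,Who is John Williams?,"
    "Born in 1932 & the son of a percussionist in the CBS radio orchestra, "
    "he's been nominated for 53 Oscars"
)
print(row.category)
print(row.clue)
```

If other columns could also contain commas, the file would need proper CSV quoting and a real CSV reader instead.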

Contributor Author

CRD716 commented Apr 27, 2023

> I don't think this is correct:
>
> 1,The Oscars,Who is John Williams?,Which actor Born in 1932 was the son of a percussionist in the CBS radio orchestra has been nominated for 53 Oscars?
>
> The question should be in the answer format, like so:
>
> 1,The Oscars,Who is John Williams?,Born in 1932 & the son of a percussionist in the CBS radio orchestra, he's been nominated for 53 Oscars

I haven't changed the prompt question from the original repository. The point is to see if it can bring up facts, not if it can play Jeopardy as it is on the show.

Collaborator

SlyEcho commented Apr 27, 2023

> I haven't changed the prompt question from the original repository. The point is to see if it can bring up facts, not if it can play Jeopardy as it is on the show.

Then the answer should be "John Williams" not "Who is John Williams?"
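To illustrate the difference between the two expected-answer styles, here is a rough Python sketch that reduces a Jeopardy-style response such as "Who is John Williams?" to the plain answer "John Williams". The regular expression and the `strip_question_framing` name are assumptions made for this example. It only covers a few common openers and is not how the PR's script compares answers.

```python
import re

# Hypothetical pattern: an interrogative word, a form of "to be",
# then the answer itself, with an optional trailing question mark.
_FRAMING = re.compile(
    r"^\s*(?:who|what|where|when)\s+(?:is|are|was|were)\s+(.+?)\s*\?*\s*$",
    re.IGNORECASE,
)


def strip_question_framing(response: str) -> str:
    """Return the bare answer, or the input unchanged if it is not framed."""
    match = _FRAMING.match(response)
    return match.group(1) if match else response.strip()


print(strip_question_framing("Who is John Williams?"))
print(strip_question_framing("John Williams"))
```

Both calls reduce to the same string, so a comparison done after this step would not depend on which style the data file uses.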

Contributor Author

CRD716 commented Apr 27, 2023

> > I haven't changed the prompt question from the original repository. The point is to see if it can bring up facts, not if it can play Jeopardy as it is on the show.
>
> Then the answer should be "John Williams" not "Who is John Williams?"

See https://github.com/aigoopy/llm-jeopardy/blob/main/qasheet.ods and #1163. I would assume we're trying to use the same data as everyone else, so I'm not sure whether this issue is supposed to be an implementation of aigoopy's Jeopardy test or just something in a similar style. @ggerganov, which would you prefer?

Collaborator

SlyEcho commented Apr 27, 2023

The columns "Original Answer" and "Original Correct Question" in the spreadsheet are the data they used (what is the source? maybe https://j-archive.com/). Then they created "Model Prompt", where the clue has been turned into a question, and all the models are also answering in an answer format, as explained on Reddit.

But anyway, I think this test should be either question-answer or jeopardy style answer-question, but not a mix.
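As a rough illustration of the point above, the following Python sketch labels a prompt/expected-response pair as one of the two consistent styles or as a mix. The heuristic (a trailing question mark means "phrased as a question") and the `classify` function are assumptions for this example only, and the three-way labels are simply names for the cases discussed in this thread.

```python
def is_question(text: str) -> bool:
    # Crude heuristic (assumption): anything ending in "?" is a question.
    return text.strip().endswith("?")


def classify(prompt: str, expected: str) -> str:
    """Label a row by how the prompt and the expected response are phrased."""
    prompt_q = is_question(prompt)
    expected_q = is_question(expected)
    if prompt_q and not expected_q:
        return "question-answer"   # quiz style
    if expected_q and not prompt_q:
        return "answer-question"   # Jeopardy style
    return "mixed"                 # both or neither phrased as a question


# The row currently in the PR: a question prompt with a question response.
print(classify(
    "Which actor Born in 1932 was the son of a percussionist in the CBS "
    "radio orchestra has been nominated for 53 Oscars?",
    "Who is John Williams?",
))
```

A check like this could be run over the whole data file to find rows that need editing, whichever style is eventually chosen.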

If we don't change the data from the original, we could possibly evaluate a much larger dataset without having to manually edit questions.

Member

@ggerganov ggerganov left a comment


I agree with @SlyEcho - we should keep one format or the other and not mix them like they've done in the reference repo.

For now, we can merge it like this so we have the evaluation framework available, and later we can update the questions / answers.

Development

Successfully merging this pull request may close these issues.

llama.cpp + Final Jeopardy
3 participants