Name		Name	Last commit message	Last commit date
parent directory ..
scores		scores
README.md		README.md
config.json		config.json
eval-example-data.json		eval-example-data.json
llm-instruction-eval-ollama.ipynb		llm-instruction-eval-ollama.ipynb
llm-instruction-eval-openai.ipynb		llm-instruction-eval-openai.ipynb
requirements-extra.txt		requirements-extra.txt

README.md

Chapter 7: Instruction Finetuning

This folder contains utility code that can be used for model evaluation.

Evaluating Instruction Responses Using the OpenAI API

The llm-instruction-eval-openai.ipynb notebook uses OpenAI's GPT-4 to evaluate responses generated by instruction finetuned models. It works with a JSON file in the following format:

{
    "instruction": "What is the atomic number of helium?",
    "input": "",
    "output": "The atomic number of helium is 2.",               # <-- The target given in the test set
    "model 1 response": "\nThe atomic number of helium is 2.0.", # <-- Response by an LLM
    "model 2 response": "\nThe atomic number of helium is 3."    # <-- Response by a 2nd LLM
},

Evaluating Instruction Responses Locally Using Ollama

The llm-instruction-eval-ollama.ipynb notebook offers an alternative to the one above, utilizing a locally downloaded Llama 3 model via Ollama.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

03_model-evaluation

03_model-evaluation

README.md

Chapter 7: Instruction Finetuning

Evaluating Instruction Responses Using the OpenAI API

Evaluating Instruction Responses Locally Using Ollama

Files

03_model-evaluation

Directory actions

More options

Directory actions

More options

Latest commit

History

03_model-evaluation

Folders and files

parent directory

README.md

Chapter 7: Instruction Finetuning

Evaluating Instruction Responses Using the OpenAI API

Evaluating Instruction Responses Locally Using Ollama