Notebook to create backgammon evaluation data for openai/evals

It turns out gpt3.5 nor gpt4 are particularly good at playing backgammon - well in fact they are terrible. This notebook creates evaluation data using the nice backgammon library from https://github.com/softwerks/backgammon.

There are basically 2 main questions gpt needs to answer, given a specific backgammon game state and a dice roll:

can the opponent be hit with the provided dice roll?
is a the provided play an illegal play?

For both cases, "easy" board states are constructed and eventually converted into a prompt and wrapped as an evaluation compotible with https://github.com/openai/evals.

can hit?

Example for a board state where the answer to "can backgammonGPT hit" is True.

                 Position ID: aOfgBwA2ZvABMA
                 Match ID   : cIgRAAAAAAAA
 +13-14-15-16-17-18------19-20-21-22-23-24-+
 | X           O    |   | O  O  O        X |
 | X           O    |   | O  O           X |
 | X           O    |   | O                |
 | X                |   |                  |
 | X                |   |                  |
v|                  |BAR|                  |
 | 6                |   |                  |
 | O                |   |                  |
 | O                |   |                  |
 | O           X    |   | X        X  X    |
 | O           X    |   | X        X  X    |
 +12-11-10--9--8--7-------6--5--4--3--2--1-+

dice: (3, 4)

these plays hit an opponents checker:
24/21 6/2
24/21 8/4
24/21 13/9

is illegal move?

Example for a board state where the answer to is "24/19 8/5" illegal is also True.

                 Position ID: mOfEATDI58EBMA
                 Match ID   : cIgOAAAAAAAA
 +13-14-15-16-17-18------19-20-21-22-23-24-+
 | X     O     O    |   | O     O        X |
 | X           O    |   | O     O        X |
 | X           O    |   | O                |
 |                  |   | O                |
 |                  |   |                  |
v|                  |BAR|                  |
 |                  |   | X                |
 |             X    |   | X                |
 | O           X    |   | X                |
 | O           X    |   | X              O |
 | O           X    |   | X     X        O |
 +12-11-10--9--8--7-------6--5--4--3--2--1-+

dice: (5, 3)

this ia an illegal move:
24/19 8/5

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
generate_openai_eval_samples.ipynb		generate_openai_eval_samples.ipynb
generate_openai_eval_samples_export.ipynb		generate_openai_eval_samples_export.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Notebook to create backgammon evaluation data for openai/evals

can hit?

is illegal move?

About

Releases

Packages

Languages

License

bakebrain/evals_backgammon

Folders and files

Latest commit

History

Repository files navigation

Notebook to create backgammon evaluation data for openai/evals

can hit?

is illegal move?

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages