Another book with unbalanced human openings #39

vondele · 2023-10-21T15:35:19Z

A new book derived from Lichess games, with a model draw rate between 48% and 52%.

It attempts to address the following points, relative to the currently used book:

* about 10x larger (2.6M positions), i.e. more variety while testing on fishtest, with no repeated openings within any single test played
* both White and Black advantages, around ±1.0
* positions at all game plies between 1 and 16

The construction process involved:

1) Parsing all 15B Lichess games in the database https://database.lichess.org/ for the period Jan - Sept 2023, and extracting from these the popular positions, i.e. those seen at least twice within the first 16 plies played, exploring each newly added game to at most 8 previously unseen plies.

```
$ ./fastpopular --dir /mnt/md0/chess/lichessgames/2023/ --minCount 2 --stopEarly --countStopEarly 8 --maxPlies 16 --concurrency 9 -o popular_Lichess_JanSept_maxPlies16_stopEarly8.epd
Looking for pgn files in /mnt/md0/chess/lichessgames/2023/
Found 9 .pgn(.gz) files, creating 9 chunks for processing.
Processed 9 files
Retained 296993424 positions from 1127228493 unique visited in 15251265926 games.
Total time for processing: 7374.5 s
```

fastpopular is available at https://github.com/vondele/fastpopular
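The selection logic of step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the fastpopular implementation: each game is assumed to be already reduced to a list of position keys (one per ply, e.g. FEN strings), the function name is hypothetical, and the stop-early rule (stop once a game has contributed `count_stop_early` previously unseen positions) is one reading of the options shown above.

```python
from collections import Counter


def popular_positions(games, min_count=2, max_plies=16, count_stop_early=8):
    """Count positions over all games and keep those seen >= min_count times.

    games: iterable of lists of hashable position keys, one key per ply.
    Assumption: a game stops being explored once it would add more than
    count_stop_early positions that were not seen in earlier games.
    """
    counts = Counter()
    for positions in games:
        unseen = 0
        for key in positions[:max_plies]:
            if counts[key] == 0:  # Counter lookup does not insert the key
                unseen += 1
                if unseen > count_stop_early:
                    break
            counts[key] += 1
    return {key: n for key, n in counts.items() if n >= min_count}
```

Because the stop-early rule depends on which positions were seen before, the result of such a scheme can depend on the order in which games are processed.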

2) Scoring all these 296M positions with a modified Stockfish, based on master, that analyses each position up to depth 24 for as long as the predicted draw rate (UCI_ShowWDL) stays near 50%.
Positions whose draw rate is already very different from 50% at low depth are analysed to low depth only.
That modified branch is available at https://github.com/vondele/Stockfish/tree/createUHO
From these scored positions, those with a draw rate in the range 48 - 52% are extracted.

```
./stockfish.createUHO bench 128 1 24 popular_Lichess_JanSept_maxPlies16_stopEarly8.epd > popular_Lichess_JanSept_maxPlies16_stopEarly8_scored.epd
awk '{if ($15>480 && $15<520) print $0}' popular_Lichess_JanSept_maxPlies16_stopEarly8_scored.epd | cut -d';' -f1 | sed "s/ $//g" > UHO_Lichess_4852_v1.epd
```
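As a rough Python sketch of the two ideas in this step: the first function illustrates the adaptive-depth scoring, the second mirrors the awk/cut/sed filter. Neither is taken from the createUHO branch. `draw_rate_at_depth` is a hypothetical callback standing in for an engine query, `tolerance` is an invented threshold (the actual criterion for "near 50%" is not stated here), and the 15th whitespace-separated field is assumed to be the draw rate in per mille, as the thresholds 480 and 520 in the awk command suggest.

```python
def analysed_depth(draw_rate_at_depth, max_depth=24, target=500, tolerance=100):
    """Deepen up to max_depth, stopping early once the predicted draw rate
    (per mille) differs from target by more than tolerance.

    draw_rate_at_depth and tolerance are hypothetical, for illustration only.
    Returns (last depth analysed, draw rate at that depth).
    """
    depth, draw = 0, None
    for depth in range(1, max_depth + 1):
        draw = draw_rate_at_depth(depth)
        if abs(draw - target) > tolerance:
            break
    return depth, draw


def keep_position(scored_line, lo=480, hi=520):
    """Return the text before the first ';' if field 15 lies strictly
    between lo and hi, else None (same selection as the awk command)."""
    fields = scored_line.split()
    if len(fields) < 15:
        return None
    try:
        draw = float(fields[14])  # awk's $15 is index 14 here
    except ValueError:
        return None
    if not (lo < draw < hi):
        return None
    # cut -d';' -f1, then drop trailing whitespace (sed strips one space)
    return scored_line.split(";", 1)[0].rstrip()
```

The early exit is what keeps the cost manageable: positions that are clearly decided or clearly drawish at low depth never reach depth 24.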

Short initial testing at STC shows the draw rate is, as expected, close to 50% for self-play games:

```
Score of master1 vs master2: 1048 - 1031 - 1921 [] 4000
Elo difference: 1.48 +/- 7.75, LOS: 64.54 %, DrawRatio: 48.02 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:        21    473   1026    462     18
```
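The reported figures can be cross-checked from the raw counts. The sketch below assumes the standard logistic Elo formula and the usual pair scoring for the pentanomial columns (2, 1.5, 1, 0.5 and 0 points per game pair); the variable names are illustrative only.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference for an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


# Win / loss / draw counts from the "Score of master1 vs master2" line.
wins, losses, draws = 1048, 1031, 1921
games = wins + losses + draws
score = (wins + 0.5 * draws) / games
draw_ratio = 100.0 * draws / games
elo = elo_from_score(score)

# Pentanomial distribution over game pairs, in the column order shown above.
ptnml = {"WW": 21, "WD": 473, "DD/WL": 1026, "LD": 462, "LL": 18}
pair_points = {"WW": 2.0, "WD": 1.5, "DD/WL": 1.0, "LD": 0.5, "LL": 0.0}
pairs = sum(ptnml.values())
ptnml_score = sum(ptnml[k] * pair_points[k] for k in ptnml) / (2 * pairs)

print(f"games={games} pairs={pairs} draw ratio={draw_ratio:.2f}% elo={elo:.2f}")
```

Both ways of computing the score agree, and the 2000 game pairs account for all 4000 games.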


robertnurnberg · 2024-03-01T07:15:34Z

> positions at all game plies between 1 and 16

Just a tiny correction. The earliest game ply I could find is 2, e.g. for the position rnbqkbnr/p1pppppp/8/1p6/3P4/8/PPP1PPPP/RNBQKBNR w KQkq - 0 2.

Edit: Here is the complete list of frequencies.

game ply  2: 5 times
game ply  3: 47 times
game ply  4: 642 times
game ply  5: 3454 times
game ply  6: 12996 times
game ply  7: 29984 times
game ply  8: 60510 times
game ply  9: 99575 times
game ply 10: 156793 times
game ply 11: 217136 times
game ply 12: 288550 times
game ply 13: 353868 times
game ply 14: 420058 times
game ply 15: 470702 times
game ply 16: 517716 times
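For reference, a small sketch of how such a tally could be checked. It assumes the game ply is derived from the FEN side-to-move and fullmove fields (the same convention that gives ply 2 for the example position above), which requires positions that carry move counters; whether the frequencies were actually computed this way is an assumption. The function name is illustrative.

```python
def game_ply(fen):
    """Number of plies played, from the side-to-move and fullmove fields."""
    fields = fen.split()
    side_to_move, fullmove = fields[1], int(fields[5])
    return max(2 * (fullmove - 1), 0) + (1 if side_to_move == "b" else 0)


# Frequencies as listed above, keyed by game ply.
frequencies = {
    2: 5, 3: 47, 4: 642, 5: 3454, 6: 12996, 7: 29984, 8: 60510,
    9: 99575, 10: 156793, 11: 217136, 12: 288550, 13: 353868,
    14: 420058, 15: 470702, 16: 517716,
}
total = sum(frequencies.values())  # 2632036, i.e. the "2.6M" positions
```

The total is consistent with the book size of about 2.6M positions quoted in the description.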

vondele merged commit 426eca4 into official-stockfish:master Oct 21, 2023

robertnurnberg mentioned this pull request Mar 15, 2024

refinement of UHO Lichess book #41

Closed
