See Getting Started and follow the instructions for running self-play training games.
See Getting Started and follow the instructions for running the engine. See also Running Leela Chess Zero in a Chess GUI for instructions on various popular GUIs.
The Elo chart seems inflated.
- The chart is not calibrated to CCRL or any other common list. Instead it sets random play = 0 Elo.
- The different points are calculated from self-play matches. Self-play tends to exaggerate gains in Elo compared to gains when playing other chess engines.
Many people are keeping their own rating lists; here are some examples:
- LCZ vs Stockfish
- LCZ CCRL Estimate
- CCLS Rating for LCZ from these gauntlet results
- LCZ Basic Checkmates
- LCZ vs SF Time Handicap
Here are links to various tactical test positions people are tracking:
- Several people run Lc0 on lichess:
- Some people stream test matches against other engines or itself frequently, notable streams include:
- See competition games against other engines -- LC0 played against other strong engines in TCEC Season 12, Division 4. Future seasons will likely feature Leela again, and can be viewed live at the TCEC website and on Twitch.
- See recent test match games - Click on the first row, first column, then pick a game. These games are played between recent versions of the engine to measure progress. They are blitz games played with 800 playouts (around 1 second) per move.
- See recent self-play training games - Scroll to "Active Users", pick someone, then pick a game. These games are how Leela Chess Zero trains herself. They are played with extra randomness turned on so it can discover new good (and bad) moves. This means the quality of these games is lower than the match games.
Links to PGNs can be placed here by editing this Wiki and linking them below.
| Opponent | Time Control | ID | User | Link |
|---|---|---|---|---|
| Scorpio 2.8 | Various | 240 | Edosani | http://s000.tinyupload.com/index.php?file_id=71717869477580896599 |
Like all other chess (or Go) engines, Leela maintains a tree of potential future moves and game states. Each potential game state is called a node in the tree, and each node has a list of child nodes corresponding to the potential moves for that board position. Nodes are first created with an estimated win value for that position, as well as a list of potential continuing moves to consider (called the policy for that position). Traditional chess engines use win-valuation and policy-generation systems finely crafted by humans; Leela instead uses its neural network, trained without human knowledge, for both win valuation and policy generation. Then, by some means or another, the engine expands the tree to get a better understanding of the root node, the current on-the-board position.
In the tree search algorithm used by Leela (see below), new nodes are evaluated by doing what's called a playout: start from the root node (the current, actually-on-the-board position), pick the most promising move according to the neural net's policy, and repeat all the way down the tree until you reach a board state that hasn't been examined yet; then expand that new node by using the neural network to create a first estimate of its win value (and a policy of potential continuing moves), and use that new information to update every node on the path back up to the root. So, for Leela, one playout adds exactly one new node to the tree, and every node already in the tree was first created by some previous playout. One playout, one new node*. When a move is actually played on the board, the child nodes corresponding to alternative moves are erased from the tree, and the child node of the move played becomes the new root node.
This is the same search specified by the AGZ paper, PUCT (Predictor + Upper Confidence Bound tree search). Many people call this MCTS (Monte-Carlo Tree Search), because it is very similar to the search algorithm the Go programs started using in 2006. But the PUCT used in AGZ and Lc0 does not do game rollouts (sampling playouts to a terminal game state). Other search algorithms are under consideration on the Githubs of both Leela Go (#860, #883) and Leela Chess, but there is no real consensus yet that anything else is demonstrably better than PUCT. This is an active research topic at the intersection of AI and game theory.
* (Technical exception: a terminal node (checkmate or various draws) can't be expanded, so when a playout hits a terminal node, the tree doesn't grow, however "playouts" and "visits" are incremented as if the tree had grown anyways.)
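The playout loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Lc0's actual C++ implementation: `Node`, `select_child`, `playout`, the `c_puct` constant, and the dummy `evaluate` function are all invented names, and side-to-move sign flips on the backed-up value are ignored for brevity.

```python
import math

class Node:
    """One board state in the search tree (simplified sketch)."""
    def __init__(self, prior):
        self.prior = prior        # policy probability from the net (N in the output)
        self.visits = 0           # playouts that have traversed this node
        self.value_sum = 0.0      # running sum of backed-up win values
        self.children = {}        # move -> Node

    def q(self):
        # Average expected value of all searches through this node (V in the output)
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    """PUCT selection: balance high average value against high-prior,
    rarely visited moves."""
    total = math.sqrt(node.visits)
    def score(child):
        return child.q() + c_puct * child.prior * total / (1 + child.visits)
    return max(node.children.items(), key=lambda mc: score(mc[1]))

def playout(root, evaluate):
    """One playout: descend to an unexpanded node, expand it with the
    network, then back the value up to the root. Grows the tree by
    exactly one expansion (ignoring the terminal-node exception)."""
    path = [root]
    node = root
    while node.children:
        _move, node = select_child(node)
        path.append(node)
    value, policy = evaluate(node)       # network: (win value, move priors)
    for move, prior in policy.items():   # expansion: create child nodes
        node.children[move] = Node(prior)
    for n in path:                       # backpropagation along the path
        n.visits += 1
        n.value_sum += value
```

With a constant evaluation function, visit counts settle roughly in proportion to the priors, so the highest-prior move accumulates the most visits.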
- Visits: Total number of playouts that have traversed this node. Since one playout = one new node, this is equivalent to the total size of the tree below this node.
- Playouts: In the debug output and engine options, "playouts" means the number of playouts that started from this node, which excludes playouts that started from higher nodes (previous moves) that happened to traverse this node. This meaning is an unfortunate overload of the word "playout", but is already thoroughly embedded into the codebase and documentation. (Another way to think about it: "visits" includes the nodes reused from prior moves, while "playouts" only counts the nodes added during the current move)
- V: Average expected value of all searches for this move (not to be confused with the neural network's original estimate of the value of this move)
- N: Neural network's original, raw policy output (probability this is the best move)
- PV: Principal Variation
- Depth: An approximation, computed as `log(float(m_nodes)) / log(1.8)` (improvements TBD)
- Nodes: Nodes with at least 1 visit. (Note the actual tree `m_nodes` is one level deeper than you would expect, because `m_nodes` also includes children of nodes with only 1 visit.)
- NPS: Nodes per second, including NNCache hits, but excluding nodes carried over from tree reuse. In other words, total playouts generated during this move divided by time
- Score cp: formulaically converted from the average expected win value of all searches from the root (see below).
- Time: Thinking time in milliseconds
- Bestmove: Picks the move with the most visits. The `randomize` or `tempdecay` options allow picking lower moves for variety.
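The Depth figure in the list above is not measured; it is derived from the node count alone, with 1.8 acting as an assumed effective branching factor. A minimal sketch of that approximation (the function name `approx_depth` is invented for illustration):

```python
import math

def approx_depth(m_nodes):
    """Approximate search depth from the visited-node count.

    Mirrors the log(float(m_nodes)) / log(1.8) formula described above:
    a tree with effective branching factor ~1.8 containing m_nodes nodes
    is roughly this many plies deep.
    """
    return math.log(float(m_nodes)) / math.log(1.8)
```

By construction, `approx_depth(1.8 ** d)` returns exactly `d`, and the reported depth grows only logarithmically as the tree gets bigger.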
Example debug output
```
Move Visits Avg_Value Policy Principal Variation
info string f3 -> 23 (V: 46.59%) (N: 0.63%) PV: f3 e5 c4 Nf6 Nc3 c6 d4 exd4
info string g4 -> 32 (V: 47.29%) (N: 0.74%) PV: g4 d5 h3 e5 d3 h5 g5 Nc6 Bg2 Be6
info string Na3 -> 51 (V: 49.10%) (N: 0.88%) PV: Na3 e5 e4 Nf6 Nc4 Nxe4 Nxe5 d6 Nef3
info string f4 -> 55 (V: 49.12%) (N: 0.85%) PV: f4 Nf6 d3 d5 Nf3 c5 Nc3 Nc6
info string Nh3 -> 60 (V: 49.97%) (N: 0.75%) PV: Nh3 e5 e3 d5 d4 exd4 exd4 Nf6 Nf4 Bd6 Nc3
info string h4 -> 78 (V: 50.00%) (N: 0.98%) PV: h4 e5 e4 Nf6 Nc3 d5 exd5 Nxd5 Nf3 Nc6 Bb5 Nxc3
info string b3 -> 153 (V: 50.97%) (N: 1.27%) PV: b3 e5 Bb2 Nc6 Nf3 e4 Nd4 Nxd4 Bxd4 d5 c4 dxc4 e3 cxb3
info string g3 -> 160 (V: 50.75%) (N: 1.48%) PV: g3 e5 e4 Nf6 Nc3 d5 exd5 Nxd5 Bg2 Nxc3 bxc3 Nc6
info string a4 -> 169 (V: 50.84%) (N: 1.56%) PV: a4 e5 e4 Nf6 Nc3 Bb4 Nf3 Bxc3 dxc3 Nc6 Bd3
info string c3 -> 176 (V: 50.53%) (N: 1.78%) PV: c3 e5 d4 e4 c4 c6 Bf4 d5 e3 Nf6 Nc3 Be7
info string h3 -> 308 (V: 51.54%) (N: 1.87%) PV: h3 e5 e3 d5 d4 e4 c4 c6 cxd5 cxd5 Nc3 Nf6 Nge2 Bd6
info string d3 -> 329 (V: 51.38%) (N: 2.22%) PV: d3 d5 Nf3 Nf6 Nc3 c5 e4 d4 Nb1 Nc6
info string Nc3 -> 518 (V: 51.60%) (N: 3.10%) PV: Nc3 d5 d4 Nf6 Bg5 e6 Bxf6 Qxf6 e4 Bb4 exd5 exd5
info string a3 -> 1957 (V: 52.84%) (N: 2.37%) PV: a3 e5 c4 c6 e3 d5 d4 exd4 exd4 Nf6 Nf3 Bd6 Bd3 dxc4 Bxc4 O-O
info string d4 -> 2332 (V: 52.39%) (N: 6.33%) PV: d4 Nf6 c4 e6 Nc3 d5 e3 c5 cxd5 exd5 Nf3 cxd4 exd4 Bd6 Bd3
info string b4 -> 2680 (V: 52.88%) (N: 1.81%) PV: b4 e5 a3 d5 Bb2 Bd6 Nf3 Nd7 e3 Ngf6 c4 dxc4 Bxc4 e4 Ng5 Ne5
info string Nf3 -> 2927 (V: 52.35%) (N: 8.27%) PV: Nf3 Nf6 c4 e6 Nc3 d5 cxd5 exd5 d4 Bd6 Bg5 c6 Bxf6 Qxf6 e4
info string c4 -> 4216 (V: 52.59%) (N: 8.53%) PV: c4 e5 e3 Nf6 Nc3 Bb4 Nd5 Nxd5 cxd5 O-O a3 Be7 d4 exd4 exd4 c6 Bd3 cxd5 Nf3
info string e3 -> 7130 (V: 52.81%) (N: 6.47%) PV: e3 Nf6 d4 e6 c4 d5 Nf3 c5 cxd5 exd5 Bb5+ Nc6 Ne5 Qc7 Qa4 cxd4 exd4
info string e4 -> 65436 (V: 52.95%) (N: 48.08%) PV: e4 e6 d4 d5 e5 c5 c3 Qa5 a3 cxd4 b4 Qc7 cxd4 f6 Nf3 fxe5 dxe5 Nc6 Bb2 Nh6 Bd3 Nf7 O-O
```
(Note the list is in reverse visits order since Arena GUI flips it again.)
Final move information
```
info depth 22 nodes 17749 nps 4205 score cp -15 time 4220 pv e6 d4 d5 e5 c5 c3 Qa5 dxc5 Qc7 Nf3 Bxc5 b4 Bb6 a4 a6 a5 Ba7
bestmove e7e6
```
Leela Chess Zero uses a winrate score in the range [0, 1]. The winrate is actually an expected score, including draws. This expected score is converted to a traditional centipawn (cp) eval. Release v0.7 uses the formula `cp = 290.680623072 * tan(3.096181612 * (expected_score - 0.5))`. Future releases will tune this formula to better match traditional cp evals.
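As a sketch, the v0.7 conversion can be written directly from the formula above (the function name `winrate_to_cp` is ours, not from the codebase):

```python
import math

def winrate_to_cp(expected_score):
    """Map Leela's expected score in [0, 1] onto a centipawn scale.

    Uses the release v0.7 formula quoted above. The tan() shape makes
    the cp value grow steeply as the score approaches a certain win
    (1.0) or a certain loss (0.0), while staying near-linear around 0.5.
    """
    return 290.680623072 * math.tan(3.096181612 * (expected_score - 0.5))
```

An even expected score (0.5) maps to 0 cp, and the mapping is symmetric: `winrate_to_cp(1 - s) == -winrate_to_cp(s)`.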