Alex Greason edited this page May 17, 2018 · 62 revisions

Getting Started

How can I contribute to the project?

See Getting Started and follow the instructions for running self-play training games.

How do I just run the engine?

See Getting Started and follow the instructions for running the engine. See also Running Leela Chess Zero in a Chess GUI for instructions on setting it up in various popular GUIs.


Current strength of Leela Chess Zero

The Elo chart seems inflated.

  1. The chart is not calibrated to CCRL or any other common list. Instead it sets random play = 0 Elo.
  2. The different points are calculated from self-play matches. Self-play tends to exaggerate gains in Elo compared to gains when playing other chess engines.

Where can I find Leela Chess Zero's current Elo?

Many people keep their own rating lists. Here are some examples:

Where can I track LCZero's progress on tactics?

Here are links to various tactical test positions people are tracking:

What other metrics are being tracked?

A list is being collected of all Leela-tracking spreadsheets:

Can I watch Leela Chess Zero play somewhere?

  • Several people run Lc0 on lichess:
  • Some people stream test matches against other engines or itself frequently, notable streams include:
  • See competition games against other engines -- LC0 played against other strong engines in TCEC Season 12, Division 4. Future seasons will likely feature Leela again, and can be viewed live at the TCEC website and on Twitch.
  • See recent test match games - Click on the first row, first column, then pick a game. These games are played between recent versions of the engine to measure progress. They are blitz games played with 800 playouts (around 1 second) per move.
  • See recent self-play training games - Scroll to "Active Users", pick someone, then pick a game. These games are how Leela Chess Zero trains itself. They are played with extra randomness turned on so that it can discover new good (and bad) moves. This means the quality of these games is lower than that of the match games.

Where can I find PGNs of played games?

Provided by:     Edosan
LC0 network ID:  240
Opponent:        Scorpio 2.8

Download link: http://s000.tinyupload.com/index.php?file_id=71717869477580896599

Provided by:     y_Sensei
LC0 version:     0.1-current version, TF version, cuDNN version
LC0 network ID:  13-current version
Opponent:        Stockfish 8 + 9, OpenTal 1.1, Rodent III 0.172

Download link: http://bit.ly/ys-chess

Provided by:     Edosan
LC0 network ID:  247 GPU v8 W/ TB
Opponent:        Houdini 6.03

Download link: https://lichess.org/BHgy4azy


Leela Chess Zero methods and input/output terms

Basics

Like all other chess (or Go) engines, Leela maintains a tree of potential future moves and game states. Each potential game state is called a node in the tree, and each node has a list of child nodes corresponding to the potential moves (edges) for that board position. Nodes are first created with an estimated win value for that position, as well as a list of potential continuing moves to consider (called the policy for that position). Traditional chess engines rely on a win valuation and policy generation system carefully hand-crafted by humans; Leela instead uses a neural network, trained without human knowledge, for both. Each engine then expands the tree in its own way to get a better understanding of the root node, which is the current position on the board.

Leela's search methods

In the tree search algorithm used by Leela, we evaluate new nodes by doing what's called a playout: start from the root node (the current position on the board), pick a move to explore, and repeat down the tree until we reach a game position that has not been examined yet or a position that ends the game. If possible, we expand the tree with that new position and use the neural network to create a first estimate of the win value for the position and the probabilities for continuing moves. In Leela, a policy for a node is a list of possible moves and a probability for each move. Each probability is the chance that an automatic player following the policy would make that move. When one playout has been completed, all the visited nodes, from the root down to the last node, are updated with new win values and visit counts. One playout adds at most one new node to the tree, and all nodes already in the tree were first created by a previous playout.
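The playout described above can be sketched in a few lines of Python. This is a minimal illustration, not Lc0's actual code: the `Node` class, the `evaluate` and `apply_move` callbacks, the constants `C_PUCT` and `FIRST_PLAY_VALUE`, and the exact form of the selection score (the usual PUCT form, Q plus a prior-weighted exploration term) are all assumptions made for the example. `evaluate(state)` stands in for the neural network and is assumed to return an expected score in [0, 1] for the side to move together with a move-to-probability dictionary (empty when the game is over).

```python
import math

FIRST_PLAY_VALUE = 0.5  # assumed value for a move that has never been tried
C_PUCT = 1.5            # assumed exploration constant


class Node:
    """One position in the search tree (hypothetical structure)."""

    def __init__(self):
        self.visits = 0        # playouts that passed through this node
        self.value_sum = 0.0   # summed values, seen by the player who moved into this node
        self.priors = None     # move -> probability; None until the node has been evaluated
        self.raw_value = None  # evaluator's value for the side to move, set on evaluation
        self.children = {}     # move -> Node, created lazily

    def mean_value(self):
        return self.value_sum / self.visits


def select_move(node):
    """Pick the move with the highest Q + U score (PUCT-style selection)."""
    def puct(move):
        child = node.children.get(move)
        n = child.visits if child is not None else 0
        q = child.mean_value() if n > 0 else FIRST_PLAY_VALUE
        u = C_PUCT * node.priors[move] * math.sqrt(node.visits) / (1 + n)
        return q + u
    return max(node.priors, key=puct)


def playout(root, root_state, evaluate, apply_move):
    """Run one playout: descend, add at most one node, back up the value."""
    node, state, path = root, root_state, [root]
    while node.priors:                     # node is evaluated and not terminal
        move = select_move(node)
        state = apply_move(state, move)
        if move not in node.children:
            node.children[move] = Node()   # the only node this playout can add
        node = node.children[move]
        path.append(node)
    if node.priors is None:                # first visit: ask the evaluator
        node.raw_value, node.priors = evaluate(state)
    value = node.raw_value
    for visited in reversed(path):
        value = 1.0 - value                # switch to the other player's point of view
        visited.visits += 1
        visited.value_sum += value
```

Calling `playout` repeatedly on the same root grows the tree one node at a time; the very first call only evaluates the root itself, which is why a node's visit count can be slightly larger than the number of nodes below it.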

When a move is actually played on the board, the node for the chosen move becomes the new root of the tree. The old root and the other children of that root node are erased.
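As a rough sketch of this re-rooting step, assuming a hypothetical `Node` class that keeps its children in a dictionary keyed by move (not Lc0's real data structure):

```python
class Node:
    """Hypothetical tree node: a visit count plus children keyed by move."""

    def __init__(self, visits=0):
        self.visits = visits
        self.children = {}   # move (e.g. "e2e4") -> Node


def advance_root(root, played_move):
    """Return the subtree for the move that was played; drop everything else."""
    new_root = root.children.get(played_move)
    if new_root is None:
        new_root = Node()    # the move was never explored: start from an empty tree
    root.children.clear()    # release the old root's other children
    return new_root
```

The statistics already gathered below the played move are kept, so the next search does not start from scratch.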

This is the same search specified by the AGZ paper, PUCT (Predictor + Upper Confidence Bound tree search). Many people call this MCTS (Monte-Carlo Tree Search), because it is very similar to the search algorithm the Go programs started using in 2006. But the PUCT used in AGZ and Lc0 does not do game rollouts (sampling playouts to a terminal game state). Other search algorithms are under consideration on the GitHub of Leela Go, but there is not yet any real consensus that something else is demonstrably better than PUCT. This is an active research topic where AI and game theory overlap.

What is the meaning of the terms used in debug output?

  • Nodes: A potential game position in the tree of future gameplay. The root node of the tree is the current position.
  • Playouts: As defined by the Monte Carlo Tree Search, starting from the root node: 1) pick a move to explore according to the policy and confidence of the choices (PUCT selection algorithm); 2) travel to the resulting game position node; 3) if this child node has already been explored at least once, repeat steps 1) and 2); otherwise 4) evaluate this node via the neural network, creating value and policy estimates for this position, and use this new value estimate to update all parent nodes' values. After 4), the playout is complete.
  • Visits: Total number of playouts that have traversed this node. This is equal to or slightly greater than the total size of the tree below this node.
  • N: Neural network's original, raw policy output (probability this is the best move)
  • V: Average expected value of all playouts for this move (not to be confused with the neural network's original estimate of the probability of making this move)
  • PV: Principal Variation. The moves that would be played if, at each level, we chose the node with the most visits (by playouts).
  • Depth: An approximation, computed as log(float(m_nodes)) / log(1.8) (improvements TBD)
  • NPS: Nodes per second, including NNCache hits, but excluding nodes carried over from tree reuse. In other words, total playouts generated per second.
  • Score cp: formulaically converted from the average expected win value of all searches from the root (see below).
  • Time: Thinking time in milliseconds
  • Bestmove: The move with the most visits. The randomize and tempdecay options allow picking less-visited moves for variety.
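Several of these terms are simple functions of the tree statistics. The sketch below is illustrative only: it assumes the tree is given as nested dictionaries of the form move -> {"visits": int, "children": {...}}, and the function names are made up for this example. The depth formula is the one quoted in the list; the NPS calculation assumes nodes divided by thinking time in seconds.

```python
import math


def best_move(children):
    """Bestmove: the move whose node has the most visits."""
    return max(children, key=lambda move: children[move]["visits"])


def principal_variation(children):
    """PV: follow the most-visited child at every level until the tree ends."""
    pv = []
    while children:
        move = best_move(children)
        pv.append(move)
        children = children[move]["children"]
    return pv


def approx_depth(nodes):
    """Depth: log(nodes) / log(1.8), as quoted in the list above."""
    return math.log(float(nodes)) / math.log(1.8)


def nodes_per_second(nodes, time_ms):
    """NPS: nodes searched divided by thinking time in seconds (assumed definition)."""
    return nodes / (time_ms / 1000.0)
```

For instance, 17749 nodes in 4220 milliseconds gives `int(nodes_per_second(17749, 4220)) == 4205`.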

Example debug output

              Move     Visits   Avg_Value    Policy     Principal Variation
info string    f3 ->      23   (V: 46.59%) (N:  0.63%) PV: f3 e5 c4 Nf6 Nc3 c6 d4 exd4
info string    g4 ->      32   (V: 47.29%) (N:  0.74%) PV: g4 d5 h3 e5 d3 h5 g5 Nc6 Bg2 Be6
info string   Na3 ->      51   (V: 49.10%) (N:  0.88%) PV: Na3 e5 e4 Nf6 Nc4 Nxe4 Nxe5 d6 Nef3
info string    f4 ->      55   (V: 49.12%) (N:  0.85%) PV: f4 Nf6 d3 d5 Nf3 c5 Nc3 Nc6
info string   Nh3 ->      60   (V: 49.97%) (N:  0.75%) PV: Nh3 e5 e3 d5 d4 exd4 exd4 Nf6 Nf4 Bd6 Nc3
info string    h4 ->      78   (V: 50.00%) (N:  0.98%) PV: h4 e5 e4 Nf6 Nc3 d5 exd5 Nxd5 Nf3 Nc6 Bb5 Nxc3
info string    b3 ->     153   (V: 50.97%) (N:  1.27%) PV: b3 e5 Bb2 Nc6 Nf3 e4 Nd4 Nxd4 Bxd4 d5 c4 dxc4 e3 cxb3
info string    g3 ->     160   (V: 50.75%) (N:  1.48%) PV: g3 e5 e4 Nf6 Nc3 d5 exd5 Nxd5 Bg2 Nxc3 bxc3 Nc6
info string    a4 ->     169   (V: 50.84%) (N:  1.56%) PV: a4 e5 e4 Nf6 Nc3 Bb4 Nf3 Bxc3 dxc3 Nc6 Bd3
info string    c3 ->     176   (V: 50.53%) (N:  1.78%) PV: c3 e5 d4 e4 c4 c6 Bf4 d5 e3 Nf6 Nc3 Be7
info string    h3 ->     308   (V: 51.54%) (N:  1.87%) PV: h3 e5 e3 d5 d4 e4 c4 c6 cxd5 cxd5 Nc3 Nf6 Nge2 Bd6
info string    d3 ->     329   (V: 51.38%) (N:  2.22%) PV: d3 d5 Nf3 Nf6 Nc3 c5 e4 d4 Nb1 Nc6
info string   Nc3 ->     518   (V: 51.60%) (N:  3.10%) PV: Nc3 d5 d4 Nf6 Bg5 e6 Bxf6 Qxf6 e4 Bb4 exd5 exd5
info string    a3 ->    1957   (V: 52.84%) (N:  2.37%) PV: a3 e5 c4 c6 e3 d5 d4 exd4 exd4 Nf6 Nf3 Bd6 Bd3 dxc4 Bxc4 O-O
info string    d4 ->    2332   (V: 52.39%) (N:  6.33%) PV: d4 Nf6 c4 e6 Nc3 d5 e3 c5 cxd5 exd5 Nf3 cxd4 exd4 Bd6 Bd3
info string    b4 ->    2680   (V: 52.88%) (N:  1.81%) PV: b4 e5 a3 d5 Bb2 Bd6 Nf3 Nd7 e3 Ngf6 c4 dxc4 Bxc4 e4 Ng5 Ne5
info string   Nf3 ->    2927   (V: 52.35%) (N:  8.27%) PV: Nf3 Nf6 c4 e6 Nc3 d5 cxd5 exd5 d4 Bd6 Bg5 c6 Bxf6 Qxf6 e4
info string    c4 ->    4216   (V: 52.59%) (N:  8.53%) PV: c4 e5 e3 Nf6 Nc3 Bb4 Nd5 Nxd5 cxd5 O-O a3 Be7 d4 exd4 exd4 c6 Bd3 cxd5 Nf3
info string    e3 ->    7130   (V: 52.81%) (N:  6.47%) PV: e3 Nf6 d4 e6 c4 d5 Nf3 c5 cxd5 exd5 Bb5+ Nc6 Ne5 Qc7 Qa4 cxd4 exd4
info string    e4 ->   65436   (V: 52.95%) (N: 48.08%) PV: e4 e6 d4 d5 e5 c5 c3 Qa5 a3 cxd4 b4 Qc7 cxd4 f6 Nf3 fxe5 dxe5 Nc6 Bb2 Nh6 Bd3 Nf7 O-O

(Note that the list is printed in ascending order of visits, because the Arena GUI reverses it again when displaying it.)
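For anyone who wants to process this output with a script, here is a minimal parsing sketch. It assumes each line has exactly the layout shown above (move, visits, V, N, PV); the regular expression and the function name `parse_move_line` are illustrative and would need adjusting if the engine's output format differs.

```python
import re

# Matches one per-move line of the debug output shown above.
MOVE_LINE = re.compile(
    r"info string\s+(?P<move>\S+)\s+->\s+(?P<visits>\d+)\s+"
    r"\(V:\s*(?P<value>[\d.]+)%\)\s+"
    r"\(N:\s*(?P<policy>[\d.]+)%\)\s+"
    r"PV:\s*(?P<pv>.*)"
)


def parse_move_line(line):
    """Return the fields of one debug line as a dict, or None if it does not match."""
    match = MOVE_LINE.search(line)
    if match is None:
        return None
    return {
        "move": match.group("move"),
        "visits": int(match.group("visits")),
        "value_pct": float(match.group("value")),    # V: average expected value
        "policy_pct": float(match.group("policy")),  # N: raw policy output
        "pv": match.group("pv").split(),
    }
```

Taking the entry with the largest `visits` from the parsed lines gives the move that would be reported as bestmove when no randomization is in use. In the sample above that is e4, which also has by far the highest raw policy (48.08%).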

Final move information

info depth 22 nodes 17749 nps 4205 score cp -15 time 4220 pv e6 d4 d5 e5 c5 c3 Qa5 dxc5 Qc7 Nf3 Bxc5 b4 Bb6 a4 a6 a5 Ba7
bestmove e7e6

How does Leela Chess Zero calculate the cp eval?

Leela Chess Zero uses a winrate score in the range [0,1]. This winrate is actually an expected score, so it takes draws into account. The expected score is converted to a traditional centipawn (cp) eval. Release v0.7 uses the formula: cp = 290.680623072 * tan(3.096181612 * (expected_score - 0.5)). Future releases will tune this formula to match traditional cp evals.
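The v0.7 formula can be tried out directly in Python. The function names below are made up for this example, and the inverse function is not taken from the engine; it is simply the same formula solved for expected_score.

```python
import math

SCALE = 290.680623072   # constants from the v0.7 formula quoted above
SLOPE = 3.096181612


def expected_score_to_cp(expected_score):
    """Convert an expected score in [0, 1] to a centipawn eval (v0.7 formula)."""
    return SCALE * math.tan(SLOPE * (expected_score - 0.5))


def cp_to_expected_score(cp):
    """Algebraic inverse of the formula above (derived here for illustration)."""
    return 0.5 + math.atan(cp / SCALE) / SLOPE
```

An expected score of 0.5 maps to 0 cp, and the mapping is symmetric around that point: 0.6 and 0.4 give evals of equal size and opposite sign. Because of the tangent, the cp value grows very quickly as the expected score approaches 0 or 1.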