Johannes Czech edited this page Apr 4, 2021 · 15 revisions

Strength Evaluation v0.3.1

CrazyAra 0.3.1 played multiple-time world champion Justin Tan (LM JannLee) at 18:00 GMT on 21 December in five official matches and won 4-1. A detailed report about the event, published by okei, can be found here:

Hardware Setup

  • Memory (RAM): 31.4 GiB
  • Processor: AMD® Ryzen 7 1700 eight-core processor × 16
  • Graphics: GeForce GTX 1080 Ti/PCIe/SSE2
  • OS type: 64-bit

Configuration

All engines were given the best possible settings to maximize their strength on this hardware. Where an engine offered a hash size option, it was set to 4096 MB. (Temporary) position learning and opening books were enabled for TJchess 1.3. Where the number of threads was configurable, which was the case for Stockfish and CrazyAra, it was set to 8.

The alpha-beta engines other than Stockfish reached 700k to 2 million nodes per second (NPS). Stockfish reached 8 million NPS.

CrazyAra v0.3.1, which uses MCTS, reached 300 NPS using both CPU and GPU.

On CPU only, using MXNet-mkl on an Intel® Core™ i5-8250U CPU @ 1.60GHz × 8, it reached 70 NPS.

Conditions

All matches were played with a time control of 5 min/40 moves. Each engine played 34 games in a round-robin tournament run with the Cute Chess GUI. The games started from opening positions taken from zh-50_startpos.pgn.
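
As a rough back-of-the-envelope illustration of what this time control means for search effort, the sketch below splits the 5 minutes evenly over 40 moves and multiplies by the NPS figures quoted on this page. The even split is an assumption made only for the arithmetic; the engines' own time management distributes time unevenly.

```python
# Rough search budget per move under a 5 min / 40 moves time control.
# Assumption: time is split evenly over the 40 moves (real engines do not do this).

TIME_CONTROL_SECONDS = 5 * 60
MOVES_PER_CONTROL = 40

# Nodes-per-second figures quoted on this page.
NPS = {
    "CrazyAra 0.3.1 (CPU + GPU)": 300,
    "Alpha-beta engines (lower bound)": 700_000,
    "Alpha-beta engines (upper bound)": 2_000_000,
    "Stockfish": 8_000_000,
}


def seconds_per_move(total_seconds: float, moves: int) -> float:
    """Average thinking time per move for an evenly split time control."""
    return total_seconds / moves


def nodes_per_move(nps: float,
                   total_seconds: float = TIME_CONTROL_SECONDS,
                   moves: int = MOVES_PER_CONTROL) -> float:
    """Approximate number of nodes searched per move at a given NPS."""
    return nps * seconds_per_move(total_seconds, moves)


if __name__ == "__main__":
    for name, nps in NPS.items():
        print(f"{name}: ~{nodes_per_move(nps):,.0f} nodes per move")
```

Under this simplification CrazyAra evaluates on the order of a few thousand nodes per move, while the alpha-beta engines search millions.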

Hyperparameter Settings for MCTS

All settings are the v0.3.1 defaults, except that context was changed from cpu to gpu:

  • context: gpu
  • use_raw_network: False
  • threads: 16
  • batch_size: 8
  • neural_net_services: 2
  • playouts_empty_pockets: 8192
  • playouts_filled_pockets: 8192
  • centi_cpuct: 250
  • centi_dirichlet_epsilon: 25
  • centi_dirichlet_alpha: 20
  • max_search_depth: 40
  • centi_temperature: 7
  • temperature_moves: 0
  • opening_guard_moves: 7
  • centi_clip_quantil: 0
  • virtual_loss: 3
  • centi_q_value_weight: 70
  • threshold_time_for_raw_net_ms: 100
  • move_overhead_ms: 300
  • moves_left: 40
  • extend_time_on_bad_position: True
  • max_move_num_to_reduce_movetime: 4
  • check_mate_in_one: False
  • use_pruning: True
  • use_oscillating_cpuct: False
  • use_time_management: True
  • verbose: False
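
Several option names above carry a centi_ prefix. Assuming this means the integer is given in hundredths (so centi_cpuct: 250 would correspond to a value of 2.5), the following sketch converts such options to floating-point values. The helper decode_centi_options is hypothetical and is not part of CrazyAra; it only illustrates the assumed convention.

```python
# Assumption: a "centi_" prefix means the integer value is expressed in hundredths.
# decode_centi_options is a hypothetical helper, not a CrazyAra function.

PREFIX = "centi_"

# A subset of the settings listed above.
OPTIONS = {
    "centi_cpuct": 250,
    "centi_dirichlet_epsilon": 25,
    "centi_dirichlet_alpha": 20,
    "centi_temperature": 7,
    "centi_clip_quantil": 0,
    "centi_q_value_weight": 70,
    "virtual_loss": 3,
    "batch_size": 8,
}


def decode_centi_options(options: dict) -> dict:
    """Strip the centi_ prefix and divide those values by 100; keep others as-is."""
    decoded = {}
    for name, value in options.items():
        if name.startswith(PREFIX):
            decoded[name[len(PREFIX):]] = value / 100
        else:
            decoded[name] = value
    return decoded


if __name__ == "__main__":
    for name, value in decode_centi_options(OPTIONS).items():
        print(f"{name}: {value}")
```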

Participants

Engine Name            Estimated Elo  Link
Sjeng 11.2             2300.00        https://github.com/gcp/sjeng, https://sjeng.org/download.html
SjaakII 1.4.1          2353.87        sudo apt-get install sjaakii, http://www.chess2u.com/t5010-sjaak
Sunsetter 9            2703.39        https://sourceforge.net/projects/sunsetter/files/latest/download
TJchess 1.3            2732.58        http://computer-chess.org/doku.php?id=computer_chess:wiki:lists:variants_engine_list
Stockfish 2018-10-13   3946.06        https://github.com/ddugovic/Stockfish

The Elo estimates are taken from this tournament:

https://lichess.org/forum/team-crazyhouse-engine-development-and-game-analyses/swiss-crazyhouse-tournament?page=3

Result

Rank Name                                  Elo    +    - games score oppo. draws 
   1 stockfish-x86_64-modern  2018-10-13   502  285  150    34  100%   -96    0% 
   2 CrazyAra-0.3.1                         71  113  105    34   59%    -2    0% 
   3 TJchess_1.3_64bit_linux                -5  104   99    34   54%   -21    3% 
   4 Sunsetter-9                           -47  100   97    34   49%    -4    3% 
   5 sjeng                                -203  102  113    34   25%    63    3% 
   6 sjaakii                              -318  106  133    34   13%    60    3% 

All games can be downloaded here:

Superiority Matrix

                                     st Cr TJ Su sj sj
stockfish-x86_64-modern  2018-10-13     99 99 99 99 99
CrazyAra-0.3.1                        0    82 92 99 99
TJchess_1.3_64bit_linux               0 17    71 99 99
Sunsetter-9                           0  7 28    97 99
sjeng                                 0  0  0  2    90
sjaakii                               0  0  0  0  9   

The result table and superiority matrix have been created with BayesianElo.
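
To relate the Elo column of the result table to its score column, the standard logistic Elo formula can serve as a rough cross-check. This is only a sketch under the plain Elo model; BayesianElo fits a related but not identical model that also accounts for draws and side advantage, so the numbers are not expected to match the table exactly.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score under the standard logistic Elo model.

    elo_diff is the player's rating minus the opponent's rating.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # CrazyAra-0.3.1 in the result table: Elo 71, average opponent Elo -2.
    diff = 71 - (-2)
    print(f"Expected score at +{diff} Elo: {expected_score(diff):.1%}")
```

For CrazyAra's 73-point advantage over its average opponent this gives an expected score of roughly 60%, in line with the 59% reported in the table.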

Can CrazyAra 0.3.1 win vs Stockfish?

CrazyAra beat and drew Stockfish multiple times as White in 1. e4 e5 lines when Stockfish was limited to 1 million NPS (1 thread) with 4096 MB hash in 3 min and 5 min games, and also against the Stockfish AI level 8 provided by lichess.org. Here you can find three example games:

CrazyAra also won a 20 min/40 moves game against Stockfish (1 thread) in a 1. d4 d5 line.
