Make the sharpness limit in WDLRescale configurable, and fix the Elo --> Contempt calculation #1941

Naphthalin · 2024-01-12T13:26:54Z

While implementing #1791 we used a preliminary version in TCEC Swiss 4 together with a net with a mixed training set, and found that the Contempt effect was sometimes exaggerated. To address this, a number of measures were taken:

- limit the Contempt taken from UCI_RatingAdv via a hidden setting ContemptMaxValue
- limit the WDL derived "sharpness" (called s for scale in the WDLRescale function, following the logistic distribution nomenclature) to a hardcoded value of 1.4 (approx. twice the value of startpos)
- redefine how the contempt effect is calculated for high rating differences, also making the accuracy Elo dependent, thus making high Contempt values safe and reasonable to use
- ultimately, only use networks with "pure datasets" with higher contempt values.

Together these measures fixed the problem for good, but any 2 of the 4 combined would likely already have helped. It turns out that the hardcoded limit of 1.4 is a bit too conservative, which is especially noticeable when using it for material odds like in https://lczero.org/blog/2023/11/play-with-knight-odds-against-lc0-on-lichess/. This PR allows increasing the limit, thus addressing the original comment on the hardcoded constant.
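For readers unfamiliar with the sharpness parameter, here is a minimal sketch of how s and the mean mu can be recovered from win and loss probabilities under a logistic model, with a configurable upper limit on s. The names `LogisticFit` and `FitWdl` are hypothetical and exist only for this illustration; the actual WDLRescale code in lc0 may differ in details such as input validation.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical container for the fitted logistic parameters.
struct LogisticFit {
  float mu;  // location: positive when the side to move is better
  float s;   // scale ("sharpness" in this PR)
};

// Sketch only, assuming W = 1 / (1 + exp((1 - mu) / s)) and
// L = 1 / (1 + exp((1 + mu) / s)). Requires 0 < w, l and w + l < 1
// (a nonzero draw probability), otherwise a + b is not positive.
LogisticFit FitWdl(float w, float l, float max_s) {
  const float a = std::log(1.0f / l - 1.0f);  // equals (1 + mu) / s
  const float b = std::log(1.0f / w - 1.0f);  // equals (1 - mu) / s
  const float mu = (a - b) / (a + b);
  // The limit used to be the constant 1.4; here it is an argument.
  const float s = std::min(2.0f / (a + b), max_s);
  return {mu, s};
}
```

With W = L = 0.4 the unclamped scale is about 2.47, so `FitWdl(0.4f, 0.4f, 1.4f)` returns s = 1.4, while a larger `max_s` would let the higher value through.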


Naphthalin · 2024-01-22T14:46:59Z

While testing this PR, I found an actual bug in the way the diff parameter is calculated from WDLCalibrationElo and Contempt: the calculation forgot to divide by WDLDrawRateReference^2, which effectively reduced the contempt effect for the bigger nets by a factor of up to 2, while using contempt without calibration Elo was unaffected. This PR now fixes this unintended behaviour, so both ways of specifying contempt introduced in #1791 now work as intended. Note, however, that WDLCalibrationElo still refers to game pair Elo, which differs from regular Elo by up to a factor of 2 below 2600. For example, a difference of 100 Elo with WDLCalibrationElo: 1800 corresponds to roughly 200 regular Elo difference. To counteract this, you can simply set "WDLContemptAttenuation": 0.5 and keep using the real Elo difference. Around 2400, the correct value is probably around 0.8; I will attempt to fix this properly, together with updating the Elo dependent draw rate curve, at some point in the future.

Naphthalin · 2024-01-22T14:49:20Z

src/mcts/search.cc

@@ -313,7 +313,7 @@ void Search::SendUciInfo() REQUIRES(nodes_mutex_) REQUIRES(counters_mutex_) {
          contempt_mode_ == ContemptMode::NONE
              ? 0
              : params_.GetWDLRescaleDiff() * params_.GetWDLEvalObjectivity(),
-          sign, true);
+          sign, true, params_.GetWDLMaxReasonableS());


Is there any convention on the order of arguments, i.e. does it make more sense to put parameters first and flags last?

Naphthalin · 2024-01-26T11:20:06Z

The last two commits added a conversion formula that translates regular Elo (defined by the expected outcome following a logistic curve, and also used by the alternative Contempt settings where WDLDrawRateTarget is set instead of WDLCalibrationElo) into the internally used Elo (derived from the game pair ratio; at higher levels this is equivalent to UHO Elo and, more importantly, UHO game pair level).
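As a reminder of what "regular Elo" means here, the standard logistic relation between a rating difference and the expected score can be sketched as follows. The function name `ExpectedScore` is hypothetical, and this shows only the textbook Elo curve, not the regular-to-game-pair conversion added by the commits.

```cpp
#include <cmath>

// Expected score (win = 1, draw = 0.5, loss = 0) for a player rated
// elo_diff points above the opponent, using the standard logistic
// Elo curve with a base-10, 400-point scale.
double ExpectedScore(double elo_diff) {
  return 1.0 / (1.0 + std::pow(10.0, -elo_diff / 400.0));
}
```

For example, `ExpectedScore(0.0)` is 0.5 and `ExpectedScore(400.0)` is 10/11, roughly 0.909.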

It is still supposed to represent (relatively fast) rapid Elo, so to get classical Elo, add something between 40 and 70 Elo per doubling of thinking time.
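The rule of thumb above can be turned into a quick estimate. This is a rough sketch that assumes the 40 to 70 Elo gain applies uniformly to every doubling and that the ratio is at least 1; `EloRange` and `AdjustForTimeControl` are hypothetical names used only for this illustration.

```cpp
#include <cmath>

// Hypothetical result type: lower and upper Elo estimate.
struct EloRange {
  double low;
  double high;
};

// time_ratio is the slower time control divided by the rapid one
// (assumed >= 1). Each doubling adds between 40 and 70 Elo, per the
// rule of thumb in the comment above.
EloRange AdjustForTimeControl(double rapid_elo, double time_ratio) {
  const double doublings = std::log2(time_ratio);
  return {rapid_elo + 40.0 * doublings, rapid_elo + 70.0 * doublings};
}
```

For a rapid rating of 2400 and four times the thinking time (two doublings), this gives a range of 2480 to 2540.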

The conversion formula is an approximation to the model prediction for regular Elo from +1.00 openings. That model is itself based on Stockfish level selfplay data, used to estimate the approximate draw rate and WDL sharpness, see official-stockfish/Stockfish#4341.

Naphthalin and others added 10 commits December 27, 2023 21:50

increased reasonable s for material odds

4d65aae

build onnx-dml first

1394b8e

turned max reasonable s into a parameter

0ddee52

don't change appveyor.yml

d49d29b

don't change appveyor.yml

44cf5c4

Merge branch 'master' into widerWDL

cf86bd4

Merge branch 'master' into widerWDL

f55cb72

found (and hopefully fixed) wrong handling of diff calculation for th…

c8bdc57

…e Elo path

Merge branch 'widerWDL' of https://github.com/Naphthalin/lc0 into wid…

1dd6ab3

…erWDL

something is still off, experimenting

fbc9f85

Naphthalin added the enhancement (New feature or request), not for merge (Experimental code which is not intended to be merged into the master), and bug fix (Fixes intended behavior) labels Jan 19, 2024

Naphthalin added 2 commits January 25, 2024 13:27

Added conversion of Elo from regular to internally used game pair ratio

794d313

removed float

ab3b330

Naphthalin removed the not for merge (Experimental code which is not intended to be merged into the master) label Jan 26, 2024

Merge branch 'master' into widerWDL

d623083

Naphthalin changed the title ~~Make the sharpness limit in WDLRescale configurable~~ Make the sharpness limit in WDLRescale configurable, and fix the Elo --> Contempt calculation Feb 20, 2024

borg323 approved these changes Feb 20, 2024

Shortened WDLMaxReasonableS to WDLMaxS

58ff602

borg323 merged commit 5d83073 into LeelaChessZero:master Feb 21, 2024
3 checks passed

PikaCat-OuO pushed a commit to official-pikafish/px0 that referenced this pull request Feb 23, 2024

Make the sharpness limit in WDLRescale configurable, and fix the Elo …

615e086

…--> Contempt calculation (LeelaChessZero#1941) (cherry picked from commit 5d83073)

borg323 pushed a commit to borg323/lc0 that referenced this pull request Feb 26, 2024

Make the sharpness limit in WDLRescale configurable, and fix the Elo …

5350a2e

…--> Contempt calculation (LeelaChessZero#1941)

Naphthalin mentioned this pull request Sep 6, 2024

Add normalized game pair Elo to stats official-stockfish/fishtest#2134

Open
