Make the sharpness limit in WDLRescale configurable, and fix the Elo --> Contempt calculation #1941

Merged (14 commits) on Feb 21, 2024

Naphthalin
Contributor

While implementing #1791, we used a preliminary version in TCEC Swiss 4 together with a net trained on a mixed training set, and found that the Contempt effect was sometimes exaggerated. To address this, a number of measures were taken:

  • limit the Contempt taken from UCI_RatingAdv via a hidden setting ContemptMaxValue
  • limit the WDL-derived "sharpness" (called s for scale in the WDLRescale function, following the logistic distribution nomenclature) to a hardcoded value of 1.4 (approx. twice the value of startpos)
  • redefine how the contempt effect is calculated for high rating differences, also making the accuracy Elo-dependent, thus making high Contempt values safe and reasonable to use
  • ultimately, only use networks with "pure datasets" with higher contempt values.

While these measures together fixed the problem for good, any 2 of the 4 combined would likely already have helped, and it turns out that the hardcoded limit of 1.4 is a bit too conservative, which is especially noticeable when using it for material odds like in https://lczero.org/blog/2023/11/play-with-knight-odds-against-lc0-on-lichess/. This PR allows increasing the limit, thus addressing the original comment on the hardcoded constant.
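
To illustrate what limiting the sharpness means, here is a minimal sketch and not the actual WDLRescale code. It assumes the win and loss probabilities follow a logistic model, W = logistic((mu - 1) / s) and L = logistic((-mu - 1) / s), recovers mu and s from them, and then caps s at a configurable limit. The names `LogisticParams`, `ParamsFromWL` and `ClampSharpness` are hypothetical and only serve this example.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper type: location (mu) and scale (s) of the assumed
// logistic model behind a WDL triple.
struct LogisticParams {
  double mu;
  double s;
};

// Recovers mu and s from win/loss probabilities under the assumed model
//   W = logistic((mu - 1) / s),  L = logistic((-mu - 1) / s).
// Requires 0 < w, 0 < l and w + l < 1 (a non-zero draw probability).
inline LogisticParams ParamsFromWL(double w, double l) {
  const double a = std::log(1.0 / l - 1.0);  // equals (1 + mu) / s
  const double b = std::log(1.0 / w - 1.0);  // equals (1 - mu) / s
  return {(a - b) / (a + b), 2.0 / (a + b)};
}

// Caps the sharpness at a configurable limit instead of a hardcoded 1.4.
inline double ClampSharpness(double s, double max_reasonable_s) {
  return std::min(s, max_reasonable_s);
}
```

With a limit of 1.4, a position whose derived s is 2.0 would be treated as if s were 1.4, while a position with s of 0.7 is left unchanged. Raising the limit lets very sharp positions, such as material odds games, keep more of their original scale.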

@Naphthalin added the enhancement, not for merge, and bug fix labels Jan 19, 2024
@Naphthalin
Contributor Author

While testing this PR, I found an actual bug in the way the diff parameter is calculated from WDLCalibrationElo and Contempt: the calculation forgot to divide by WDLDrawRateReference^2, which effectively reduced the contempt effect for the bigger nets by a factor of up to 2, while using contempt without calibration Elo was unaffected. This PR now fixes this unintended behaviour, so either of the two ways of specifying contempt introduced in #1791 now works as intended. Note, however, that WDLCalibrationElo still refers to game pair Elo, which deviates from regular Elo by up to a factor of 2 below 2600. For example, with WDLCalibrationElo: 1800 a difference of 100 Elo basically means a regular Elo difference of 200; to counteract this, you can simply set "WDLContemptAttenuation": 0.5 and still enter the real Elo difference. Around 2400, the correct value is probably around 0.8; I will attempt to fix this properly together with updating the Elo-dependent draw rate curve at some point in the future.
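
The attenuation workaround can be checked with a little arithmetic. The sketch below assumes that the attenuation acts as a plain multiplicative factor on the Elo difference entered as contempt; that is a simplification for illustration, and `AttenuatedEloDiff` is a hypothetical helper, not a function from the codebase.

```cpp
// Hypothetical helper: scales the entered (regular) Elo difference by the
// attenuation factor. Assumes the attenuation is a simple multiplier.
inline double AttenuatedEloDiff(double regular_elo_diff, double attenuation) {
  return regular_elo_diff * attenuation;
}
```

Under this assumption, entering the real difference of 200 Elo with an attenuation of 0.5 yields 100, which matches the statement that 100 game pair Elo at WDLCalibrationElo: 1800 corresponds to roughly 200 regular Elo.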

@@ -313,7 +313,7 @@ void Search::SendUciInfo() REQUIRES(nodes_mutex_) REQUIRES(counters_mutex_) {
 contempt_mode_ == ContemptMode::NONE
 ? 0
 : params_.GetWDLRescaleDiff() * params_.GetWDLEvalObjectivity(),
-sign, true);
+sign, true, params_.GetWDLMaxReasonableS());
Contributor Author

Is there any convention on the order of arguments, i.e. does it make more sense to put parameters first and flags last?

@Naphthalin
Contributor Author

The last two commits added a conversion formula that translates regular Elo (as defined by the expected outcome following a logistic curve, and also used by the alternative Contempt settings where WDLDrawRateTarget is set instead of WDLCalibrationElo) to the internally used Elo (derived from the game pair ratio; at higher levels equivalent to UHO Elo and, more importantly, UHO game pair level).

It is still supposed to represent (relatively fast) rapid Elo, so to get classic Elo, add something between 40 and 70 Elo per time doubling.
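
As a rough worked example of the time-control adjustment, the sketch below adds a fixed Elo gain per doubling of thinking time. The function name `EloAtLongerTimeControl` and the idea of passing the time ratio directly are assumptions made for illustration; the gain per doubling should be picked from the 40 to 70 range mentioned above.

```cpp
#include <cmath>

// Hypothetical helper: estimates Elo at a longer time control by adding a
// fixed gain for every doubling of thinking time.
// time_ratio is (longer time control) / (rapid time control), must be > 0.
inline double EloAtLongerTimeControl(double rapid_elo, double time_ratio,
                                     double elo_per_doubling) {
  return rapid_elo + elo_per_doubling * std::log2(time_ratio);
}
```

For instance, with 50 Elo per doubling and four times the thinking time (two doublings), a rapid rating of 2500 would map to about 2600.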

The conversion formula approximates the model prediction for regular Elo from +1.00 openings. That model is in turn based on Stockfish-level selfplay data, which was used to estimate the approximate draw rate and WDL sharpness, using official-stockfish/Stockfish#4341.

[Figure: Elo_approximation2]

@Naphthalin removed the not for merge label Jan 26, 2024
@Naphthalin changed the title from "Make the sharpness limit in WDLRescale configurable" to "Make the sharpness limit in WDLRescale configurable, and fix the Elo --> Contempt calculation" Feb 20, 2024
@borg323 borg323 merged commit 5d83073 into LeelaChessZero:master Feb 21, 2024
3 checks passed
PikaCat-OuO pushed a commit to official-pikafish/px0 that referenced this pull request Feb 23, 2024
borg323 pushed a commit to borg323/lc0 that referenced this pull request Feb 26, 2024
Labels
bug fix, enhancement