Change centipawn fallback to account for sharper WDL with high WDLCalibrationElo #2075

Naphthalin · 2024-10-11T14:19:20Z

The previously used centipawn formula (#1193) is also used as a fallback for WDL_mu, introduced in #1791, mostly to reproduce the extreme centipawn values in clearly decisive positions that users are accustomed to. The formula, however, was calibrated with the raw (unsharpened) WDL NN output. Combined with the WDL sharpening caused by a high WDLCalibrationElo, it climbs too quickly and therefore takes over earlier than intended. WDL_mu without the fallback usually produces meaningful evals up to around +4.5, which is roughly where the fallback formula should take over. With WDLCalibrationElo: 3600, however, this already happens below +2, causing a discrepancy between Lc0 and SF evals in the range between +2 and +5 for SF, where Leela regularly shows about double the eval of SF.

The fallback formula is therefore updated in two ways: the scaling is reduced by 50%, and the constant is slightly increased so that the formula still reaches +128 at wl=1.0.
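To illustrate why halving the scaling forces the constant up slightly, here is a minimal Python sketch. It assumes the fallback has the shape `scale * tan(b * wl)` and that the scaling goes from 90 to 45 centipawns; neither the shape nor those numbers are stated in this PR text, so treat them as assumptions. The names `tan_constant` and `fallback_cp` are hypothetical helpers, not Lc0 functions. The constant `b` is solved from the requirement that wl=1.0 maps to +128.00 pawns (12800 centipawns).

```python
import math

# +128.00 pawns expressed in centipawns; the value the fallback should hit at wl = 1.0.
TARGET_CP = 12800.0

# Assumed scalings (not taken from this PR text): old value and its 50% reduction.
OLD_SCALE = 90.0
NEW_SCALE = 45.0


def tan_constant(scale: float, target_cp: float = TARGET_CP) -> float:
    """Solve scale * tan(b * 1.0) == target_cp for the constant b."""
    return math.atan(target_cp / scale)


def fallback_cp(wl: float, scale: float) -> float:
    """Hypothetical tan-shaped fallback eval in centipawns for wl in [-1, 1]."""
    return scale * math.tan(tan_constant(scale) * wl)


if __name__ == "__main__":
    print(f"b_old={tan_constant(OLD_SCALE):.10f}  b_new={tan_constant(NEW_SCALE):.10f}")
    for wl in (0.5, 0.8, 0.9, 1.0):
        old = fallback_cp(wl, OLD_SCALE)
        new = fallback_cp(wl, NEW_SCALE)
        print(f"wl={wl:.2f}  old={old:9.1f}  new={new:9.1f}")
```

Under these assumptions, the new curve sits at roughly half the old one for wl well below 1.0, while both curves meet at wl=1.0, which is the intended effect of the two changes.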

With WDL sharpening at 3600 Elo (the most commonly used value, e.g. in TCEC both for playing and for kibitzing), the old centipawn calibration is off by about a factor of 2 compared to Stockfish and generally takes over too early, around +2.00, while it should only take over around +4.00, since up to there `WDL_mu` behaves well enough. With lower calibration Elo (e.g. for analysis of human games or openings), the takeover point is significantly later due to lower Q from broader WDL, so this change has no effect there. This doesn't yet fix the jumpy eval behavior in draws with very low W or L but substantial L or W, respectively, remaining.

initial oversight: in a +1 position we want to display +128, that shouldn't change

Naphthalin added 2 commits October 11, 2024 13:49

changed factor to +128 convention

b0a02e1

borg323 approved these changes Oct 12, 2024

borg323 merged commit cab6395 into LeelaChessZero:master Oct 18, 2024
3 checks passed
