Search tuning at very long time control. #3937

Closed

Vizvezdenec
Contributor

This patch is the result of tuning done by @candirufish after 150k games.
Since the results were really interesting and touched heuristics that are known for their non-linear scaling, I decided to run a limited-games LTC test even though the STC result was really bad (which I expected), and after seeing its results I also ran a VLTC SPRT.
The main difference is in extensions: this patch allows many more singular/double extensions, both by allowing them at lower depths and by requiring smaller margins.
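For readers unfamiliar with the heuristic, the two knobs mentioned above (minimum depth and margin) can be illustrated with a minimal sketch. This is not Stockfish's code: the function, the parameter names and every numeric value below are hypothetical placeholders, and the real search applies further conditions that are omitted here.

```python
from dataclasses import dataclass


@dataclass
class ExtensionParams:
    # All values are hypothetical placeholders, NOT the tuned values of this patch.
    min_depth: int = 7         # extensions are only tried at or above this depth
    margin_per_depth: int = 3  # singular_beta = tt_value - margin_per_depth * depth
    double_margin: int = 75    # extra gap required for a double extension


def extension_for(depth, tt_value, excluded_search_value, params):
    """Return 0, 1 or 2 plies of extension for the transposition-table move.

    excluded_search_value is the score of a reduced-depth search over all
    moves except the TT move (the "singular" verification search).
    """
    if depth < params.min_depth:
        return 0
    singular_beta = tt_value - params.margin_per_depth * depth
    if excluded_search_value >= singular_beta:
        return 0  # an alternative move is nearly as good: not singular
    if excluded_search_value < singular_beta - params.double_margin:
        return 2  # every alternative is far worse: double extension
    return 1      # TT move is singular: extend by one ply


baseline = ExtensionParams()
looser = ExtensionParams(min_depth=6, margin_per_depth=2, double_margin=50)

# The same position gets extended more often, or more deeply, with looser settings.
print(extension_for(6, 100, 80, baseline), extension_for(6, 100, 80, looser))
print(extension_for(10, 100, 20, baseline), extension_for(10, 100, 20, looser))
```

Lowering `min_depth` lets the check fire at shallower nodes, and shrinking the margins moves the threshold closer to the TT value so more verification searches fail low. Both changes increase the number of extensions, which costs time at STC and may pay off only at longer time controls.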
Failed STC:
https://tests.stockfishchess.org/tests/view/620d66643ec80158c0cd3b46
LLR: -2.94 (-2.94,2.94) <0.00,2.50>
Total: 4968 W: 1194 L: 1398 D: 2376
Ptnml(0-2): 47, 633, 1294, 497, 13
Performed well at LTC:
https://tests.stockfishchess.org/tests/view/620d66823ec80158c0cd3b4a
ELO: 3.36 +-1.8 (95%) LOS: 100.0%
Total: 30000 W: 7966 L: 7676 D: 14358
Ptnml(0-2): 36, 2936, 8755, 3248, 25
Passed VLTC SPRT:
https://tests.stockfishchess.org/tests/view/620da11a26f5b17ec884f939
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 4400 W: 1326 L: 1127 D: 1947
Ptnml(0-2): 13, 309, 1348, 526, 4
bench 6318903
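
As a reading aid for the `LLR: x (-2.94,2.94) <elo0,elo1>` lines above: the pair in parentheses is the SPRT stopping interval and the pair in angle brackets holds the two Elo hypotheses being compared. The sketch below assumes Wald's standard SPRT bounds with error rates alpha = beta = 0.05, which is consistent with the +-2.94 shown here. The function names are illustrative and do not come from fishtest.

```python
import math


def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's SPRT stopping bounds for the log-likelihood ratio (LLR)."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def sprt_state(llr, alpha=0.05, beta=0.05):
    """Classify a test from its current LLR (assumed error rates, see above)."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "accepted"  # evidence favours the elo1 hypothesis
    if llr <= lower:
        return "rejected"  # evidence favours the elo0 hypothesis
    return "running"


lower, upper = sprt_bounds()
print(f"({lower:.2f}, {upper:.2f})")  # the interval shown next to each LLR
print(sprt_state(2.96))               # the VLTC test above stopped at LLR 2.96
```

Note that the displayed LLR values are rounded to two decimals, so a printed `-2.94` can still correspond to an unrounded value that crossed the lower bound.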

@snicolet snicolet closed this in 84b1940 Feb 17, 2022
@snicolet
Member

Merged via 84b1940, congrats :-)

@snicolet snicolet added the to be merged Will be merged shortly label Feb 17, 2022
@ppigazzini
Contributor

Since the results were really interesting and touched heuristics that are known for their non-linear scaling, I decided to run a limited-games LTC test even though the STC result was really bad (which I expected), and after seeing its results I also ran a VLTC SPRT.

Viz was very clear in the commit message about the deviation from the fishtest rules, the very bad behavior at STC and the cherry-picking of the VLTC SPRT. Perhaps the right thing to do would have been to open an issue to discuss these speculative tests, which show that an improvement at VLTC is possible.

I'm worried that the new master is biasing all new tests, because compared with the previous master it is weaker in SPRT at STC and perhaps on par at LTC, so IMO it would be better to discuss a revert.

@G-Lorenz
Contributor

This commit was also tested under SMP conditions; here are the results, for future reference:

SMP-STC (5+0.05 th 8): https://tests.stockfishchess.org/tests/view/620e8d5226f5b17ec885144b
ELO: -12.23 +-3.3 (95%) LOS: 0.0%
Total: 10006 W: 2461 L: 2813 D: 4732
Ptnml(0-2): 34, 1249, 2766, 943, 11

SMP-LTC (20+0.2 th 8): https://tests.stockfishchess.org/tests/view/620e8d7a26f5b17ec8851450
ELO: 1.32 +-2.2 (95%) LOS: 88.5%
Total: 20000 W: 5377 L: 5301 D: 9322
Ptnml(0-2): 17, 1909, 6062, 2005, 7

SMP-VLTC (60+0.6 th 8): https://tests.stockfishchess.org/tests/view/620e8eb226f5b17ec8851462
ELO: 17.14 +-4.2 (95%) LOS: 100.0%
Total: 4666 W: 1427 L: 1197 D: 2042
Ptnml(0-2): 2, 299, 1499, 533, 0
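
For anyone who wants to check the `ELO: x +-y (95%)` figures against the raw counts, here is a minimal sketch. It assumes logistic Elo computed from the mean score of the pentanomial (game pair) counts, with a normal-approximation 95% interval obtained by the delta method. Fishtest's own computation may differ in detail, so agreement is only expected to within rounding.

```python
import math


def elo_from_pentanomial(ptnml):
    """Estimate Elo and a 95% half-width from Ptnml(0-2) game-pair counts.

    ptnml lists the number of game pairs that scored 0, 0.5, 1, 1.5 and 2 points.
    """
    pairs = sum(ptnml)
    scores = [0.0, 0.25, 0.5, 0.75, 1.0]  # per-game score of each pair outcome
    mean = sum(n * s for n, s in zip(ptnml, scores)) / pairs
    var = sum(n * (s - mean) ** 2 for n, s in zip(ptnml, scores)) / pairs
    stderr = math.sqrt(var / pairs)  # standard error of the mean score
    elo = -400.0 * math.log10(1.0 / mean - 1.0)
    # Delta method: d(elo)/d(score) = 400 / (ln(10) * s * (1 - s))
    slope = 400.0 / (math.log(10) * mean * (1.0 - mean))
    return elo, 1.96 * stderr * slope


# SMP-VLTC result quoted above: Ptnml(0-2): 2, 299, 1499, 533, 0
elo, half_width = elo_from_pentanomial([2, 299, 1499, 533, 0])
print(f"ELO: {elo:.2f} +-{half_width:.1f} (95%)")
```

Using pair counts rather than the plain W/L/D totals matters for the error bar: the two games of a pair share an opening and are correlated, so treating all games as independent would give a different variance.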

@dav1312
Contributor

dav1312 commented Feb 20, 2022

Other Tests of this commit

60+0.6 th 1 https://tests.stockfishchess.org/tests/view/6210f2e5b1792e8985f86e01
Book: 8moves_v3
ELO: 33.07 +-1.0 (95%) LOS: 100.0%
Total: 60000 W: 7372 L: 1679 D: 50949
Ptnml(0-2): 17, 984, 22522, 6243, 234

Previous RT:
60+0.6 th 1 https://tests.stockfishchess.org/tests/view/620562ffd71106ed12a449a6
Book: 8moves_v3
ELO: 34.88 +-1.0 (95%) LOS: 100.0%
Total: 60000 W: 7532 L: 1528 D: 50940
Ptnml(0-2): 13, 884, 22430, 6432, 241


30+0.3 th 8 https://tests.stockfishchess.org/tests/view/62115f93b1792e8985f87eb3
Book: 8moves_v3
ELO: 23.83 +-1.0 (95%) LOS: 100.0%
Total: 40000 W: 3535 L: 796 D: 35669
Ptnml(0-2): 4, 410, 16496, 3023, 67

Previous RT:
30+0.3 th 8 https://tests.stockfishchess.org/tests/view/620822bed71106ed12a4af70
Book: 8moves_v3
ELO: 24.37 +-1.0 (95%) LOS: 100.0%
Total: 40000 W: 3566 L: 765 D: 35669
Ptnml(0-2): 2, 416, 16441, 3061, 80


180+1.8 th 1 https://tests.stockfishchess.org/tests/view/62120c34b1792e8985f89b1a
Book: UHO_XXL_+0.90_+1.19
ELO: 86.00 +-3.0 (95%) LOS: 100.0%
Total: 10000 W: 3799 L: 1373 D: 4828
Ptnml(0-2): 0, 203, 2213, 2539, 45

Previous commit:
180+1.8 th 1 https://tests.stockfishchess.org/tests/view/62120c8cb1792e8985f89b31
Book: UHO_XXL_+0.90_+1.19
ELO: 81.22 +-3.1 (95%) LOS: 100.0%
Total: 10000 W: 3740 L: 1444 D: 4816
Ptnml(0-2): 1, 263, 2211, 2489, 36


https://nextchessmove.com/dev-builds/84b1940fcae95bb0a641dda9e85cb96f8c21cd22
https://nextchessmove.com/dev-builds/2da1d1bf571e3fd2e1d6cf56b76a7504de1a9453
https://nextchessmove.com/dev-builds/abef3e86f42fd4953d28cc7c3381601475d11346
24.5+0.24 th 2
Book: 8moves_v3
ELO: 431.16 +-3.6 (95%)
Total: 60000 W: 50903 L: 159 D: 8938

Previous commit:
24.5+0.24 th 2 https://nextchessmove.com/dev-builds/3ec6e1d2450183ed4975cf569b5a1286cb9d8369
Book: 8moves_v3
ELO: 447.43 +-6.44 (95%)
Total: 20000 W: 17197 L: 26 D: 2777

@vondele vondele mentioned this pull request Feb 20, 2022
@vondele vondele mentioned this pull request Apr 18, 2022
dav1312 pushed a commit to dav1312/Stockfish that referenced this pull request Oct 21, 2022
This patch is a result of tuning done by user @candirufish after 150k games.

Since the tuned values were really interesting and touched heuristics
that are known for their non-linear scaling, I decided to run a
limited-games LTC match, even though the STC test was really bad (which
was expected). After seeing the results of the LTC match, I also ran a
VLTC (very long time control) SPRT test, which passed.

The main difference is in extensions: this patch allows many more
singular/double extensions, both by allowing them at lower depths and
by requiring smaller margins.

Failed STC:
https://tests.stockfishchess.org/tests/view/620d66643ec80158c0cd3b46
LLR: -2.94 (-2.94,2.94) <0.00,2.50>
Total: 4968 W: 1194 L: 1398 D: 2376
Ptnml(0-2): 47, 633, 1294, 497, 13

Performed well at LTC in a fixed-length match:
https://tests.stockfishchess.org/tests/view/620d66823ec80158c0cd3b4a
ELO: 3.36 +-1.8 (95%) LOS: 100.0%
Total: 30000 W: 7966 L: 7676 D: 14358
Ptnml(0-2): 36, 2936, 8755, 3248, 25

Passed VLTC SPRT test:
https://tests.stockfishchess.org/tests/view/620da11a26f5b17ec884f939
LLR: 2.96 (-2.94,2.94) <0.50,3.00>
Total: 4400 W: 1326 L: 1127 D: 1947
Ptnml(0-2): 13, 309, 1348, 526, 4

closes official-stockfish#3937

Bench: 6318903
Joachim26 pushed a commit to Joachim26/StockfishNPS that referenced this pull request Nov 22, 2023