wrong evaluation of draw position #4103
Comments
I think this is an interesting hard puzzle for engines in that it basically needs to resolve to the 50-move rule to see the draw evaluation. For this, one needs to extend the search very deep, which causes something like #3911. On the other hand, the right move is found almost instantly. |
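To make the 50-move-rule point concrete, here is a minimal sketch that reads the halfmove clock (the fifth FEN field) and reports how many plies without a capture or pawn move remain before the rule applies. The helper name `plies_until_fifty_move_draw` is hypothetical and the code only illustrates the counting, not how any engine implements it.

```python
def plies_until_fifty_move_draw(fen: str) -> int:
    """Plies without a capture or pawn move left before the 50-move rule applies.

    The rule applies once the halfmove clock (5th FEN field) reaches 100 plies.
    """
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError("expected a 6-field FEN string")
    halfmove_clock = int(fields[4])
    return max(0, 100 - halfmove_clock)


# Position reported in this issue; its halfmove clock is 0.
fen = "q7/8/2p5/B2p2pp/5pp1/2N3k1/6P1/7K w - - 0 1"
print(plies_until_fifty_move_draw(fen))  # 100
```

With the clock at 0, a search would have to look on the order of 100 plies ahead with no capture or pawn move before the rule itself shows up in the tree, which fits the remark about very deep extensions.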
Do you have the full solution? SF14 also eventually drops to near 0. |
This is the line that I got doing some fast analysis
But then the latest dev refuted the whole thing with 24.Rf3 which is winning for white |
Keep in mind that Lichess SF14 is using a smaller net that's a few Elo worse on average. It looks like it missed 24.Rf3 as mentioned above. |
That is the wrong conclusion. It's a draw because black can play 22... Rxf6 |
@MaiaChess I will assume you're legit asking. Second, Lichess uses another net, and maybe a modified binary, for SF to run smoothly |
Black is simply lost because it got outplayed bit by bit; there are millions/billions of possibilities |
For God's sake, what do you mean by a Lichess server analysis? |
Lichess uses fixed-node analysis at a low hash size; it means nothing in engine vs engine play. |
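For readers unfamiliar with what a fixed-node, low-hash analysis looks like at the protocol level, the sketch below builds the sequence of standard UCI commands for such a run. The function name `uci_fixed_nodes_script` and the default numbers (16 MB hash, 1 thread, 1,000,000 nodes) are illustrative assumptions; the thread does not state the settings Lichess actually uses.

```python
from typing import List


def uci_fixed_nodes_script(fen: str, nodes: int,
                           hash_mb: int = 16, threads: int = 1) -> List[str]:
    """Return UCI commands for a fixed-node search on one position.

    hash_mb, threads and nodes are placeholder values, not Lichess's real settings.
    """
    return [
        "uci",
        f"setoption name Hash value {hash_mb}",      # transposition table size in MB
        f"setoption name Threads value {threads}",
        "isready",
        f"position fen {fen}",
        f"go nodes {nodes}",                         # stop after this many nodes
    ]


for command in uci_fixed_nodes_script(
        "q7/8/2p5/B2p2pp/5pp1/2N3k1/6P1/7K w - - 0 1", nodes=1_000_000):
    print(command)
```

Because the search stops at a node budget rather than a depth or time limit, the result can differ a lot from what the same engine reports with more hash, more threads, or a longer search.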
@MaiaChess Please don't use Lichess's server analysis for engine vs engine games. |
The position is completely lost. It is subjective what the best move to try to save the game is. Neither is really wrong or right. And if you want to argue subjective, Leela also prefers e4. -5 to -11. |
@MaiaChess are you impersonating the creators of https://github.com/CSSLab/maia-chess? |
I should have grabbed popcorn for this... Your view of Computer Engine Chess is a bit simplistic. SF's NN is tuned on ~5k CPUs. The tests you gave weren't even performed on TCEC equipment, much less all of Fishtest. The accuracy of a specific move in the middle game isn't objective... it's subjective to the ideas being calculated vs what the opponent is actually doing. If your opponent already played this game and won and it remembered the game perfectly, do you expect an engine with very limited computing power to be able to think far enough ahead to draw these games? I expect SF using max threads and hash and a 7-man TB to be able to correctly evaluate the positions you gave. The only problem is you're expecting moves that were calculated by a supercomputer to be refuted by your PC at low depth, low hash, and low thread count, without an EGTB. Should there be better scalability between what SF can do on a supercomputer vs your PC? Yes. |
@MaiaChess, what is it that you're trying to convey? Do you believe you have found an actual issue in the source code? Do you believe you have found a systemic issue in the latest neural network? |
what does this even mean? it's too vague to tell and there's no point in comparing engine calculations to human calculations. maybe Kasparov lost a simul or blitz game to an amateur once, but it sounds more like a made up urban myth |
????????????? |
The original position that started this issue seems to no longer be a problem. Latest Stockfish finds Ne4 and evaluates it as 0.00 almost instantly on a single thread. Perhaps this can be closed. |
I am getting -2.08 on a single thread. Multithread seems to flatline at some random eval, sometimes 0. But either way, the issue tracker might not be the best place for positions SF gets wrong. |
I'm closing this in light of the last two comments |
Position: q7/8/2p5/B2p2pp/5pp1/2N3k1/6P1/7K w - - 0 1
The latest version (5/07/2022) misses the draw and evaluates the position as -4.8.
An older version, such as the one from 14/05/2022, sees the draw and evaluates it as 0.00.
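When comparing two builds on a position like this, it helps to pull the evaluation out of the engine's UCI `info` lines rather than reading it off a GUI. The following is a rough sketch of such a parser; the helper name `parse_score` is hypothetical, and the two sample lines are made-up illustrations of the UCI format, not output captured from either build.

```python
import re
from typing import Optional

# Matches the "score cp <centipawns>" or "score mate <moves>" part of a UCI info line.
_SCORE = re.compile(r"\bscore (cp|mate) (-?\d+)")


def parse_score(info_line: str) -> Optional[str]:
    """Return the evaluation in pawns (e.g. '-4.80') or a mate string, else None."""
    match = _SCORE.search(info_line)
    if match is None:
        return None
    kind, value = match.group(1), int(match.group(2))
    if kind == "mate":
        return f"#{value}"
    return f"{value / 100:+.2f}"  # centipawns -> pawns


# Hypothetical lines in UCI format, for illustration only.
print(parse_score("info depth 30 score cp -480 nodes 1000 pv c3e4"))  # -4.80
print(parse_score("info depth 30 score cp 0 nodes 1000 pv c3e4"))     # +0.00
```

Running both builds with identical hash, thread, and search-limit settings before comparing the parsed scores keeps the comparison fair.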