Remove puzzles that share the same solution #12

Open
Belzedar94 opened this issue May 6, 2022 · 3 comments
Labels
enhancement New feature or request

Comments

@Belzedar94
Contributor

To avoid slight variations that lead to the exact same solution, as in chapters 2-3 of https://lichess.org/study/vxrJlFCV

@ianfab ianfab added the enhancement New feature or request label May 6, 2022
@gbtami
Collaborator

gbtami commented Jul 19, 2023

For example, I found this duplicate puzzle in the autocorr created chennis collection (the only difference is the so-called supplied move, "sm"):

2k4/1p+f4/3+s3/7/3m3/3+F1P1/4KM1[s] w - - 0 7;variant chennis;sm d2d5-;bm +S@d2;eval #2;difficulty 0.000;content 2.338;quality 0.197;volatility 0.000;volatility2 0.017;accuracy 0.000;accuracy2 0.071;std 0.000;ambiguity 0.000;type mate;pv d2d5-,+S@d2,e1d1,c6c2-
2k4/1p+f4/3+s3/7/3m3/5P1/3+FKM1[s] w - - 0 7;variant chennis;sm d1d5-;bm +S@d2;eval #2;difficulty 0.061;content 2.279;quality 0.194;volatility 0.012;volatility2 0.018;accuracy 0.012;accuracy2 0.060;std 0.000;ambiguity 0.000;type mate;pv d1d5-,+S@d2,e1d1,c6c2-
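
To make the overlap concrete, here is a minimal sketch that parses the two records above and compares their solution lines once the supplied move is stripped. The helper names (`parse_puzzle`, `solution`) are hypothetical and not part of the generator, and the records are abbreviated to the fields that matter here.

```python
def parse_puzzle(line):
    """Split one ';'-separated puzzle record into its FEN and 'key value' fields."""
    fen, *fields = line.strip().split(";")
    record = {"fen": fen}
    for field in fields:
        key, _, value = field.partition(" ")
        record[key] = value
    return record


def solution(record):
    """Return the pv as a list of moves, without the leading supplied move (if any)."""
    moves = record["pv"].split(",")
    if record.get("sm") and moves and moves[0] == record["sm"]:
        moves = moves[1:]
    return moves


# The two records from the comment above, with the score fields omitted.
first = parse_puzzle(
    "2k4/1p+f4/3+s3/7/3m3/3+F1P1/4KM1[s] w - - 0 7;variant chennis;"
    "sm d2d5-;bm +S@d2;eval #2;type mate;pv d2d5-,+S@d2,e1d1,c6c2-"
)
second = parse_puzzle(
    "2k4/1p+f4/3+s3/7/3m3/5P1/3+FKM1[s] w - - 0 7;variant chennis;"
    "sm d1d5-;bm +S@d2;eval #2;type mate;pv d1d5-,+S@d2,e1d1,c6c2-"
)

# Different starting FEN and supplied move, same solution afterwards.
same_solution = solution(first) == solution(second)
```

Here `same_solution` is true even though the starting FENs and supplied moves differ, which is exactly the kind of near-duplicate being reported.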

@ianfab
Owner

ianfab commented Jul 21, 2023

I would like to distinguish a few criteria here:

  1. identical starting position
  2. identical solution line
  3. identical final position

I think more sophisticated duplicates are unrealistic to cover, and they also occur in the lichess puzzles, e.g., the exact same mating pattern on different squares or with different material configurations.

Puzzles with identical starting positions should not occur with default generator settings, since duplicate FENs are already filtered out there. When the supplied move is included, the duplicate filter could potentially be improved by filtering on the resulting FEN instead of the FEN + move pair, which is what it currently checks:

    if (fen, bestmove) not in fens:
        fens.add((fen, bestmove))

It might be a small hit on performance because you need to compute another FEN, but code-wise it should be easy to change.
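
A minimal sketch of that change, assuming a helper that returns the FEN reached after a move is played. The `resulting_fen` callable and the `filter_duplicates` function are hypothetical names for illustration; in the real generator the callable would presumably wrap whatever engine binding it already uses to produce FENs. The toy lookup table below only stands in for that call, and its entries assume both supplied moves reach the same position.

```python
def filter_duplicates(candidates, resulting_fen):
    """Keep the first (fen, move) pair for each position reached after the move.

    candidates: iterable of (fen, move) pairs.
    resulting_fen: callable (fen, move) -> FEN after the move (hypothetical hook).
    """
    seen = set()
    unique = []
    for fen, move in candidates:
        key = resulting_fen(fen, move)  # the extra FEN computation mentioned above
        if key not in seen:
            seen.add(key)
            unique.append((fen, move))
    return unique


FEN_A = "2k4/1p+f4/3+s3/7/3m3/3+F1P1/4KM1[s] w - - 0 7"
FEN_B = "2k4/1p+f4/3+s3/7/3m3/5P1/3+FKM1[s] w - - 0 7"

# Toy stand-in for a real move generator: placeholder labels, not real FENs.
toy_table = {
    (FEN_A, "d2d5-"): "position-after-capture",
    (FEN_B, "d1d5-"): "position-after-capture",
}

unique = filter_duplicates(
    [(FEN_A, "d2d5-"), (FEN_B, "d1d5-")],
    lambda fen, move: toy_table[(fen, move)],
)
```

With the pair-based key both candidates would be kept, since the FENs and moves differ; keyed on the resulting position, only the first one survives.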

Identical solution lines from different starting positions are rather difficult to filter, since it is very hard to tell whether the pattern is really the same. It also might not even be a bad idea to have the same pattern in different contexts, to generalize the pattern recognition. The only place where I have used a very specific filter of this kind so far is Manchu, since otherwise probably 90% of the checkmate puzzles would just be the banner landing on c7/g7, which is super dull. For other variants I haven't seen anything like this, though.

With regard to identical final positions, the main pattern occurring in practice is that a forced mate in n happened in the game. The generator will then report the mate in n, mate in n-1, ..., and mate in 1 all as separate puzzles. This looks very repetitive on a small scale when you look at the ordered list of resulting puzzles, and that is what some people have heavily criticized, but once puzzles are unordered/randomized I think their similarity is hardly a problem. Actually, having the same puzzle at different levels of difficulty can IMO be very useful. So I don't see a strong need for such filtering.
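
Should such chains ever need to be spotted, one rough way would be to check whether one puzzle's solution line is the tail of another's. The sketch below uses hypothetical helper names and placeholder move labels; matching on moves alone can produce false positives across different games, so a real filter would likely also compare the final position.

```python
def is_tail_of(shorter_pv, longer_pv):
    """True if shorter_pv is a proper suffix of longer_pv (same line, later start)."""
    n = len(shorter_pv)
    return 0 < n < len(longer_pv) and longer_pv[-n:] == shorter_pv


def drop_tails(pvs):
    """Keep only the lines that are not the tail of another line in the list."""
    return [pv for pv in pvs if not any(is_tail_of(pv, other) for other in pvs)]


# Placeholder labels: m = attacker's move, r = defender's reply.
mate_in_3 = ["m1", "r1", "m2", "r2", "m3"]
mate_in_2 = ["m2", "r2", "m3"]
mate_in_1 = ["m3"]

longest_only = drop_tails([mate_in_3, mate_in_2, mate_in_1])
```

In this toy input only the mate in 3 remains, since the other two lines are tails of it.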

So all in all, the only minor improvement I currently see is to fix the duplicate filtering for the scenario of using the supplied move notation. Other than that, apart from very specific problems like the Manchu one, I do not consider duplicates a big problem so far.

@gbtami
Collaborator

gbtami commented Jul 21, 2023

To tell the truth, I posted the above example just as a curiosity. I completely agree with you. There is no real need to fix the supplied move case. I can delete one of them if it occurs again.
