How often do specific tactical themes happen in chess openings?
Of course, that question cannot be answered precisely, because tactical motifs in chess are not clearly defined and are often interwoven.
But we can estimate. To do so, I wrote some CQL scripts that define patterns often seen in the opening phase. Using these scripts, I searched a PGN database of 372,000 GM games. The search was restricted to the first twenty moves of each game, i.e. the opening phase. Most of the games found fit the themes quite well, though there are some false positives, because I wanted to keep the scripts simple.
Here is what I got:
The PGNs containing these games are inside the folder output.
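To turn the result files into frequencies, the games in each output PGN can be counted and divided by the size of the database. The following is only a rough sketch, not part of the repository. It assumes that every game starts with an `[Event ` tag line (as in standard PGN export format) and that the output folder holds one PGN file per script; the function names are made up for illustration.

```python
from pathlib import Path


def count_games(pgn_text: str) -> int:
    """Count games by counting the '[Event ' tag lines that open each game."""
    return sum(1 for line in pgn_text.splitlines() if line.startswith("[Event "))


def theme_shares(output_dir: str, total_games: int) -> dict:
    """Map each PGN file name (without extension) to its share of all games."""
    shares = {}
    for path in sorted(Path(output_dir).glob("*.pgn")):
        # errors="replace" keeps the count going if a file has odd encoding.
        text = path.read_text(encoding="utf-8", errors="replace")
        shares[path.stem] = count_games(text) / total_games
    return shares


if __name__ == "__main__":
    # 372,000 is the database size stated above.
    for theme, share in theme_shares("output", 372_000).items():
        print(f"{theme}: {share:.4%}")
```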
Attached to this repository are:
- The CQL scripts used for the search. These are the files with the extension *.cql.
- The input game database, input.pgn, contained in input games.7z.
- The CQL interpreter used, cql.exe (version 6.0.5), contained in cql interpreter.7z. If you are on Linux or macOS, replace this binary with a download from the official CQL website.
The input database consists of games where
- one player is rated above 2700 Elo, or
- both are above 2500, or
- the game was played before 1900.
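If you want to build a similar database from your own collection, the three criteria can be written as a small predicate over the PGN tags of a game. This is only an illustrative sketch, not the tool I used. It assumes the standard `WhiteElo`, `BlackElo` and `Date` tags, reads "above" as strictly greater, and treats a missing rating as 0; the function names are hypothetical.

```python
def parse_elo(value):
    """Return the rating as an int, or None if the tag is missing or not numeric."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_year(date_tag):
    """Return the year of a PGN Date tag such as '1895.08.05', or None if unknown."""
    year = (date_tag or "")[:4]
    return int(year) if year.isdigit() else None


def is_selected(headers: dict) -> bool:
    """Apply the three selection criteria to the tag pairs of one game."""
    white = parse_elo(headers.get("WhiteElo")) or 0  # assumption: missing rating counts as 0
    black = parse_elo(headers.get("BlackElo")) or 0
    year = parse_year(headers.get("Date"))
    return (
        max(white, black) > 2700                   # one player above 2700
        or min(white, black) > 2500                # both players above 2500
        or (year is not None and year < 1900)      # played before 1900
    )
```

For example, `is_selected({"Date": "1895.08.05"})` is true even without ratings, while a 2600 vs. 2400 game from 2010 is rejected.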
First, extract input games.7z and cql interpreter.7z into this directory.
On Windows, to (re-)run a script on the database, drag the script onto the convenience batch file draghere.bat. The results will be written to the output folder, overwriting the results of previous runs.
Instead of input.pgn, you can of course use your own PGN database. Name it input.pgn or edit draghere.bat to match its name. For example, you could use your games played on Lichess.
It will take some time to run a script on a huge database. To speed up the queries, you can first run a generic script on the database, and then run the refined script on the output PGN of the first run. See the CQL command line reference for the available command line options.
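The two-pass idea could be scripted roughly as follows. This is a sketch under assumptions, not what draghere.bat does: it assumes the interpreter accepts `-i` for the input PGN and `-o` for the output PGN (please check the command line reference for your CQL version), and the script and file names are placeholders.

```python
import subprocess


def two_pass_commands(cql_binary, generic_script, refined_script,
                      database="input.pgn",
                      intermediate="output/generic.pgn",
                      final="output/refined.pgn"):
    """Build the two command lines: a coarse pass, then a refined pass on its output."""
    # Assumption: -i selects the input PGN and -o the output PGN.
    first = [cql_binary, "-i", database, "-o", intermediate, generic_script]
    second = [cql_binary, "-i", intermediate, "-o", final, refined_script]
    return [first, second]


def run_two_pass(*args, **kwargs):
    """Run both passes in order and stop if the interpreter reports an error."""
    for command in two_pass_commands(*args, **kwargs):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # generic.cql and refined.cql are hypothetical script names.
    run_two_pass("cql.exe", "generic.cql", "refined.cql")
```

The second pass only has to scan the games that survived the first one, which is where the time saving comes from.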
- The official CQL Documentation is a complete reference of the CQL language.
- Haydoooke wrote beginner-friendly documentation for CQL.
The scripts in this repository use the following CQL filters and symbols (omitting the obvious mathematical operators and the piece designators). They are documented in the links above.
- Filters without parameters: wtm • btm • mate
- Filters with one parameter: flipcolor • result • movenumber • power • #
- Filters with two parameters: attacks • attackedby • ray
- Filters with three parameters: pin
- Filters with a variable amount of parameters: line
- Piece / square / position matching filters: a • A • _ • .
- Grouping filters: {} • [] • () • (whitespace between filters, which is an implicit AND)
- Filters which are parts of other filters: --> • *
I release the scripts and the database into the public domain (chess games are in the public domain anyway). The copyright owners of CQL are Gady Costeff and Lewis Stiller.
Thanks go to Gady Costeff and Lewis Stiller for creating the CQL language, and to haydoooke for his documentation of CQL.
Written by Nils Lindemann on 2021-04-13. Last update: 2022-11-15.