-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal to extend the format #1
Comments
Thanks for your interest, Max! The very simple file reading I put in the code Line 4 in 04625a1
| . You end options with a semicolon, whereas Knuth uses a newline character only.
You're right that these kinds of text files are not the most efficient input format for running the solver algorithm due to the need for the various hash-sets. However, in most cases parsing input is not the time-critical part of the solve (particularly for large problems), so I think most users don't need to worry about optimising the parsing step. For the xcover/src/xcover/dancing_cells.py Line 9 in 04625a1
options , options_ptr , and colors , which represent the options and their colourings in a sparse array format of unsigned integers (like how a CSC sparse matrix is stored) and the two unsigned integers representing the numbers of primary and secondary items. Note that xcover differs from miniexact in using dancing cells rather than dancing links to solve. I think the integer-based DIMACS format you use can straightforwardly be parsed to give the three arrays and two integers. I'm rather busy with other things right now, but when I get chance I'll see if I can have a play with writing a simple parser for your DIMACS format to produce the sparse arrays directly.
For a user interested in getting absolutely the best speed out of |
Thank you for your reply! Yes, the parsing is not the most performance-critical part of the code. I forgot to mention that I also meant to decrease the complexity of the parsers and the formats themselves - less different decisions during implementing the parsing mean different solvers are more aligned and process the inputs more predictably. I think this would help proliferate using Exact Cover to encode problems, as the barrier to entry decreases. Thank you for pointing out dancing cells. I now read a bit through the new algorithm and think about integrating it into miniexact, let's see how hackable it really is. Please don't stress yourself about your parser, I would have made a PR myself if I already had had the time to implement it in Python. I am happy you also think the integer-based DIMACS format could be easily parsed and that it would be a good addition! If more solvers support our common format, we could aim to build an Exact Cover Gallery at some point, showcasing different solvers and algorithms and maybe showing their different strengths, inspired by what the SAT competition is contributing. I am sure there would be interest in such an overview. |
Very interesting project! I'm happy to see more work being done on exact-cover solving.
Some time ago I also created an exact-cover solver supporting Algorithm X, C, C$ (a bit), and M: miniexact. The two solvers have very similar file formats already, although we have slightly different ways to specify primary and secondary items.
I think it would be good to build a more common standard that may be used by many solvers, so that implementations get more comparable. I think your idea of just listing primary items in the first line is very good, as it still does not require doing multiple passes over the whole input - I think I could directly integrate this into Miniexact. I don't know if this format also scales to multiplicities though, as it complicates the input a bit.
Another possibility is the DIMACS-esque input format from my README. This optimizes the parsing step, similar to how SAT solvers work. If one uses Python anyway to generate the problem and then interpret the answer, such a specialized format might make the whole process more efficient, compared to forcing the use of many hash-sets in-between. I would be happy to hear about your opinion!
In general, I think it would be nice to create a collection of problems in a common format so that implementations and new ideas may be compared in a good way! :)
All the best, Max
The text was updated successfully, but these errors were encountered: