Commit 21026ff

Update on README
Author: yu_wang
1 parent b3e0b2e commit 21026ff

6 files changed, with 239 additions and 118 deletions

README.md

Lines changed: 3 additions & 10 deletions
@@ -36,6 +36,7 @@ Sudoku puzzles are more than just a game—they are a **rich reasoning benchmark
 - Grids can be serialized in multiple ways (e.g., cell-level, row-level, or grid-level).
 - This allows researchers to explore the **optimal input format for structured data** in LLMs.
 - Sudoku4LLM supports 11 different serialization formats, making it a versatile tool for studying structured data representation.
+- Researchers can easily design custom serialization formats or explore new structured data representations.
 
 ### 5. **Resistance to Memorization**
 - With **infinite variability** in puzzle generation, Sudoku puzzles are highly resistant to memorization, ensuring that models are genuinely reasoning rather than recalling.
@@ -182,24 +183,16 @@ Modify `config.py` to adjust default settings, including:
 
 ---
 
-## 🤝 Acknowledgements
-
-We would like to thank the following contributors, projects, and resources that inspired or supported this work:
-
-- [Add acknowledgements here.]
-
----
-
 ## 📜 Citation
 
 If you use **Sudoku4LLM** in your research, please cite us:
 
 ```bibtex
 @misc{Sudoku4LLM,
-  author = {Your Name},
+  author = {Yu Wang},
   title = {Sudoku4LLM: A Dataset Generator for Training and Evaluating Reasoning LLMs},
   year = {2025},
-  url = {https://github.com/your-repo/Sudoku4LLM},
+  url = {https://github.com/DolbyUUU/Sudoku4LLM},
   note = {Version 1.0}
 }
 ```
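
The serialization granularities named in the README hunk above (cell-level, row-level, grid-level) can be illustrated with a minimal sketch. This is not one of the 11 formats shipped with Sudoku4LLM; the function names, the `r{row}c{col}=value` token layout, and the use of `0` for an empty cell are all assumptions made for illustration.

```python
from typing import List

Grid = List[List[int]]  # assumption: 0 marks an empty cell


def serialize_cells(grid: Grid) -> str:
    """Cell-level: one 'r{row}c{col}=value' token per cell (hypothetical layout)."""
    return " ".join(
        f"r{r}c{c}={value}"
        for r, row in enumerate(grid, start=1)
        for c, value in enumerate(row, start=1)
    )


def serialize_rows(grid: Grid) -> str:
    """Row-level: one line per row, values separated by spaces."""
    return "\n".join(" ".join(str(v) for v in row) for row in grid)


def serialize_grid(grid: Grid) -> str:
    """Grid-level: the whole grid as a single flat string of values.

    Unambiguous only while every value is a single digit (grids up to 9x9).
    """
    return "".join(str(v) for row in grid for v in row)


if __name__ == "__main__":
    toy = [[1, 0], [0, 2]]  # toy 2x2 grid, not a valid Sudoku
    print(serialize_cells(toy))
    print(serialize_rows(toy))
    print(serialize_grid(toy))
```

A custom format in this style is just another function from a grid to a string, which is what makes the formats easy to swap and compare.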

format_convertor.py

Lines changed: 18 additions & 5 deletions
@@ -1,3 +1,5 @@
+# Sudoku4LLM/format_convertor.py
+
 import json
 import os
 from config import SudokuConfig # Importing the config.py module for format options
@@ -14,7 +16,8 @@ def load_puzzles(self):
         try:
             with open(self.input_jsonl, "r") as file:
                 for line in file:
-                    puzzles.append(json.loads(line)["puzzle"])
+                    puzzle_data = json.loads(line)
+                    puzzles.append(puzzle_data)
         except FileNotFoundError:
             print(f"Error: Input file '{self.input_jsonl}' not found.")
             exit(1)
@@ -105,7 +108,7 @@ def convert_to_xml(self, puzzle):
         return "\n".join(rows)
 
     def convert(self, format_choice):
-        """Convert puzzles to the selected format."""
+        """Convert puzzles to the selected format, include game_rule and directly use config."""
         puzzles = self.load_puzzles()
         converted_puzzles = []

@@ -132,11 +135,21 @@ def convert(self, format_choice):
         description, format_function = format_methods[format_choice]
 
         # Apply the selected format method to each puzzle
-        for puzzle in puzzles:
+        for puzzle_data in puzzles:
+            puzzle = puzzle_data["puzzle"]
+            config = puzzle_data["config"] # Directly use the "config" from the original data
+
+            # Add game rule based on grid size
+            grid_size = config["grid_size"]
+            game_rule = SudokuConfig.get_configs().get(f"{grid_size}x{grid_size}", {}).get("rules", "Unknown rules").strip()
+
+            # Build the converted puzzle data
             converted_puzzles.append({
-                "original_puzzle": puzzle, # Include the original puzzle for reference
+                "original_puzzle": puzzle, # Include the original puzzle
                 "converted_puzzle": format_function(puzzle), # Converted puzzle
-                "format": description # Metadata: format name
+                "format": description, # Metadata: format name
+                "game_rule": game_rule, # Game rule
+                "config": config # Directly include the original config
             })
 
         # Save all converted puzzles in JSONL format
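
The record shape produced by the changed loop can be sketched in isolation. This is a standalone approximation, not the repository's code: `RULES_BY_SIZE` is a hypothetical stand-in for `SudokuConfig.get_configs()` with invented rule text, and `build_record` and `flat_format` are illustrative names that do not appear in the commit.

```python
import json
from typing import Any, Callable, Dict, List

Puzzle = List[List[int]]

# Hypothetical stand-in for SudokuConfig.get_configs(); the rule text is invented.
RULES_BY_SIZE: Dict[str, Dict[str, str]] = {
    "4x4": {"rules": "\nFill every row, column, and 2x2 box with the digits 1-4.\n"},
}


def build_record(
    line: str,
    description: str,
    format_function: Callable[[Puzzle], str],
) -> Dict[str, Any]:
    """Turn one JSONL line into a converted record, mirroring the loop body in the diff."""
    puzzle_data = json.loads(line)
    puzzle = puzzle_data["puzzle"]
    config = puzzle_data["config"]  # raises KeyError if the input line has no "config"
    grid_size = config["grid_size"]

    # Look up the rule text by "NxN" key, falling back to a placeholder.
    key = f"{grid_size}x{grid_size}"
    game_rule = RULES_BY_SIZE.get(key, {}).get("rules", "Unknown rules").strip()

    return {
        "original_puzzle": puzzle,
        "converted_puzzle": format_function(puzzle),
        "format": description,
        "game_rule": game_rule,
        "config": config,
    }


def flat_format(puzzle: Puzzle) -> str:
    """Illustrative formatter: concatenate all cell values into one string."""
    return "".join(str(v) for row in puzzle for v in row)


if __name__ == "__main__":
    sample = json.dumps({"puzzle": [[1, 0], [0, 2]], "config": {"grid_size": 4}})
    print(json.dumps(build_record(sample, "flat", flat_format)))
```

Because `load_puzzles` now returns whole records rather than bare grids, any input line without a `"config"` key would fail at the lookup marked above; whether older JSONL files need a fallback is a question the diff itself does not answer.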
