BoLoCo is an enhanced toolkit for generating Boolean logic expression datasets with rich metadata, designed for training and evaluating logical reasoning capabilities in AI models. Version 2.0 introduces comprehensive JSON/JSONL data formats, HuggingFace integration, and enhanced metadata tracking.
- 🎯 JSON/JSONL Formats: Rich structured data with comprehensive metadata
- 📊 Enhanced Metadata: Automatic complexity scoring, operator analysis, nesting depth
- 🔄 Multiple Formats: Support for JSON, JSONL, and HuggingFace formats
- 📝 Auto-Generated Dataset Cards: HuggingFace-compatible documentation
- ✅ Input Validation: Comprehensive error checking and dataset validation
- 🎨 Rich CLI Experience: Beautiful output with progress indicators (optional)
- 🤗 HuggingFace Ready: Direct compatibility with the `datasets` library
- 🔀 Single CLI: Clean, focused interface
- 🧠 AI Research: Training logical reasoning models
- 📚 Educational: Teaching Boolean logic concepts
- 🔬 Benchmarking: Evaluating model logical capabilities
- 🏗️ Synthetic Data: Generating structured logical datasets
- 🎮 Game AI: Rule-based system training
```bash
git clone https://github.com/klusai/boloco.git
cd boloco

poetry install                    # Basic installation
poetry install --extras enhanced  # Adds HuggingFace + Rich CLI
poetry install --extras full      # Includes transformers for advanced features
poetry install --with dev         # All development tools included
make dev-setup                    # Complete development environment
```

```bash
pip install boloco              # From PyPI (when published)
pip install "boloco[enhanced]"  # With enhanced features
pip install "boloco[full]"      # With all features
```

```bash
# Generate a dataset with rich metadata
python3 -m boloco.cli generate --max-tokens 5 --output-dir ./data

# Generate with specific error ratio and format
python3 -m boloco.cli generate \
    --max-tokens 7 \
    --error-ratio 0.1 \
    --output-dir ./my_dataset \
    --format jsonl

# Generate with all formats
python3 -m boloco.cli generate --max-tokens 5 --output-dir ./data --format all

# Note: After installation with Poetry, you can also use 'poetry run boloco' or just 'boloco' directly
```

```json
{
  "expression": "( T OR F ) AND NOT F",
  "evaluation": "T",
  "tokens": ["(", "T", "OR", "F", ")", "AND", "NOT", "F"],
  "metadata": {
    "token_count": 8,
    "operator_count": 3,
    "literal_count": 3,
    "nesting_depth": 1,
    "has_negation": true,
    "is_error": false,
    "complexity_score": 15.0
  },
  "reasoning_steps": [],
  "error_type": null,
  "created_at": "2025-01-15T10:30:00Z"
}
```

- Complexity Scoring: Automated difficulty assessment based on multiple factors
- Operator Analysis: Count and distribution of logical operators (AND, OR, NOT)
- Structural Analysis: Nesting depth, parentheses usage, token counting
- Error Classification: Systematic categorization of invalid expressions
- Provenance Tracking: Complete generation history and configuration
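For intuition, most of the structural fields above can be recomputed directly from an example's token list. The sketch below is illustrative only: it mirrors the metadata fields shown in the JSON example but is not BoLoCo's internal implementation, and it omits `complexity_score` since the exact weighting isn't documented here.

```python
OPERATORS = {"AND", "OR", "NOT"}
LITERALS = {"T", "F"}

def analyze_tokens(tokens):
    """Derive simple structural metadata from a token list (illustrative sketch)."""
    depth = max_depth = 0
    for tok in tokens:
        if tok == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif tok == ")":
            depth -= 1
    return {
        "token_count": len(tokens),
        "operator_count": sum(t in OPERATORS for t in tokens),
        "literal_count": sum(t in LITERALS for t in tokens),
        "nesting_depth": max_depth,
        "has_negation": "NOT" in tokens,
    }

# Matches the metadata in the example record above
print(analyze_tokens(["(", "T", "OR", "F", ")", "AND", "NOT", "F"]))
```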
```python
from datasets import load_dataset

# Load from generated files
dataset = load_dataset("json", data_files={
    "train": "data/dataset_train.jsonl",
    "validation": "data/dataset_validation.jsonl",
    "test": "data/dataset_test.jsonl"
})

# Access examples with rich metadata
for example in dataset["train"]:
    print(f"Expression: {example['expression']}")
    print(f"Result: {example['evaluation']}")
    print(f"Complexity: {example['metadata']['complexity_score']}")
    print(f"Has negation: {example['metadata']['has_negation']}")
```

```python
from boloco.enhanced import BoLoCoDataset, BoLoCoExample
from boloco.cli import BoLoCoGenerator

# Configure generation
config = {
    "max_tokens": 7,
    "error_ratio": 0.1,
    "train_ratio": 0.7,
    "validate_ratio": 0.15,
    "test_ratio": 0.15,
    "seed": 42
}

# Generate dataset
generator = BoLoCoGenerator(config)
dataset = generator.generate_dataset()

# Export in multiple formats
dataset.save_json("complete_dataset.json")
dataset.save_jsonl("dataset.jsonl")
dataset.save_legacy_format("./legacy/")
dataset.create_dataset_card("README.md")

# Convert to HuggingFace format (if datasets installed)
hf_dataset = dataset.to_huggingface_dataset()
if hf_dataset:
    hf_dataset.save_to_disk("./hf_dataset")
```

```bash
python3 -m boloco.cli generate \
    --max-tokens 10 \       # Expression complexity (1-50)
    --error-ratio 0.1 \     # Proportion of error examples (0.0-1.0)
    --train-ratio 0.8 \     # Training split ratio
    --validate-ratio 0.1 \  # Validation split ratio
    --test-ratio 0.1 \      # Test split ratio (auto-calculated if not specified)
    --seed 42 \             # Reproducibility seed
    --output-dir ./data \   # Output directory
    --format all \          # json|jsonl|hf|legacy|all
    --name "my-dataset" \   # Dataset name
    --version "1.0.0"       # Dataset version
```

```bash
python -m boloco.boloco \
    --mode generate \       # generate|stats
    --max_tokens 5 \        # Maximum tokens per expression
    --error_ratio 0.05 \    # Error proportion
    --dir data \            # Output directory
    --train_ratio 0.7 \     # Training ratio
    --validate_ratio 0.15 \ # Validation ratio
    --test_ratio 0.15 \     # Test ratio
    --seed 42               # Random seed
```

```
data/
├── dataset.json              # Complete dataset with metadata
├── dataset_train.jsonl       # Training split (JSONL)
├── dataset_validation.jsonl  # Validation split
├── dataset_test.jsonl        # Test split
├── README.md                 # Auto-generated dataset card
└── hf_dataset/               # HuggingFace format (if enabled)
    ├── dataset_info.json
    ├── train/
    ├── validation/
    └── test/
```
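As a sanity check on generated files, the `evaluation` field can be reproduced independently. The following is a minimal reference evaluator sketch, not BoLoCo's own evaluator: it maps the dataset's tokens onto Python's Boolean operators and evaluates the result.

```python
def evaluate_expression(expression: str) -> str:
    """Evaluate a space-separated Boolean expression like '( T OR F ) AND NOT F'.

    Illustrative only: translates tokens to Python syntax and uses eval(),
    which is acceptable for trusted, locally generated data.
    """
    mapping = {"T": "True", "F": "False", "AND": "and", "OR": "or", "NOT": "not"}
    python_expr = " ".join(mapping.get(tok, tok) for tok in expression.split())
    return "T" if eval(python_expr) else "F"

print(evaluate_expression("( T OR F ) AND NOT F"))  # -> T
```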
```python
from boloco.cli import BoLoCoGenerator

# Generate research dataset
config = {
    "max_tokens": 15,
    "error_ratio": 0.2,
    "name": "logical-reasoning-benchmark",
    "version": "1.0.0",
    "description": "Boolean logic benchmark for AI reasoning"
}
generator = BoLoCoGenerator(config)
dataset = generator.generate_dataset()

# Analyze complexity distribution
stats = dataset.metadata["statistics"]
print(f"Average complexity: {stats['train']['avg_complexity']:.2f}")
print(f"Max nesting depth: {stats['train']['max_nesting_depth']}")

# Filter by complexity for progressive training
hf_dataset = dataset.to_huggingface_dataset()
if hf_dataset:
    simple_examples = hf_dataset["train"].filter(
        lambda x: x["metadata"]["complexity_score"] < 10
    )
    complex_examples = hf_dataset["train"].filter(
        lambda x: x["metadata"]["complexity_score"] >= 10
    )
```

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Load a generated split (JSONL files contain one example per line)
dataset = load_dataset("json", data_files={"train": "data/dataset_train.jsonl"})

# Prepare for transformer training
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

def prepare_examples(examples):
    inputs = [f"Evaluate: {expr}" for expr in examples["expression"]]
    targets = examples["evaluation"]
    return tokenizer(inputs, targets, truncation=True, padding=True)

# Tokenize and prepare
tokenized_dataset = dataset.map(prepare_examples, batched=True)

# Filter by complexity for curriculum learning
easy_examples = dataset["train"].filter(
    lambda x: x["metadata"]["complexity_score"] < 8
)
hard_examples = dataset["train"].filter(
    lambda x: x["metadata"]["complexity_score"] >= 8
)
```

```python
from torch.utils.data import DataLoader
from datasets import load_dataset
import torch

dataset = load_dataset("json", data_files={"train": "data/dataset_train.jsonl"})
dataloader = DataLoader(dataset["train"], batch_size=32, shuffle=True)

for batch in dataloader:
    expressions = batch["expression"]
    evaluations = batch["evaluation"]
    complexity_scores = batch["metadata"]["complexity_score"]

    # Use complexity scores for curriculum learning
    easy_mask = complexity_scores < 8
    hard_mask = complexity_scores >= 8

    # Train your model with progressive difficulty
    # model.train_step(expressions[easy_mask], evaluations[easy_mask])
```

The enhanced version automatically computes comprehensive statistics:
- Distribution Analysis: True/False/Error ratios across splits
- Complexity Metrics: Average complexity scores and distributions
- Operator Analysis: AND/OR/NOT usage patterns
- Structural Analysis: Nesting depth and parentheses usage
- Quality Metrics: Error rates and validation scores
Example output:

```
Dataset Statistics:
  Train: 90 examples, avg complexity: 8.45
  Validation: 18 examples, avg complexity: 8.72
  Test: 22 examples, avg complexity: 8.23

Operator Distribution:
  Train: AND=45, OR=38, NOT=23
  Validation: AND=9, OR=8, NOT=5
  Test: AND=11, OR=10, NOT=6
```
Run the comprehensive test suite:
```bash
# Run all tests
make test           # or poetry run pytest tests/

# Run tests with verbose output
make test-verbose   # or poetry run pytest tests/ -vv

# Run with coverage
make test-coverage  # Generate coverage reports

# Quick demo
make demo           # or poetry run boloco generate --max-tokens 3 --format json
```

Current Test Status: 4/5 tests passing ✅
- ✅ BoLoCoExample creation
- ✅ BoLoCoDataset functionality
- ✅ CLI configuration validation
- ⚠️ File operations (minor issue with empty statistics display)
- ⚠️ HuggingFace integration (requires optional dependency)
- Small datasets (max_tokens=5): ~130 expressions in <0.01s
- Medium datasets (max_tokens=10): ~1000+ expressions in <0.1s
- Large datasets (max_tokens=15): ~10000+ expressions in <1s
- Streaming JSONL: Memory-efficient for large datasets
- Lazy Loading: Only load data when needed
- Batch Processing: Efficient handling of multiple files
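Because JSONL stores one self-contained record per line, large splits can be filtered without reading the whole file into memory. A stdlib-only sketch (the path and complexity threshold are illustrative):

```python
import json

def iter_examples(path, max_complexity=None):
    """Stream JSONL records one line at a time, optionally filtering by complexity."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            example = json.loads(line)
            score = example["metadata"]["complexity_score"]
            if max_complexity is None or score < max_complexity:
                yield example

# e.g. an easy-first curriculum pass over the training split:
# for ex in iter_examples("data/dataset_train.jsonl", max_complexity=10):
#     train_step(ex["expression"], ex["evaluation"])
```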
- Input: Legacy TXT format
- Output: JSON, JSONL, HuggingFace, Legacy TXT
- Validation: All formats supported
- Conversion: Bidirectional between all formats
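The legacy TXT layout isn't documented in this README, so the converter below assumes a hypothetical `expression => evaluation` line format purely for illustration; adapt the separator and fields to the real legacy files before use.

```python
import json

def legacy_line_to_record(line: str, sep: str = "=>") -> dict:
    """Convert one legacy TXT line (hypothetical format) to a JSONL-ready dict."""
    expression, evaluation = (part.strip() for part in line.split(sep, 1))
    return {
        "expression": expression,
        "evaluation": evaluation,
        "tokens": expression.split(),
    }

def convert_legacy_file(txt_path: str, jsonl_path: str) -> None:
    """Rewrite a legacy TXT file as JSONL, one record per line."""
    with open(txt_path, encoding="utf-8") as src, \
         open(jsonl_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():
                dst.write(json.dumps(legacy_line_to_record(line)) + "\n")
```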
We welcome contributions! The modern codebase is designed for extensibility:
- Fork the repository
- Create a feature branch
- Add your enhancements
- Test with both legacy and modern formats
- Submit a pull request
```bash
git clone https://github.com/klusai/boloco.git
cd boloco
make dev-setup  # Complete setup with pre-commit hooks

# Run tests
make test       # or poetry run pytest tests/

# Run quality checks
make quality    # Format, lint, and type-check

# Generate sample data for testing
make demo       # Quick demo
make run-cli    # Full CLI demo
```

- `boloco/enhanced.py` - Enhanced data structures and I/O
- `boloco/cli.py` - Enhanced CLI interface
- `tests/` - Comprehensive test suite with pytest
- `pyproject.toml` - Poetry configuration and dependencies
- `Makefile` - Development workflow automation
- Single CLI: One focused, enhanced interface
- Modern Formats: JSON, JSONL, and HuggingFace support
- Rich Metadata: Comprehensive analysis and statistics
- Easy Integration: Direct compatibility with ML workflows
BoLoCo uses Poetry for modern Python dependency management and pytest for testing:
```bash
make help         # Show all available commands
make install      # Install basic dependencies
make install-dev  # Install with development tools
make test         # Run test suite
make lint         # Check code quality
make format       # Format code
make build        # Build distribution packages
make clean        # Clean build artifacts
```

```bash
poetry install           # Install dependencies
poetry add <package>     # Add new dependency
poetry remove <package>  # Remove dependency
poetry update            # Update dependencies
poetry run <command>     # Run command in virtual environment
poetry shell             # Activate virtual environment
poetry build             # Build package
poetry publish           # Publish to PyPI
```

```bash
poetry run black .        # Format code
poetry run isort .        # Sort imports
poetry run flake8 .       # Lint code
poetry run mypy boloco    # Type checking
poetry run pytest tests/  # Run tests
```

- Enhanced API: See `boloco/enhanced.py` for the full API
- CLI Reference: `poetry run boloco --help` for all commands
- Development: `make help` for the development workflow
- Test Examples: `tests/` for usage patterns
- Generated Cards: Auto-created README.md files for datasets
Q: "python: command not found"
A: Use `python3` instead of `python`

Q: "No module named 'datasets'"
A: Install with `pip install datasets` or use `pip install -e ".[enhanced]"`

Q: "Rich output not showing"
A: Install with `pip install rich` or use `pip install -e ".[enhanced]"`
- Check test suite: `make test` or `poetry run pytest tests/`
- Quick demo: `make demo` or `poetry run boloco generate --max-tokens 3 --output-dir ./test`
- Review logs: Enhanced CLI provides detailed error messages
- All commands: `make help` for available development commands
This project is licensed under the MIT License. See the LICENSE file for details.