
Aleatory vs Epistemic Uncertainty in Prediction Markets

A statistical analysis framework for testing whether prediction market probabilities represent aleatory (objective, random) uncertainty or epistemic (subjective, knowledge-based) uncertainty.

Overview

This project implements rigorous statistical tests to determine whether prediction markets exhibit aleatory uncertainty - the property that outcomes are generated according to the predicted probability distributions. We analyze data from four major prediction market platforms using Monte Carlo simulations with directional statistical tests.

Key Findings

πŸ” None of the major prediction markets exhibit true aleatory uncertainty

  • Manifold Markets: Systematically overconfident (predictions too extreme)
  • Kalshi, Metaculus, Polymarket: Systematically underconfident (predictions too conservative)

See FINAL_REPORT.md for detailed analysis and implications.

Installation

Prerequisites

  • Python 3.8+
  • Virtual environment (recommended)

Setup

# Clone repository
git clone https://github.com/arqwer/aleatory_or_epistemic.git
cd aleatory_or_epistemic

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install numpy requests

Usage

Quick Start: Analyze All Platforms

python analyze_all_platforms.py

This will:

  1. Fetch true midpoint probabilities from the Themis database
  2. Run 50,000 Monte Carlo simulations per test statistic
  3. Generate comprehensive reports for each platform
  4. Display cross-platform comparison

Analyze Specific Platform

from prediction_markets_analysis import report_is_aleatory
import numpy as np

# Your prediction data
predictions = np.array([0.3, 0.7, 0.2, 0.8, 0.5])
outcomes = np.array([0, 1, 0, 1, 0])

# Run analysis
report_is_aleatory(predictions, outcomes, n_simulations=50000, platform_name="My Platform")

Validate Statistical Tests

Test the framework using synthetic datasets:

python validate_statistical_tests.py

This creates four synthetic datasets (aleatory, underconfident, overconfident, mildly underconfident) and verifies that the statistical tests correctly identify each pattern.

Project Structure

aleatory_or_epistemic/
├── README.md                           # This file
├── FINAL_REPORT.md                     # Comprehensive analysis report
├── is_aleatory.ipynb                   # Original methodology notebook
├── prediction_markets_analysis.py      # Core analysis framework
├── analyze_all_platforms.py            # Multi-platform analysis script
├── validate_statistical_tests.py       # Statistical test validation
├── themis_*_cache.json                 # Cached market data
└── all_platforms_results.txt           # Latest analysis output

Core Concepts

Aleatory vs Epistemic Uncertainty

  • Aleatory Uncertainty: Objective randomness (e.g., coin flips, quantum events)
  • Epistemic Uncertainty: Subjective lack of knowledge (e.g., forecasting elections)

This project tests whether prediction market probabilities behave like aleatory probabilities.

Statistical Tests

Overconfidence Tests

Detect when predictions are too extreme (too far from 0.5):

  • Expected Calibration Error (ECE)
  • Cross-entropy
  • Brier Score
  • MSE from Calibration Diagonal
  • Base Rate Deviation

Underconfidence Tests

Detect when predictions are too conservative (too close to 0.5):

  • Negative Cross-entropy
  • Negative Brier Score
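
The project's implementations live in prediction_markets_analysis.py. As a rough, self-contained sketch of three of the metrics above (the function names here are illustrative, not the project's API):

import numpy as np

def brier_score(predicted_probs: np.ndarray, outcomes: np.ndarray) -> float:
    """Mean squared difference between predictions (shape (n,)) and binary outcomes (shape (n,))."""
    return float(np.mean((predicted_probs - outcomes) ** 2))

def cross_entropy(predicted_probs: np.ndarray, outcomes: np.ndarray) -> float:
    """Mean negative log-likelihood of the observed binary outcomes."""
    clipped_probs = np.clip(predicted_probs, 1e-12, 1 - 1e-12)  # avoid log(0)
    return float(-np.mean(outcomes * np.log(clipped_probs)
                          + (1 - outcomes) * np.log(1 - clipped_probs)))

def expected_calibration_error(predicted_probs: np.ndarray, outcomes: np.ndarray,
                               n_bins: int = 10) -> float:
    """Weighted average gap between mean prediction and observed frequency, per probability bin."""
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_indices = np.clip(np.digitize(predicted_probs, bin_edges) - 1, 0, n_bins - 1)
    calibration_error = 0.0
    for bin_index in range(n_bins):
        in_bin = bin_indices == bin_index
        if in_bin.any():
            gap = abs(predicted_probs[in_bin].mean() - outcomes[in_bin].mean())
            calibration_error += in_bin.mean() * gap
    return float(calibration_error)

The underconfidence statistics listed above would simply negate the corresponding scores, flipping the direction of the one-sided test.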

Methodology

  1. Monte Carlo Simulation: Generate synthetic outcomes from predicted probabilities
  2. Test Statistic Calculation: Compute calibration metrics on real and simulated data
  3. P-value Estimation: Proportion of simulations where synthetic data shows worse calibration than real data
  4. Bonferroni Correction: Adjust for multiple hypothesis testing
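
As an illustration of steps 1-3 for a single test statistic (a sketch, not the project's exact implementation; the Brier score stands in for any of the metrics above):

import numpy as np

def monte_carlo_p_value(predicted_probs: np.ndarray, outcomes: np.ndarray,
                        n_simulations: int = 50_000, seed: int = 0) -> float:
    """One-sided Monte Carlo p-value under the aleatory null hypothesis."""
    rng = np.random.default_rng(seed)
    observed_brier = np.mean((predicted_probs - outcomes) ** 2)
    worse_or_equal_count = 0
    for _ in range(n_simulations):
        # Under the null, each market resolves YES with exactly its predicted probability.
        simulated_outcomes = rng.random(predicted_probs.size) < predicted_probs
        simulated_brier = np.mean((predicted_probs - simulated_outcomes) ** 2)
        worse_or_equal_count += simulated_brier >= observed_brier
    # Fraction of simulations at least as miscalibrated as the real data;
    # a low value indicates the real predictions are overconfident.
    return float(worse_or_equal_count) / n_simulations

Step 4 would then compare each such p-value against a Bonferroni-adjusted threshold, for example 0.05 divided by the number of test statistics.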

Data Source

All prediction market data comes from Themis (https://brier.fyi), a comprehensive archive of prediction market history maintained by @wasabipesto.

We specifically use true temporal midpoint probabilities - the market probability at the exact temporal midpoint of each market's duration - to ensure we're testing genuine forecasts rather than near-resolution prices.
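
As a rough sketch of that selection step (the open_time, close_time, and (timestamp, probability) history used here are assumptions for illustration, not the actual Themis schema):

from typing import List, Tuple

def midpoint_probability(open_time: float, close_time: float,
                         history: List[Tuple[float, float]]) -> float:
    """Return the recorded probability closest in time to the market's temporal midpoint."""
    midpoint_time = open_time + (close_time - open_time) / 2
    # history is a list of (timestamp, probability) pairs; pick the quote nearest the midpoint.
    _closest_timestamp, closest_probability = min(
        history, key=lambda point: abs(point[0] - midpoint_time)
    )
    return closest_probability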

Results Summary

Platform     Markets   Pattern          Cross-entropy p   Brier p       Conclusion
Manifold     10,837    Overconfident    0.00000 ***       0.00000 ***   Too extreme
Kalshi       10,000    Underconfident   0.99998           0.99996       Too conservative
Metaculus     4,723    Underconfident   1.00000           1.00000       Too conservative
Polymarket   10,000    Underconfident   0.99998           1.00000       Too conservative

p-values for overconfidence tests (low p-value = overconfident)

Code Quality Standards

This project follows strict coding standards:

  • ✅ All functions have type hints
  • ✅ All array shapes documented in docstrings
  • ✅ Comprehensive docstrings with side-effects noted
  • ✅ Input validation with descriptive assertions
  • ✅ Semantic variable names (no single-letter variables)
  • ✅ Fail-fast error handling
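
For illustration only, a hypothetical helper written to these standards might look like this (it is not a function from the codebase):

import numpy as np

def validate_prediction_arrays(predicted_probs: np.ndarray, outcomes: np.ndarray) -> None:
    """Validate paired prediction data before analysis.

    Args:
        predicted_probs: Shape (n_markets,), probabilities in [0, 1].
        outcomes: Shape (n_markets,), binary resolutions (0 or 1).

    Side effects:
        None. Raises AssertionError immediately on invalid input (fail-fast).
    """
    assert predicted_probs.ndim == 1, "predicted_probs must be one-dimensional"
    assert predicted_probs.shape == outcomes.shape, "predictions and outcomes must be paired"
    assert np.all((predicted_probs >= 0.0) & (predicted_probs <= 1.0)), "probabilities must lie in [0, 1]"
    assert np.all(np.isin(outcomes, [0, 1])), "outcomes must be binary"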

Examples

Example 1: Perfectly Aleatory Data

# Generate truly aleatory data
true_probs = np.random.uniform(0, 1, 1000)
outcomes = (np.random.random(1000) < true_probs).astype(int)

report_is_aleatory(true_probs, outcomes, 10000, "Aleatory Test")
# Expected: All p-values > 0.05 (accept aleatory hypothesis)

Example 2: Overconfident Predictions

# Shift probabilities away from 0.5
true_probs = np.random.uniform(0, 1, 1000)
overconfident_preds = 0.5 + 1.5 * (true_probs - 0.5)  # Amplify distance from 0.5
overconfident_preds = np.clip(overconfident_preds, 0, 1)
outcomes = (np.random.random(1000) < true_probs).astype(int)

report_is_aleatory(overconfident_preds, outcomes, 10000, "Overconfident Test")
# Expected: Low p-values on overconfidence tests

Example 3: Underconfident Predictions

# Shift probabilities toward 0.5
true_probs = np.random.uniform(0, 1, 1000)
underconfident_preds = 0.5 + 0.5 * (true_probs - 0.5)  # Reduce distance from 0.5
outcomes = (np.random.random(1000) < true_probs).astype(int)

report_is_aleatory(underconfident_preds, outcomes, 10000, "Underconfident Test")
# Expected: Low p-values on underconfidence tests, high on overconfidence tests

Contributing

This is a research project. Contributions welcome for:

  • Additional statistical tests
  • New data sources
  • Performance optimizations
  • Documentation improvements

Citation

If you use this code or methodology in your research, please cite:

@software{aleatory_prediction_markets_2025,
  author = {arqwer},
  title = {Aleatory vs Epistemic Uncertainty in Prediction Markets},
  year = {2025},
  url = {https://github.com/arqwer/aleatory_or_epistemic}
}

Related Work

  • Themis Database: https://github.com/wasabipesto/themis - Comprehensive prediction market archive
  • Calibration Research: Brier Score, Expected Calibration Error, proper scoring rules
  • Prediction Market Theory: Information aggregation, market microstructure

License

MIT License - See LICENSE file for details

Contact

For questions or collaboration opportunities, open an issue on GitHub.

Acknowledgments

  • @wasabipesto for creating and maintaining the Themis database
  • Prediction market platforms: Manifold Markets, Kalshi, Metaculus, Polymarket
  • The forecasting and prediction market communities

Last Updated: October 10, 2025
Status: ✅ Analysis Complete
