This repository contains the code and analysis pipelines associated with the paper:
Jingjing Tang, Aaron Rumack, Bryan Wilder, and Roni Rosenfeld
Real-time Forecasting of Data Revisions in Epidemic Surveillance Streams
medRxiv preprint, MEDRXIV/2025/327058
Epidemic surveillance data frequently undergo revisions due to reporting delays and other operational factors. These revisions can degrade situational awareness and the accuracy of forecasts. In this study, we introduce Delphi-RF (GitHub link), a real-time data revision forecasting framework using nonparametric quantile regression. Delphi-RF is applicable to both count- and fraction-type public health surveillance data.
- Models revision dynamics by incorporating all available data up to a given estimation date.
- Produces distributional forecasts for finalized surveillance values.
- Applicable to both count-type (e.g., case counts) and fraction-type (e.g., test positivity rates) signals.
- Supports both daily and weekly data streams.
- Evaluated on:
  - Daily COVID-19 insurance claims, antigen test results, and confirmed cases
  - Weekly dengue and influenza-like illness (ILI) case counts
- Achieves a 10–100× improvement in computational efficiency over existing methods such as NobBS and Epinowcast.
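
To make the idea of a distributional forecast for a finalized value concrete, here is a minimal sketch. It is not the Delphi-RF implementation: it replaces the paper's nonparametric quantile regression with a much simpler stand-in (empirical quantiles of historical log revision ratios at a fixed reporting lag). The function names `empirical_quantile`, `forecast_final_count`, and `pinball_loss`, as well as the toy `history` data, are hypothetical and chosen only for illustration.

```python
import math


def empirical_quantile(sorted_values, q):
    """Linearly interpolated empirical quantile of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def forecast_final_count(provisional, history, quantiles=(0.1, 0.5, 0.9)):
    """Forecast quantiles of the finalized count from a provisional count.

    `history` is a list of (provisional, finalized) pairs observed at the
    same reporting lag; all values are assumed to be strictly positive.
    This is an illustrative stand-in, not the Delphi-RF model.
    """
    log_ratios = sorted(math.log(final / prov) for prov, final in history)
    return {
        q: provisional * math.exp(empirical_quantile(log_ratios, q))
        for q in quantiles
    }


def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss, a standard score for a quantile forecast."""
    diff = y_true - y_pred
    return max(q * diff, (q - 1) * diff)


# Made-up toy data: (provisional count at a fixed lag, finalized count).
history = [(80, 100), (90, 100), (100, 110), (120, 150), (50, 60)]
forecast = forecast_final_count(100, history)
for q, value in forecast.items():
    print(f"q={q}: forecast finalized count ~ {value:.1f}")
```

The same pattern extends to fraction-type signals if the ratio is replaced by a transformation suited to values in [0, 1]; that choice is also an assumption here, not something taken from the paper.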
code/
    generate_experimental_results/   # Scripts for generating revision forecasts using Delphi-RF, NobBS, or Epinowcast
    generate_figs/                   # Scripts for generating figures included in the paper
data/
    raw/                             # Raw and preprocessed datasets used in the study
    results/                         # Forecasts generated by different methods
- Python 3.7+
- R 4.0+
If you find this code or the methods valuable for your research, please cite the paper:
Tang J, Rumack A, Wilder B, Rosenfeld R. Real-time Forecasting of Data Revisions in Epidemic Surveillance Streams. medRxiv preprint, MEDRXIV/2025/327058.