kitamoto-lab/digital-typhoon

Digital Typhoon Dataset

This dataset was created by the Digital Typhoon project.

Overview

This page summarizes the Digital Typhoon Dataset, the longest-running typhoon satellite image dataset, spanning more than 40 years and aimed at benchmarking machine learning models on long-term spatio-temporal data. To build the dataset, we developed a workflow that creates typhoon-centered images by cropping the original satellite image with a Lambert azimuthal equal-area projection centered at the location given by the best track data. We also address data quality issues such as inter-satellite calibration to create a homogeneous long-term dataset. To take advantage of the dataset, we organize machine learning tasks by the type and target of inference, and we propose further tasks for meteorological analysis, societal impact, and climate change.
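The projection step in the workflow above can be sketched as follows. This is a minimal illustration of the standard spherical Lambert azimuthal equal-area forward formula, not the project's actual processing code. The function name `laea_forward`, the spherical Earth radius, and the example coordinates are assumptions made for this sketch.

```python
import math

# Assumption: a spherical Earth with the mean radius in kilometers.
EARTH_RADIUS_KM = 6371.0


def laea_forward(lat_deg, lon_deg, center_lat_deg, center_lon_deg,
                 radius_km=EARTH_RADIUS_KM):
    """Project (lat, lon) to planar (x, y) in km with a Lambert azimuthal
    equal-area projection centered at (center_lat, center_lon).

    Hypothetical helper for illustration; the center would be the
    typhoon position taken from the best track data.
    """
    phi = math.radians(lat_deg)
    phi0 = math.radians(center_lat_deg)
    dlam = math.radians(lon_deg - center_lon_deg)

    denom = (1.0
             + math.sin(phi0) * math.sin(phi)
             + math.cos(phi0) * math.cos(phi) * math.cos(dlam))
    if denom <= 1e-12:
        # The point antipodal to the center has no finite image.
        raise ValueError("point is antipodal to the projection center")

    k = math.sqrt(2.0 / denom)
    x = radius_km * k * math.cos(phi) * math.sin(dlam)
    y = radius_km * k * (math.cos(phi0) * math.sin(phi)
                         - math.sin(phi0) * math.cos(phi) * math.cos(dlam))
    return x, y


# Example with made-up coordinates: a typhoon center at 20N, 130E.
center = (20.0, 130.0)
x_km, y_km = laea_forward(21.0, 131.0, *center)
```

Because the projection is centered on the typhoon, the center always maps to the origin, and equal areas on the ground map to equal areas in the image regardless of latitude. That property is what makes images from different positions comparable.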

Paper

The following paper introduces Dataset V2.

Asanobu Kitamoto, Erwan Dzik, Gaspar Faure, "Machine Learning for the Digital Typhoon Dataset: Extensions to Multiple Basins and New Developments in Representations and Tasks", arXiv, doi:10.48550/arXiv.2411.16421, 2024.

The following paper introduces Dataset V1.

Asanobu Kitamoto, Jared Hwang, Bastien Vuillod, Lucas Gautier, Yingtao Tian, Tarin Clanuwat, "Digital Typhoon: Long-term Satellite Image Dataset for the Spatio-Temporal Modeling of Tropical Cyclones", NeurIPS 2023 Datasets and Benchmarks (Spotlight), 2023.

@InProceedings{ neurips23,
      author = {Asanobu Kitamoto and Jared Hwang and Bastien Vuillod and Lucas Gautier and Yingtao Tian and Tarin Clanuwat},
      title = {Digital Typhoon: Long-term Satellite Image Dataset for the Spatio-Temporal Modeling of Tropical Cyclones},
      booktitle = {{NeurIPS} 2023 Datasets and Benchmarks (Spotlight)},
      year = 2023,
      month = 12,
}

The paper is also available on arXiv: Asanobu Kitamoto, Jared Hwang, Bastien Vuillod, Lucas Gautier, Yingtao Tian, Tarin Clanuwat, "Digital Typhoon: Long-term Satellite Image Dataset for the Spatio-Temporal Modeling of Tropical Cyclones", arXiv:2311.02665, 2023.

Dataset

The Digital Typhoon Dataset is a satellite image dataset designed for machine learning research on tropical cyclones. Dataset V2, released on November 26, 2024, comprises the WP dataset from the northern hemisphere and the AU dataset from the southern hemisphere.

| Basin | Western Pacific (WP) | Around Australia (AU) |
|---|---|---|
| Season | 1978-2023 (1978, 1979, and 1980 are not complete) | 1979-2024 (some years are not complete) |
| Tropical cyclones | 1,116 | 480 |
| Images | 192,956 | 70,087 |
| Dataset size | 56 GB | 21 GB |
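For a rough sense of scale, the figures in the table can be turned into per-cyclone and per-image averages. The sketch below only does arithmetic on the numbers listed above. It assumes 1 GB = 10^9 bytes and that the stated sizes consist mostly of image data, neither of which is stated here, so the per-image size is an approximation.

```python
# Figures copied from the table above.
STATS = {
    "WP": {"cyclones": 1116, "images": 192_956, "size_gb": 56},
    "AU": {"cyclones": 480, "images": 70_087, "size_gb": 21},
}


def summarize(entry):
    """Return (average images per cyclone, approximate kB per image)."""
    images_per_cyclone = entry["images"] / entry["cyclones"]
    # Assumption: 1 GB = 1e9 bytes and 1 kB = 1e3 bytes.
    kb_per_image = entry["size_gb"] * 1e6 / entry["images"]
    return images_per_cyclone, kb_per_image


for basin, entry in STATS.items():
    per_cyclone, kb = summarize(entry)
    print(f"{basin}: {per_cyclone:.1f} images per cyclone, ~{kb:.0f} kB per image")

total_cyclones = sum(e["cyclones"] for e in STATS.values())
total_images = sum(e["images"] for e in STATS.values())
print(f"Total: {total_cyclones} cyclones, {total_images} images")
```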

The dataset is provided under a CC BY 4.0 international license, with attribution as follows.

Digital Typhoon Dataset V2 (National Institute of Informatics), doi:10.20783/DIAS.664

Software

pyphoon2 is a machine learning library for the Digital Typhoon Dataset. Its documentation is available on Read the Docs.

Model

Model weights for machine learning tasks, along with code for using them, are available on Hugging Face.

Data Repository

The Digital Typhoon Dataset is also available from DIAS (Data Integration and Analysis System), a Japanese data repository for earth science and environmental datasets. DIAS assigns the dataset the DOI (Digital Object Identifier) 10.20783/DIAS.664 as a persistent identifier.

Website

Digital Typhoon is one of Japan's largest and most popular typhoon information websites. It receives around 20 million page views per year from people checking the latest information or studying historical data.
