
harshpalan/soundata

 
 


soundata

Python library for downloading, loading & working with sound datasets. Find the API documentation here.
Inspired by and based on mirdata. (https://github.com/soundata/soundata)


This library provides tools for working with common sound datasets, including tools for:

  • Downloading datasets to a common location and format
  • Validating that the files for a dataset are all present
  • Loading annotation files to a common format
  • Parsing clip-level metadata for detailed evaluations
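The validation step listed above can be summarized programmatically. The sketch below is illustrative only: `summarize_validation` is a hypothetical helper, not part of soundata, and it assumes that `dataset.validate()` returns two dictionaries (missing files and invalid checksums), each mapping a category such as clips or metadata to the affected entries. Check the API documentation for the exact return format before relying on it.

```python
def summarize_validation(dataset):
    """Count problems reported by a dataset's validate() method.

    Hypothetical helper. Assumes validate() returns a pair of dicts,
    (missing_files, invalid_checksums), whose values are collections
    of affected entries grouped by category.
    """
    missing_files, invalid_checksums = dataset.validate(verbose=False)
    return {
        "missing": sum(len(entries) for entries in missing_files.values()),
        "invalid": sum(len(entries) for entries in invalid_checksums.values()),
    }
```

With soundata installed, a call such as `summarize_validation(soundata.initialize('urbansound8k'))` would report both counts as zero for a complete, uncorrupted download, provided the assumption about the return format holds.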

Here's soundata's list of currently supported datasets.

Installation

To install, simply run:

pip install soundata

Quick example

import soundata

dataset = soundata.initialize('urbansound8k')
dataset.download()  # download the dataset
dataset.validate()  # validate that all the expected files are there

example_clip = dataset.choice_clip()  # choose a random example clip
print(example_clip)  # see the available data

See the documentation for more examples and the API reference.
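Beyond picking a single random clip, it is often useful to walk over every clip in a dataset. The following is a minimal sketch, assuming that `dataset.load_clips()` returns a dictionary mapping clip IDs to clip objects and that each clip may expose a `tags` attribute with a `labels` list (as the urbansound8k loader is expected to). `count_labels` is a hypothetical helper written for this example, not a soundata function.

```python
from collections import Counter


def count_labels(dataset):
    """Tally how often each tag label occurs across all clips.

    Hypothetical helper. Assumes load_clips() returns {clip_id: clip}
    and that clip.tags, when present, has a `labels` list.
    """
    counts = Counter()
    for clip in dataset.load_clips().values():
        tags = getattr(clip, "tags", None)
        if tags is not None:  # skip clips without tag annotations
            counts.update(tags.labels)
    return counts
```

Usage would look like `count_labels(soundata.initialize('urbansound8k'))` once the dataset has been downloaded and validated. Loaders for other datasets may expose different annotation attributes, so the attribute name should be adapted accordingly.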

Citing

@misc{fuentes_salamon2021soundata,
      title={Soundata: A Python library for reproducible use of audio datasets}, 
      author={Magdalena Fuentes and Justin Salamon and Pablo Zinemanas and Martín Rocamora and 
      Genís Plaja and Irán R. Román and Marius Miron and Xavier Serra and Juan Pablo Bello},
      year={2021},
      eprint={2109.12690},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}

When working with datasets, please cite the version of soundata that you are using and also cite the dataset itself. Each dataset loader provides the dataset's reference through its cite() method.
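One way to gather both pieces of information is sketched below. The helper names `installed_version` and `citation_report` are hypothetical and exist only for this example; the version lookup uses the standard library's `importlib.metadata` rather than any soundata-specific attribute, and the sketch assumes that `cite()` prints the dataset reference.

```python
from importlib import metadata


def installed_version(package="soundata"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def citation_report(dataset):
    """Print the soundata version in use, then the dataset's own reference.

    Hypothetical helper; relies on the loader's cite() method.
    """
    print(f"soundata version: {installed_version()}")
    dataset.cite()
```

For instance, `citation_report(soundata.initialize('urbansound8k'))` would print the installed library version followed by the reference for that dataset.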

Contributing a new dataset loader

We welcome and encourage contributions to this library, especially new datasets. Please see contributing for guidelines.
