AudioCaps

Description

The CSV file has 4 columns:

  • audiocap_id: The unique ID of an audio clip and its corresponding caption.
  • youtube_id: The ID of the YouTube clip that the audio belongs to. You can use this to obtain the VGGish embedding from AudioSet.
  • start_time: The start time of the clip.
  • caption: The audio caption.
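
As a rough illustration of the column layout above, the following sketch reads rows with Python's standard `csv` module. It assumes the file starts with a header row naming the four columns. The `AudioCapsRow` class, the `read_audiocaps` helper, and the sample row are hypothetical and not part of the dataset. All fields are kept as strings because the README does not specify their types.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AudioCapsRow:
    """One CSV row; every field is kept as a string (types are not documented)."""
    audiocap_id: str
    youtube_id: str
    start_time: str
    caption: str


def read_audiocaps(lines: Iterable[str]) -> List[AudioCapsRow]:
    """Parse CSV text, assuming a header row with the four column names."""
    reader = csv.DictReader(lines)
    return [
        AudioCapsRow(
            audiocap_id=row["audiocap_id"],
            youtube_id=row["youtube_id"],
            start_time=row["start_time"],
            caption=row["caption"],
        )
        for row in reader
    ]


# Made-up sample data, used only to show the expected shape of the file.
sample = io.StringIO(
    "audiocap_id,youtube_id,start_time,caption\n"
    '1,abc123,30,"A dog barks, then a car passes"\n'
)
rows = read_audiocaps(sample)
```

To read from disk, pass an open file instead of the in-memory sample, for example `open(path, newline="")`, where `path` points to one of the split files.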

Statistics:

| Split      | Count  |
|------------|--------|
| Train      | 49,838 |
| Validation | 495    |
| Test       | 975    |
| Total      | 51,308 |

Last edit: May 30, 2019