Dataset

Request the BIWI dataset from the Biwi 3D Audiovisual Corpus of Affective Communication. Download all the compressed files, decompress them, and place the rigid_scans, faces and videos folders in BIWI/.

Description

The corpus covers 14 speakers and 40 English sentences, each sentence lasting 3 to 8 seconds. The face meshes are captured at 25 fps and the audio is recorded at a sample rate of 44.1 kHz. The dataset contains the following subfolders:

'rigid_scans' contains the sequences in which the speakers only rotated their heads while keeping a neutral expression. This folder also holds the personalised template (.obj) and texture (.png) files. Each face scan has 23370 vertices.
'faces' contains the tracked facial geometries. Each subject's folder holds one .tar and one .tgz file per sequence. The .tar archives contain one .vl file per frame, i.e., a binary file with the 3D coordinates of each vertex of the generic face template after the global rotation and translation have been removed.
'videos' contains videos (.flv) of the rendered 3D geometries together with the original audio (sampling rate: 44.1 kHz).
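
The figures above fix a few sizes that are useful for sanity checks later on. The following is a minimal sketch of that arithmetic, using only the numbers stated in this section; the function names are hypothetical and not part of the repository.

```python
# Constants taken from the dataset description above.
FPS = 25              # mesh frames per second
SAMPLE_RATE = 44100   # audio samples per second (44.1 kHz)
NUM_VERTICES = 23370  # vertices per face mesh


def frames_for(duration_s: float) -> int:
    """Number of mesh frames expected for a clip of the given length."""
    return round(duration_s * FPS)


def samples_per_frame() -> int:
    """Audio samples that fall within one mesh frame."""
    return SAMPLE_RATE // FPS


def flat_frame_size() -> int:
    """Length of one frame when its x, y, z coordinates are flattened."""
    return NUM_VERTICES * 3


if __name__ == "__main__":
    print("frames in a 3 s sentence:", frames_for(3))
    print("frames in an 8 s sentence:", frames_for(8))
    print("audio samples per mesh frame:", samples_per_frame())
    print("values per flattened frame:", flat_frame_size())
```

A sentence of 3 to 8 seconds should therefore yield roughly 75 to 200 frames, which is a quick check to run against the first dimension of the converted vertex arrays.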

Data Preprocessing

Part of the preprocessing below relies on Matlab scripts, which makes the procedure somewhat verbose. You may instead write your own Python script that performs the same steps in sequence:

cd BIWI/

Read the .vl files, normalize the coordinates and store them in .mat files (e.g., F1/vert/e01/frame_001.mat). Each .mat file stores the 3D coordinates of one frame. (Optional) The script also outputs .off files (e.g., F1/off/e01/frame_001.off) for debugging and visualization purposes. [Note: please set targetpath to your own path.]:

run("data_preprocess/creatDataset_BIWI.m")

Read the template .obj files, normalize the coordinates and store them in .mat/.off files. [Note: please set targetpath to your own path.]:

run("data_preprocess/process_BIWI_template.m")

Read the template .mat files and store them together in a single pickle file, templates.pkl:

python data_preprocess/load_mat_to_template_pkl.py

Convert the .mat files to .npy files (e.g., F2_e01.npy). Each .npy file has the shape [num_frame, 23370*3] and stores the 3D coordinates of one sentence spoken by one person. The files are saved into vertices_npy/:

python data_preprocess/load_mat_to_multiple_vert_np.py
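
For readers writing their own conversion, the stacking step can be sketched as follows. This is an illustration under stated assumptions, not the repository's script: it takes per-frame arrays that are already loaded (reading the .mat files is left out), and it assumes each frame is flattened vertex by vertex as x, y, z. The helper names stack_frames and output_name are hypothetical.

```python
import numpy as np

NUM_VERTICES = 23370  # vertices per face mesh, as stated above


def stack_frames(frames):
    """Stack per-frame (NUM_VERTICES, 3) arrays into [num_frame, NUM_VERTICES*3].

    `frames` must already be in frame order. The x, y, z interleaving of
    each row is an assumption; check it against the original script.
    """
    rows = []
    for index, frame in enumerate(frames):
        arr = np.asarray(frame, dtype=np.float32)
        if arr.shape != (NUM_VERTICES, 3):
            raise ValueError(f"frame {index} has shape {arr.shape}")
        rows.append(arr.reshape(-1))  # flatten to one row per frame
    if not rows:
        raise ValueError("no frames given")
    return np.stack(rows, axis=0)


def output_name(subject, sentence):
    """Build a file name in the pattern of the example above (F2_e01.npy)."""
    return f"{subject}_{sentence}.npy"


if __name__ == "__main__":
    # Two synthetic frames stand in for data loaded from the .mat files.
    dummy = [np.zeros((NUM_VERTICES, 3)), np.ones((NUM_VERTICES, 3))]
    sequence = stack_frames(dummy)
    print(output_name("F2", "e01"), sequence.shape)
```

Validating the per-frame shape before stacking catches truncated or mismatched frames early, before they end up in a training file.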

Install ffmpeg, extract the audios from .flv video files and save them into wav/:

python data_preprocess/video2audio.py
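
If you prefer to write this step yourself, one way to drive ffmpeg from Python is sketched below. It is not the contents of video2audio.py: the function names, the output naming and the choice to keep the original 44.1 kHz sample rate are assumptions, so adjust them to whatever your training setup expects. It requires ffmpeg to be on your PATH.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=44100):
    """Return the ffmpeg argument list that extracts audio to a WAV file."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-i", str(src),          # input video
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # 16-bit PCM, the usual WAV encoding
        "-ar", str(sample_rate), # output sample rate (assumed, see above)
        str(dst),
    ]


def wav_path_for(video, out_dir="wav"):
    """Map a video path to a .wav path inside out_dir (hypothetical naming)."""
    return Path(out_dir) / (Path(video).stem + ".wav")


def extract_audio(video, out_dir="wav"):
    """Run ffmpeg on one video and return the path of the written WAV file."""
    dst = wav_path_for(video, out_dir)
    dst.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video, dst), check=True)
    return dst


if __name__ == "__main__":
    # Only prints the command; call extract_audio() to actually run ffmpeg.
    print(" ".join(build_ffmpeg_command("videos/example.flv", "wav/example.wav")))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with paths that contain spaces.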

After these steps you have the preprocessed files (templates.pkl, vertices_npy/ and wav/) needed for training. Note that the BIWI.ply file is already provided.
