Skip to content

The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization

Notifications You must be signed in to change notification settings

Audio-WestlakeU/FN-SSL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

82 Commits
 
 
 
 
 
 

Repository files navigation

Full-band and Narrow-band fusion Network for SSL

Introduction

This repository provides methods which based on full-band and narrow-band fusion network for sound source localization. The narrow-band module processes the along-time sequences to focus on learning these narrow-band spatial information. The full-band module processes the along-frequency sequence to focus on learning the full-band correlation of spatial cues, such as the linear relation of DP-IPD to frequency.

Methods

Two official implemented sound source localization methods are included:

Datasets

Quick start (will be update soon)

  • Preparation

    • Download the required dataset and organize the data according to the data_org in the data folder.
    • Generate multi-channel data, You can set data_num (in Simu.py) to control the size of the dataset. --train, -- test, --dev are used to control the generation of train dataset, test dataset, and validation dataset, respectively. The source data path of them are specified by dirs ['sousig_train '] in Opt.py.
    python Simu.py --train/--test/--dev
    
  • Training

    • We have implemented both FN-SSL and IPDnet using the Pytorch-lightning framework.
    • For Train,
    python main.py fit --data.batch_size=[*,*] --trainer.devices=*,*
    
    • For test,
    python main.py test  --ckpt_path logs/MyModel/version_x/checkpoints/**.ckpt --trainer.devices=*,*
    
  • Pretrained models

    • Using the FN_lightning model to load the lightning checkpoint in torch framework.
Framework Task Checkpoint
Lightning DP-IPD regression (FN-SSL) https://pan.baidu.com/s/1zRKpiqbSuo80Xu5ZRoS1gQ?pwd=6w51
Lightning DOA classification (FN-SSL) https://pan.baidu.com/s/1U1Wl5ZBZBItc2Vku7AyqNA?pwd=ceqm

more checkpoints will be update soon.

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{wang2023fnssl,
    author = "Yabo Wang and Bing Yang and Xiaofei Li",
    title = "FN-SSL: Full-Band and Narrow-Band Fusion for Sound Source Localization",
    booktitle = "Proceedings of INTERSPEECH",
    year = "2023",
    pages = ""}

Reference code

Licence

MIT