    Repositories list

    • Transkun

      Public
      A simple yet effective audio-to-MIDI automatic piano transcription system
      Python
      MIT License
      11000 Updated Sep 28, 2024
    • HARP

      Public
      A sample editing application allowing for hosted, asynchronous, remote processing of audio with machine learning by routing through Gradio endpoints.
      HTML
      BSD 3-Clause "New" or "Revised" License
      3000 Updated Sep 17, 2024
    • Y-vector

      Public
      Y-vector: Multiscale Waveform Encoder for Speaker Embedding
      Python
      MIT License
      9000 Updated Sep 15, 2024
    • Codebase for "Transcription free filler word detection with Neural semi-CRFs" [ICASSP2023]
      Python
      MIT License
      2000 Updated Sep 15, 2024
    • Implementation of the paper "One-class Learning towards Generalized Voice Spoofing Detection"
      Jupyter Notebook
      MIT License
      32000 Updated Sep 14, 2024
    • pyharp

      Public
      Companion repository that facilitates creating Gradio endpoints accessible from within Digital Audio Workstations (DAWs) through HARP.
      Python
      BSD 3-Clause "New" or "Revised" License
      3000 Updated Sep 6, 2024
    • SynthTab

      Public
      Official Repository for ICASSP 2024 Paper "SynthTab: Leveraging Synthesized Data for Guitar Tablature Transcription"
      Python
      Other
      3000 Updated Aug 22, 2024
    • MSOC

      Public
      Python
      1000 Updated Jul 29, 2024
    • BeatNet

      Public
      BeatNet is a state-of-the-art real-time and offline system for joint music beat, downbeat, tempo, and meter tracking using a CRNN and particle filtering (implementation of the ISMIR 2021 paper).
      Python
      Creative Commons Attribution 4.0 International
      55000 Updated May 29, 2024
    • Cacophony

      Public
      Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986
      Python
      MIT License
      4000 Updated Apr 26, 2024
    • lhvqt

      Public
      Frontend filterbank learning module with HVQT initialization capabilities.
      Python
      MIT License
      3000 Updated Feb 27, 2024
    • Invited talk at group meeting of AIR lab
      0000 Updated Dec 5, 2023
    • Official Implementation of our WASPAA 2023 paper "Mitigating Cross-Database Differences for Learning Unified HRTF Representation"
      Python
      BSD 3-Clause "New" or "Revised" License
      3000 Updated Dec 3, 2023
    • Official implementation of the ICASSP 2023 paper "HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields"
      Python
      MIT License
      2000 Updated Dec 3, 2023
    • This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm"
      Python
      Other
      17000 Updated Nov 29, 2023
    • This repository contains the implementation of an efficient joint beat, downbeat, tempo, and meter tracking system using a compact 1D probabilistic state space and a jump-back reward technique. ICASSP 2022.
      Python
      MIT License
      13000 Updated Nov 28, 2023
    • Official Implementation of our ICASSP 2024 paper "Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech"
      HTML
      2000 Updated Nov 25, 2023
    • Official implementation of the handbook chapter "Generalizing Voice Presentation Attack Detection to Unseen Synthetic Attacks and Channel Variation"
      Python
      MIT License
      1000 Updated Oct 2, 2023
    • amt-tools

      Public
      Machine learning tools and framework for automatic music transcription.
      Python
      MIT License
      4000 Updated Jul 30, 2023
    • harana

      Public
      A neural semi-CRF model for harmonic analysis
      Python
      MIT License
      1000 Updated Jul 14, 2023
    • The code for the TMM paper "Speech Driven Talking Face Generation from a Single Image and an Emotion Condition"
      Python
      MIT License
      31000 Updated Apr 9, 2023
    • samo

      Public
      Official Implementation of our ICASSP 2023 paper "SAMO: SPEAKER ATTRACTOR MULTI-CENTER ONE-CLASS LEARNING FOR VOICE ANTI-SPOOFING"
      Python
      MIT License
      9000 Updated Apr 5, 2023
    • Code for the paper "A Data-Driven Methodology for Considering Feasibility and Pairwise Likelihood in Deep Learning Based Guitar Tablature Transcription Systems".
      Python
      MIT License
      1000 Updated Dec 14, 2022
    • Open source code for the paper 'Music Source Separation with Generative Flow'
      Jupyter Notebook
      MIT License
      1100 Updated Nov 18, 2022
    • Code for the paper "Draw and Listen! A Sketch-based System for Music Inpainting", TISMIR 2022
      Python
      MIT License
      2000 Updated Nov 4, 2022
    • This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages the WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations.
      Python
      MIT License
      4000 Updated Sep 4, 2022
    • BachDuet

      Public
      BachDuet enables a human performer to improvise a duet counterpoint with a computer agent in real time.
      Python
      2000 Updated Aug 8, 2022
    • DyViSE

      Public
      Official implementation of our MMSP 2022 paper, "Dynamic vision-guided speaker embedding for audio-visual speaker diarization"
      Python
      2000 Updated Jul 5, 2022
    • SASV_PR

      Public
      Official implementation of the Odyssey paper "A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification"
      Python
      MIT License
      5000 Updated Jun 24, 2022
    • Code for the paper "Learning Sparse Analytic Filters for Piano Transcription".
      Python
      MIT License
      3000 Updated Jun 22, 2022