A PyTorch native library for large model training

Python 2,919 233 Updated Jan 2, 2025

Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995

Python 63 6 Updated Dec 3, 2024

Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector

Python 494 61 Updated Oct 26, 2024

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

Python 389 39 Updated Oct 25, 2024
53 2 Updated Sep 13, 2024

Target Speaker Extraction Toolkit

Python 138 16 Updated Nov 6, 2024

🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools

Python 2,647 486 Updated Dec 25, 2024

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 4,661 451 Updated Dec 26, 2024

Inference and training library for high-quality TTS models.

Python 4,859 502 Updated Dec 10, 2024

Discriminative Training of VBx Diarization

Python 20 2 Updated Sep 23, 2024

The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"

Python 129 6 Updated Nov 14, 2024

Multilingual Voice Understanding Model

Python 3,903 349 Updated Nov 29, 2024

Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment

Python 5 1 Updated Nov 5, 2024

Code for Speech Emotion Recognition with Co-Attention based Multi-level Acoustic Information

Python 135 15 Updated Nov 27, 2023
Python 149 13 Updated Jul 9, 2024

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

PostScript 18,291 2,213 Updated Nov 13, 2024

MARS5 speech model (TTS) from CAMB.AI

Jupyter Notebook 2,581 213 Updated Aug 1, 2024

LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

117 2 Updated Jun 13, 2024
HTML 25 1 Updated Aug 2, 2024

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Python 373 42 Updated Sep 13, 2024

Official repository for the paper Singing Voice Graph Modeling for SingFake Detection (Interspeech 2024).

Python 23 4 Updated Sep 27, 2024

Refactored and updated version of `stable-audio-tools`, the open-source code for audio/music generative models originally released by Stability AI.

Python 155 11 Updated Jul 25, 2024

A generative speech model for daily dialogue.

Python 33,397 3,631 Updated Dec 3, 2024

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Python 4,752 592 Updated Jul 2, 2024

Easy-to-use stem (e.g. instrumental/vocals) separation from the CLI or as a Python package, using a variety of pre-trained models (primarily from UVR)

Python 569 94 Updated Dec 31, 2024
Jupyter Notebook 94 11 Updated Apr 8, 2024

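AISystem mainly covers AI systems, including full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks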

Jupyter Notebook 11,801 1,710 Updated Jan 2, 2025

AI-based Audio Watermarking Tool

Python 238 32 Updated Jan 7, 2024

A collection of LaTeX templates commonly used in Chinese scientific research

TeX 350 46 Updated Nov 18, 2024

CHiME-7/8 diarization champion system: neural speaker diarization using memory-aware multi-speaker embedding with sequence-to-sequence architecture

Shell 71 5 Updated May 17, 2024