PyTorch implementation of Real-ESRGAN model

Python 525 150 Updated Apr 15, 2024

PyTorch video decoding

Python 176 14 Updated Jan 1, 2025

SOFA_AI: Singing-Oriented Forced Aligner for Automatic Inference

Python 20 3 Updated May 28, 2024

Simple audio AE

Python 11 Updated Nov 10, 2024

A library built for easier audio self-supervised training and downstream task evaluation

Python 110 10 Updated Aug 27, 2024

Textual Inversion for Stable Diffusion XL 1.0

Jupyter Notebook 74 6 Updated Jan 6, 2024

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Jupyter Notebook 8,273 877 Updated Jul 26, 2024

Parameters to analyse audio files

Python 1 1 Updated Dec 27, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,696 185 Updated Nov 14, 2024

Controllable and fast Text-to-Speech for over 7000 languages!

Python 1,504 172 Updated Nov 7, 2024

A remake of https://github.com/biubug6/Pytorch_Retinaface

Python 394 107 Updated Jan 27, 2023

LLM101n: Let's build a Storyteller

30,796 1,682 Updated Aug 1, 2024

An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement

Python 123 6 Updated Dec 29, 2024

YSDA course in Speech Processing.

Jupyter Notebook 208 66 Updated Jul 1, 2024

LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

116 2 Updated Jun 13, 2024

A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS

Python 32 1 Updated Dec 11, 2024

Framework for processing and filtering datasets

Python 26 2 Updated Aug 1, 2024
Python 9,877 1,271 Updated Jan 1, 2025
Python 44 10 Updated Apr 16, 2023

Image Scraper (Unsplash and Pinterest)

Python 1 Updated Jul 23, 2023

[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Python 2,898 203 Updated Nov 27, 2024

Hackers' Guide to Language Models

Jupyter Notebook 1,798 297 Updated Dec 13, 2024

Enchanted is an iOS and macOS app for chatting with private, self-hosted language models such as Llama2, Mistral, or Vicuna using Ollama.

Swift 4,131 258 Updated Nov 7, 2024

Runpod WhisperX Docker Container Repo

Python 12 7 Updated Mar 10, 2024

The VoxTube dataset official repository

HTML 62 1 Updated Feb 14, 2024

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 181 12 Updated Apr 20, 2024

This is my reimplementation of Tacotron2, based on the NVIDIA implementation

Python 3 Updated Mar 28, 2024

Grok open release

Python 49,750 8,345 Updated Aug 30, 2024

Libriheavy: a 50,000-hour ASR corpus with punctuation, casing, and context

Python 182 11 Updated Sep 10, 2024