Stars
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Building Open LLM Web Agents with Self-Evolving Online Curriculum RL
A simple screen-parsing tool toward a pure vision-based GUI agent
LLM-based autonomous agent that conducts local and web research on any topic and generates a comprehensive report with citations.
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
This repository contains the experimental PyTorch native float8 training UX
fanshiqing / grouped_gemm
Forked from tgale96/grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
Several simple examples for popular neural network toolkits calling custom CUDA operators.
A PyTorch native library for large model training
Scalable toolkit for efficient model alignment
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it
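AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks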
Sample code for my CUDA programming book
A validation and profiling tool for AI infrastructure
FlagPerf is an open-source software platform for benchmarking AI chips.
AI driven development in your terminal. Designed for large, real-world tasks.