mfuntowicz

Organizations

@huggingface

Starred repositories

A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deploym…

Python 761 56 Updated Mar 3, 2025

A simple, performant and scalable Jax LLM!

Python 1,638 326 Updated Mar 5, 2025

A PyTorch quantization backend for Optimum

Python 892 70 Updated Mar 5, 2025

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C++ 3,013 671 Updated Mar 5, 2025

Transformer related optimization, including BERT, GPT

C++ 6,062 900 Updated Mar 27, 2024

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

C++ 1,452 537 Updated Mar 5, 2025
MLIR 407 72 Updated Mar 5, 2025

Backward compatible ML compute opset inspired by HLO/MHLO

MLIR 450 127 Updated Mar 4, 2025

Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)

Python 173 234 Updated Mar 5, 2025

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

Go 32,942 3,053 Updated Mar 5, 2025

State-of-the-Art Text Embeddings

Python 16,134 2,558 Updated Mar 5, 2025

Blazing fast training of 🤗 Transformers on Graphcore IPUs

Python 85 35 Updated Mar 11, 2024

A debugger for async Rust!

Rust 3,806 149 Updated Jan 22, 2025

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

Python 2,241 396 Updated Mar 5, 2025

Hydra is a framework for elegantly configuring complex applications

Python 9,098 663 Updated Mar 5, 2025

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python 2,341 263 Updated Mar 5, 2025

Dapr user documentation, used to build docs.dapr.io

HTML 998 735 Updated Mar 5, 2025

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

C++ 2,296 186 Updated Feb 7, 2024

Simple Python client for the Hugging Face Inference API

Python 72 10 Updated Aug 18, 2020

The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.

TypeScript 3,523 356 Updated Feb 20, 2025

OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

C++ 7,915 2,450 Updated Mar 5, 2025

Cross-platform CLI and Python drivers for AIO liquid coolers and other devices

Python 2,298 229 Updated Mar 5, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 37,211 4,278 Updated Mar 5, 2025

Visualizer for neural network, deep learning and machine learning models

JavaScript 29,553 2,859 Updated Mar 5, 2025

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 15,859 3,077 Updated Mar 5, 2025

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 8,844 1,530 Updated Mar 5, 2025

Open standard for machine learning interoperability

Python 18,547 3,721 Updated Mar 5, 2025