SqueezeBits Inc.
- 24 followers
- Korea, South
- https://squeezebits.com/
- info@squeezebits.com
Popular repositories
- owlite-examples Public
OwLite Examples repository offers illustrative example code to help users seamlessly compress PyTorch deep learning models and transform them into TensorRT engines.
- vllm-fork Public Forked from HabanaAI/vllm-fork
A high-throughput and memory-efficient inference and serving engine for LLMs
Python
Repositories
- vllm-fork Public Forked from HabanaAI/vllm-fork
A high-throughput and memory-efficient inference and serving engine for LLMs
- gradio Public Forked from gradio-app/gradio
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
- TensorRT-LLM Public Forked from NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
- owlite-examples Public
OwLite Examples repository offers illustrative example code to help users seamlessly compress PyTorch deep learning models and transform them into TensorRT engines.
- nvidia-dind Public Forked from ehfd/nvidia-dind
Isolated DinD (Docker in Docker) container for developing and deploying Docker containers using NVIDIA GPUs and the NVIDIA container toolkit. Useful for deploying the Docker engine with NVIDIA in Kubernetes.