We've just released the latest model: VPTQ-community quantized versions of Llama-3.1-Nemotron-70B-Instruct-HF, available at bitwidths ranging from 4 bits down to 1.5 bits. Everyone is welcome to try the new models, and we invite suggestions of all kinds. Now we can count the "r"s in "strawberry" on our local GPUs!
```bash
CUDA_VISIBLE_DEVICES=0 python -m vptq --model VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v8-k65536-65536-woft --chat
# example prompt to try in the chat: how many r in strrrraberry
```
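If you prefer to load the model from Python instead of the `--chat` CLI, the `vptq` package also provides a Hugging Face-style loader. The snippet below is a minimal sketch following the loading pattern in the VPTQ README; it assumes the `vptq` and `transformers` packages are installed and a CUDA GPU is available, and the generation parameters are illustrative:

```python
# Minimal sketch: load a VPTQ-quantized model and run the counting prompt.
import transformers
import vptq

model_name = "VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v8-k65536-65536-woft"

# The tokenizer is the standard Hugging Face one; the model weights are
# dequantized on the fly by the VPTQ inference kernels.
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model = vptq.AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Same question as in the chat example above.
inputs = tokenizer("how many r in strrrraberry", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```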