v1.7.0
Overview
- Fix crashes with high number of beams
- Reduce overall VRAM usage
- Optimize Encoder performance
Some performance numbers for this release:
M2 Ultra
Flash Attention ON:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
M2 Ultra | METAL | tiny | 1 | 1 | 8.37 | 1.44 | 0.48 | 0.01 | 6a94163 |
M2 Ultra | METAL | tiny-q5_0 | 1 | 1 | 9.81 | 1.46 | 0.50 | 0.01 | 6a94163 |
M2 Ultra | METAL | tiny-q5_1 | 1 | 1 | 8.80 | 1.47 | 0.50 | 0.01 | 6a94163 |
M2 Ultra | METAL | base | 1 | 1 | 16.11 | 1.96 | 0.74 | 0.02 | 6a94163 |
M2 Ultra | METAL | base-q5_0 | 1 | 1 | 16.38 | 1.99 | 0.78 | 0.02 | 6a94163 |
M2 Ultra | METAL | base-q5_1 | 1 | 1 | 16.72 | 2.00 | 0.77 | 0.02 | 6a94163 |
M2 Ultra | METAL | small | 1 | 1 | 41.26 | 3.88 | 1.66 | 0.05 | 6a94163 |
M2 Ultra | METAL | small-q5_0 | 1 | 1 | 46.91 | 4.02 | 1.76 | 0.06 | 6a94163 |
M2 Ultra | METAL | small-q5_1 | 1 | 1 | 47.05 | 4.00 | 1.73 | 0.06 | 6a94163 |
M2 Ultra | METAL | medium | 1 | 1 | 111.29 | 7.79 | 3.63 | 0.11 | 6a94163 |
M2 Ultra | METAL | medium-q5_0 | 1 | 1 | 129.78 | 7.71 | 3.85 | 0.13 | 6a94163 |
M2 Ultra | METAL | medium-q5_1 | 1 | 1 | 129.29 | 7.71 | 3.87 | 0.13 | 6a94163 |
M2 Ultra | METAL | medium-dis | 1 | 1 | 99.27 | 1.09 | 0.43 | 0.02 | 6a94163 |
M2 Ultra | METAL | large-v2 | 1 | 1 | 198.81 | 11.54 | 5.59 | 0.20 | 6a94163 |
M2 Ultra | METAL | large-v2-q5_0 | 1 | 1 | 236.18 | 11.12 | 6.11 | 0.24 | 6a94163 |
M2 Ultra | METAL | large-v2-q5_1 | 1 | 1 | 235.88 | 11.14 | 6.01 | 0.24 | 6a94163 |
M2 Ultra | METAL | large-v2-dis | 1 | 1 | 177.41 | 1.21 | 0.48 | 0.02 | 6a94163 |
M2 Ultra | METAL | large-v3-turbo | 1 | 1 | 178.92 | 1.89 | 0.83 | 0.03 | 6a94163 |
M2 Ultra | METAL | large-v3-turbo-q5_0 | 1 | 1 | 211.44 | 1.73 | 0.90 | 0.04 | 6a94163 |
Flash Attention OFF:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
M2 Ultra | METAL | tiny | 1 | 0 | 10.04 | 1.37 | 0.50 | 0.01 | 6a94163 |
M2 Ultra | METAL | tiny-q5_0 | 1 | 0 | 10.02 | 1.36 | 0.53 | 0.01 | 6a94163 |
M2 Ultra | METAL | tiny-q5_1 | 1 | 0 | 11.08 | 1.37 | 0.53 | 0.01 | 6a94163 |
M2 Ultra | METAL | base | 1 | 0 | 17.84 | 1.93 | 0.77 | 0.02 | 6a94163 |
M2 Ultra | METAL | base-q5_0 | 1 | 0 | 18.57 | 1.92 | 0.81 | 0.02 | 6a94163 |
M2 Ultra | METAL | base-q5_1 | 1 | 0 | 18.66 | 1.93 | 0.82 | 0.02 | 6a94163 |
M2 Ultra | METAL | small | 1 | 0 | 48.26 | 3.95 | 1.73 | 0.05 | 6a94163 |
M2 Ultra | METAL | small-q5_0 | 1 | 0 | 53.68 | 3.99 | 1.85 | 0.06 | 6a94163 |
M2 Ultra | METAL | small-q5_1 | 1 | 0 | 53.86 | 4.00 | 1.82 | 0.06 | 6a94163 |
M2 Ultra | METAL | medium | 1 | 0 | 130.09 | 8.01 | 3.82 | 0.13 | 6a94163 |
M2 Ultra | METAL | medium-q5_0 | 1 | 0 | 148.18 | 7.92 | 4.11 | 0.14 | 6a94163 |
M2 Ultra | METAL | medium-q5_1 | 1 | 0 | 147.95 | 7.94 | 4.11 | 0.14 | 6a94163 |
M2 Ultra | METAL | medium-dis | 1 | 0 | 116.97 | 1.11 | 0.42 | 0.02 | 6a94163 |
M2 Ultra | METAL | large-v2 | 1 | 0 | 232.43 | 12.34 | 5.87 | 0.22 | 6a94163 |
M2 Ultra | METAL | large-v2-q5_0 | 1 | 0 | 269.72 | 11.68 | 6.44 | 0.26 | 6a94163 |
M2 Ultra | METAL | large-v2-q5_1 | 1 | 0 | 269.71 | 11.82 | 6.36 | 0.26 | 6a94163 |
M2 Ultra | METAL | large-v2-dis | 1 | 0 | 209.25 | 1.25 | 0.48 | 0.02 | 6a94163 |
M2 Ultra | METAL | large-v3-turbo | 1 | 0 | 211.09 | 1.98 | 0.84 | 0.03 | 6a94163 |
M2 Ultra | METAL | large-v3-turbo-q5_0 | 1 | 0 | 244.23 | 1.81 | 0.92 | 0.04 | 6a94163 |
Ryzen 9 5950X + RTX 2060
Flash Attention ON:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
RTX 2060 | AVX2 CUDA | tiny | 1 | 1 | 7.35 | 0.78 | 0.24 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | tiny-q5_0 | 1 | 1 | 6.45 | 0.67 | 0.14 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | tiny-q5_1 | 1 | 1 | 6.39 | 0.66 | 0.14 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | base | 1 | 1 | 10.20 | 0.88 | 0.30 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | base-q5_0 | 1 | 1 | 11.38 | 0.92 | 0.21 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | base-q5_1 | 1 | 1 | 11.76 | 0.91 | 0.20 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | small | 1 | 1 | 33.06 | 2.00 | 0.56 | 0.03 | 6a94163 |
RTX 2060 | AVX2 CUDA | small-q5_0 | 1 | 1 | 35.84 | 1.84 | 0.43 | 0.04 | 6a94163 |
RTX 2060 | AVX2 CUDA | small-q5_1 | 1 | 1 | 36.89 | 1.82 | 0.42 | 0.04 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium | 1 | 1 | 90.65 | 4.54 | 1.13 | 0.08 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-q5_0 | 1 | 1 | 104.01 | 3.80 | 0.91 | 0.10 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-q5_1 | 1 | 1 | 107.98 | 3.72 | 0.87 | 0.10 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-dis | 1 | 1 | 79.08 | 0.68 | 0.17 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2 | 1 | 1 | 162.00 | 7.52 | 1.92 | 0.14 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-q5_0 | 1 | 1 | 184.59 | 5.64 | 1.50 | 0.16 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-q5_1 | 1 | 1 | 193.85 | 5.55 | 1.44 | 0.17 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-dis | 1 | 1 | 140.75 | 0.84 | 0.37 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v3-turbo | 1 | 1 | 143.38 | 1.29 | 0.36 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v3-turbo-q5_0 | 1 | 1 | 163.30 | 0.93 | 0.28 | 0.03 | 6a94163 |
Flash Attention OFF:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
RTX 2060 | AVX2 CUDA | tiny | 1 | 0 | 12.49 | 0.87 | 0.23 | 0.01 | 6a94163 |
RTX 2060 | AVX2 CUDA | tiny-q5_0 | 1 | 0 | 10.65 | 0.78 | 0.19 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | tiny-q5_1 | 1 | 0 | 10.82 | 0.77 | 0.19 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | base | 1 | 0 | 18.97 | 1.04 | 0.34 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | base-q5_0 | 1 | 0 | 20.22 | 1.09 | 0.27 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | base-q5_1 | 1 | 0 | 20.48 | 1.07 | 0.27 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | small | 1 | 0 | 59.52 | 2.37 | 0.70 | 0.05 | 6a94163 |
RTX 2060 | AVX2 CUDA | small-q5_0 | 1 | 0 | 62.98 | 2.23 | 0.60 | 0.06 | 6a94163 |
RTX 2060 | AVX2 CUDA | small-q5_1 | 1 | 0 | 63.64 | 2.21 | 0.59 | 0.06 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium | 1 | 0 | 161.53 | 5.36 | 1.53 | 0.13 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-q5_0 | 1 | 0 | 174.96 | 4.64 | 1.32 | 0.15 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-q5_1 | 1 | 0 | 178.42 | 4.57 | 1.29 | 0.15 | 6a94163 |
RTX 2060 | AVX2 CUDA | medium-dis | 1 | 0 | 149.65 | 0.75 | 0.20 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2 | 1 | 0 | 280.55 | 8.74 | 2.51 | 0.23 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-q5_0 | 1 | 0 | 306.87 | 6.92 | 2.08 | 0.25 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-q5_1 | 1 | 0 | 314.25 | 6.82 | 2.02 | 0.26 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v2-dis | 1 | 0 | 259.39 | 0.91 | 0.37 | 0.02 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v3-turbo | 1 | 0 | 261.83 | 1.44 | 0.41 | 0.04 | 6a94163 |
RTX 2060 | AVX2 CUDA | large-v3-turbo-q5_0 | 1 | 0 | 282.99 | 1.09 | 0.33 | 0.04 | 6a94163 |
Vulkan:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
RTX 2060 | VULKAN | tiny | 1 | 0 | 30.38 | 1.37 | 1.04 | 0.05 | 9f346d0 |
RTX 2060 | VULKAN | tiny-q5_0 | 1 | 0 | 20.98 | 1.38 | 0.99 | 0.05 | 9f346d0 |
RTX 2060 | VULKAN | tiny-q5_1 | 1 | 0 | 20.74 | 1.30 | 0.96 | 0.05 | 9f346d0 |
RTX 2060 | VULKAN | base | 1 | 0 | 44.69 | 1.59 | 1.78 | 0.09 | 9f346d0 |
RTX 2060 | VULKAN | base-q5_0 | 1 | 0 | 39.72 | 2.11 | 1.72 | 0.08 | 9f346d0 |
RTX 2060 | VULKAN | base-q5_1 | 1 | 0 | 39.45 | 2.01 | 1.63 | 0.08 | 9f346d0 |
RTX 2060 | VULKAN | small | 1 | 0 | 160.02 | 3.53 | 4.64 | 0.23 | 9f346d0 |
RTX 2060 | VULKAN | small-q5_0 | 1 | 0 | 141.52 | 4.54 | 4.44 | 0.20 | 9f346d0 |
RTX 2060 | VULKAN | small-q5_1 | 1 | 0 | 141.03 | 4.63 | 4.18 | 0.20 | 9f346d0 |
RTX 2060 | VULKAN | medium | 1 | 0 | 472.66 | 7.55 | 11.35 | 0.56 | 9f346d0 |
RTX 2060 | VULKAN | medium-q5_0 | 1 | 0 | 395.55 | 9.81 | 10.64 | 0.49 | 9f346d0 |
RTX 2060 | VULKAN | medium-q5_1 | 1 | 0 | 398.85 | 10.16 | 10.15 | 0.50 | 9f346d0 |
RTX 2060 | VULKAN | medium-dis | 1 | 0 | 427.26 | 1.26 | 1.20 | 0.08 | 9f346d0 |
RTX 2060 | VULKAN | large-v2 | 1 | 0 | 924.60 | 12.36 | 18.56 | 1.01 | 9f346d0 |
RTX 2060 | VULKAN | large-v2-q5_0 | 1 | 0 | 774.21 | 17.25 | 17.17 | 0.85 | 9f346d0 |
RTX 2060 | VULKAN | large-v2-q5_1 | 1 | 0 | 779.75 | 17.44 | 16.27 | 0.85 | 9f346d0 |
RTX 2060 | VULKAN | large-v2-dis | 1 | 0 | 833.35 | 1.38 | 1.56 | 0.10 | 9f346d0 |
RTX 2060 | VULKAN | large-v3-turbo | 1 | 0 | 839.90 | 2.11 | 2.70 | 0.16 | 9f346d0 |
RTX 2060 | VULKAN | large-v3-turbo-q5_0 | 1 | 0 | 705.49 | 3.22 | 2.53 | 0.14 | 9f346d0 |
CPU only:
CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
Ryzen 9 5950X | AVX2 | tiny | 16 | 0 | 137.31 | 1.38 | 0.37 | 0.20 | 6a94163 |
Ryzen 9 5950X | AVX2 | tiny-q5_0 | 16 | 0 | 143.29 | 0.54 | 0.25 | 0.19 | 6a94163 |
Ryzen 9 5950X | AVX2 | tiny-q5_1 | 16 | 0 | 144.11 | 0.58 | 0.27 | 0.20 | 6a94163 |
Ryzen 9 5950X | AVX2 | base | 16 | 0 | 293.81 | 3.15 | 0.80 | 0.33 | 6a94163 |
Ryzen 9 5950X | AVX2 | base-q5_0 | 16 | 0 | 311.95 | 1.18 | 0.45 | 0.32 | 6a94163 |
Ryzen 9 5950X | AVX2 | base-q5_1 | 16 | 0 | 319.06 | 1.26 | 0.49 | 0.34 | 6a94163 |
Ryzen 9 5950X | AVX2 | small | 16 | 0 | 1005.64 | 11.78 | 2.79 | 0.88 | 6a94163 |
Ryzen 9 5950X | AVX2 | small-q5_0 | 16 | 0 | 1110.41 | 5.44 | 1.53 | 0.91 | 6a94163 |
Ryzen 9 5950X | AVX2 | small-q5_1 | 16 | 0 | 1159.07 | 5.72 | 1.66 | 0.94 | 6a94163 |
Ryzen 9 5950X | AVX2 | medium | 16 | 0 | 3004.36 | 36.61 | 8.21 | 2.32 | 6a94163 |
Ryzen 9 5950X | AVX2 | medium-q5_0 | 16 | 0 | 3441.00 | 17.69 | 4.67 | 2.52 | 6a94163 |
Ryzen 9 5950X | AVX2 | medium-q5_1 | 16 | 0 | 3588.38 | 18.61 | 4.93 | 2.63 | 6a94163 |
Ryzen 9 5950X | AVX2 | medium-dis | 16 | 0 | 2805.43 | 4.94 | 1.12 | 0.39 | 6a94163 |
Ryzen 9 5950X | AVX2 | large-v2 | 16 | 0 | 5630.44 | 70.50 | 15.52 | 4.16 | 6a94163 |
Ryzen 9 5950X | AVX2 | large-v2-q5_0 | 16 | 0 | 6488.80 | 35.07 | 8.61 | 4.64 | 6a94163 |
Ryzen 9 5950X | AVX2 | large-v2-q5_1 | 16 | 0 | 6775.80 | 36.27 | 8.92 | 4.85 | 6a94163 |
Ryzen 9 5950X | AVX2 | large-v2-dis | 16 | 0 | 5262.10 | 7.27 | 1.60 | 0.52 | 6a94163 |
Ryzen 9 5950X | AVX2 | large-v3-turbo | 16 | 0 | 5302.64 | 11.52 | 2.55 | 0.76 | 6a94163 |
Ryzen 9 5950X | AVX2 | large-v3-turbo-q5_0 | 16 | 0 | 5984.73 | 4.26 | 1.16 | 0.80 | 6a94163 |
V100
Flash Attention ON:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
V100 | AVX2 CUDA | tiny | 1 | 1 | 4.10 | 0.96 | 0.27 | 0.01 | 6a94163 |
V100 | AVX2 CUDA | tiny-q5_1 | 1 | 1 | 4.32 | 1.01 | 0.21 | 0.01 | 6a94163 |
V100 | AVX2 CUDA | base | 1 | 1 | 7.23 | 1.30 | 0.35 | 0.02 | 6a94163 |
V100 | AVX2 CUDA | base-q5_1 | 1 | 1 | 7.51 | 1.32 | 0.27 | 0.02 | 6a94163 |
V100 | AVX2 CUDA | small | 1 | 1 | 19.44 | 2.59 | 0.73 | 0.03 | 6a94163 |
V100 | AVX2 CUDA | small-q5_1 | 1 | 1 | 21.46 | 2.61 | 0.54 | 0.03 | 6a94163 |
V100 | AVX2 CUDA | medium | 1 | 1 | 54.26 | 5.36 | 1.53 | 0.06 | 6a94163 |
V100 | AVX2 CUDA | medium-q5_0 | 1 | 1 | 56.13 | 5.01 | 1.04 | 0.07 | 6a94163 |
V100 | AVX2 CUDA | large-v2 | 1 | 1 | 94.48 | 7.80 | 2.18 | 0.10 | 6a94163 |
V100 | AVX2 CUDA | large-v2-q5_0 | 1 | 1 | 93.55 | 6.98 | 1.51 | 0.11 | 6a94163 |
V100 | AVX2 CUDA | large-v3-turbo | 1 | 1 | 77.11 | 1.27 | 0.39 | 0.02 | 6a94163 |
V100 | AVX2 CUDA | large-v3-turbo-q5_0 | 1 | 1 | 80.22 | 1.10 | 0.31 | 0.02 | 6a94163 |
Flash Attention OFF:
GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
---|---|---|---|---|---|---|---|---|---|
V100 | AVX2 CUDA | tiny | 1 | 0 | 6.03 | 1.09 | 0.31 | 0.01 | 6a94163 |
V100 | AVX2 CUDA | tiny-q5_1 | 1 | 0 | 6.10 | 1.13 | 0.26 | 0.01 | 6a94163 |
V100 | AVX2 CUDA | base | 1 | 0 | 11.02 | 1.65 | 0.46 | 0.02 | 6a94163 |
V100 | AVX2 CUDA | base-q5_1 | 1 | 0 | 11.18 | 1.74 | 0.39 | 0.02 | 6a94163 |
V100 | AVX2 CUDA | small | 1 | 0 | 31.05 | 3.10 | 0.85 | 0.04 | 6a94163 |
V100 | AVX2 CUDA | small-q5_1 | 1 | 0 | 31.75 | 3.12 | 0.71 | 0.04 | 6a94163 |
V100 | AVX2 CUDA | medium | 1 | 0 | 83.99 | 6.40 | 1.82 | 0.09 | 6a94163 |
V100 | AVX2 CUDA | medium-q5_0 | 1 | 0 | 85.90 | 6.11 | 1.42 | 0.10 | 6a94163 |
V100 | AVX2 CUDA | large-v2 | 1 | 0 | 139.13 | 9.05 | 2.67 | 0.14 | 6a94163 |
V100 | AVX2 CUDA | large-v2-q5_0 | 1 | 0 | 142.98 | 8.47 | 2.06 | 0.16 | 6a94163 |
V100 | AVX2 CUDA | large-v3-turbo | 1 | 0 | 126.75 | 1.51 | 0.45 | 0.02 | 6a94163 |
V100 | AVX2 CUDA | large-v3-turbo-q5_0 | 1 | 0 | 129.91 | 1.30 | 0.35 | 0.03 | 6a94163 |
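
To make the Flash Attention tables easier to compare, the sketch below computes the encoder speedup (FA OFF divided by FA ON) for a few models per device. The `Enc.` values are copied from the tables above. The names `ENC_TIMES`, `fa_speedup`, and `speedup_table` are hypothetical helpers for illustration and are not part of whisper.cpp. It assumes `Enc.` is a time where lower is better; the ratio itself does not depend on the unit.

```python
# Encoder ("Enc.") values copied from the benchmark tables above:
# model -> (Flash Attention ON, Flash Attention OFF).
# Assumption: Enc. is a time, so a lower value is better.
ENC_TIMES = {
    "M2 Ultra (METAL)": {
        "tiny": (8.37, 10.04),
        "large-v2": (198.81, 232.43),
        "large-v3-turbo": (178.92, 211.09),
    },
    "RTX 2060 (CUDA)": {
        "tiny": (7.35, 12.49),
        "large-v2": (162.00, 280.55),
        "large-v3-turbo": (143.38, 261.83),
    },
    "V100 (CUDA)": {
        "tiny": (4.10, 6.03),
        "large-v2": (94.48, 139.13),
        "large-v3-turbo": (77.11, 126.75),
    },
}


def fa_speedup(fa_on: float, fa_off: float) -> float:
    """Return the FA OFF / FA ON ratio (values above 1 mean FA ON is faster)."""
    return fa_off / fa_on


def speedup_table(times: dict) -> dict:
    """Map device -> model -> encoder speedup, rounded to two decimals."""
    return {
        device: {
            model: round(fa_speedup(on, off), 2)
            for model, (on, off) in models.items()
        }
        for device, models in times.items()
    }


if __name__ == "__main__":
    for device, models in speedup_table(ENC_TIMES).items():
        for model, speedup in models.items():
            print(f"{device:18s} {model:15s} {speedup:.2f}x")
```

Only a handful of rows are included to keep the sketch short; adding more models is a matter of extending the dictionary with values from the same tables.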
For reference, here is the performance for v1.6.0
What's Changed
- whisper: use global cache for sin/cos vals and Hann window by @iboB in #2194
- Add Installation Instructions for Conan to README by @czoido in #2189
- conan badge by @MartinDelille in #2196
- Remove `speed_up` and `phase_vocoder*` functions by @iboB in #2198
- Add CUDA-specific compilation of mel spectrograms by @iboB in #2206
- whisper : calculate mel spectrogram directly into a ggml_tensor by @iboB in #2208
- whisper : fixes by @ggerganov in #2217
- whisper : auto-grow working areas for mel_calc_cuda by @iboB in #2227
- ci : fix CUDA builds by @ggerganov in #2232
- cuda : fix bounds check for src0 rows in MMVQ kernel by @ggerganov in #2231
- cuda : fix HIPBLAS build by @ggerganov in #2234
- whisper : use ggml-cuda in mel calc, set appropriate device by @iboB in #2236
- sync : ggml by @ggerganov in #2237
- sync : ggml-blas by @ggerganov in #2238
- whisper: optimize fft() function by @mkycoder in #2242
- whisper : reorganize source code + improve CMake by @ggerganov in #2256
- server: add inference path option to make API compatible with openai client by @eschmidbauer in #2270
- sync : ggml by @ggerganov in #2288
- cmake : allow external ggml by @iboB in #2290
- cmake : use WHISPER_EXTRA_FLAGS by @ggerganov in #2294
- Fix DTW assert by @arizhih in #2299
- whisper: use vulkan as gpu backend when available by @mstephenson6 in #2302
- whisper : handle empty mel by @ggerganov in #2324
- Whisper compiles in xcode after generation via cmake by @davens in #2311
- sync : ggml by @ggerganov in #2342
- sync : ggml by @ggerganov in #2343
- cann: add Ascend NPU support by @MengqingCao in #2336
- Use colorblind friendly TTY color scheme by @jart in #2360
- Fix broken links in README.md by @ericcurtin in #2358
- sync : ggml by @ggerganov in #2367
- Fix broken links in implementation details section by @stormofice in #2382
- Fixes typo by @ivoputzer in #2383
- Update the position of bench.py in README.md by @wa008 in #2386
- Add support for wget2 for fedora by @bradmurray-dt in #2387
- sync : ggml by @ggerganov in #2391
- feat(go binding): add beamsize/entropythold/maxcontext to context interface by @hsinhoyeh in #2350
- Fix broken link in README.md and remove invalid flag from Python example by @UsernamesLame in #2396
- Set MSVC to use UTF-8 on source files by @drasticactions in #2346
- sync : ggml by @ggerganov in #2401
- cmake: Fix libdir value in pkgconfig file by @philn in #2407
- [CANN] Add Ascend NPU instructions by @MengqingCao in #2410
- Fixed go cuda bindings building by @Binozo in #2416
- server: Use OS-generated temp file name for ffmpeg converted files by @teejae in #2419
- Add tests and updates for go bindings by @Stavrospanakakis in #2425
- Added libsdl2-dev for container builds. by @jboero in #2424
- Added temperature options for Go bindings by @Binozo in #2417
- sync : ggml and llama.cpp by @ggerganov in #2429
- Fix references to download-ggml-model.sh by @WhyNotHugo in #2427
- whisper : add large-v3-turbo by @ggerganov in #2440
- Fix: overwrite leftover ffmpeg temp file if a prior conversion failed by @gilbertgong in #2431
- sync : ggml by @ggerganov in #2444
- update dr_wav.h to newer version to fix _MSC_VER macro issues by @RahulVadhyar in #2449
- whisper : fix excessive memory usage by @ggerganov in #2443
- sync : ggml + llama.cpp by @ggerganov in #2455
New Contributors
- @czoido made their first contribution in #2189
- @MartinDelille made their first contribution in #2196
- @mkycoder made their first contribution in #2242
- @arizhih made their first contribution in #2299
- @mstephenson6 made their first contribution in #2302
- @davens made their first contribution in #2311
- @MengqingCao made their first contribution in #2336
- @jart made their first contribution in #2360
- @ericcurtin made their first contribution in #2358
- @stormofice made their first contribution in #2382
- @ivoputzer made their first contribution in #2383
- @wa008 made their first contribution in #2386
- @hsinhoyeh made their first contribution in #2350
- @UsernamesLame made their first contribution in #2396
- @drasticactions made their first contribution in #2346
- @Binozo made their first contribution in #2416
- @teejae made their first contribution in #2419
- @Stavrospanakakis made their first contribution in #2425
- @jboero made their first contribution in #2424
- @WhyNotHugo made their first contribution in #2427
- @gilbertgong made their first contribution in #2431
- @RahulVadhyar made their first contribution in #2449
Full Changelog: v1.6.2...v1.7.0
Binaries
https://github.com/ggerganov/whisper.cpp/actions/runs/11193706782