Use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 #7672
| Job | Run time |
|---|---|
| | 2m 27s |
| | 1m 36s |
| | 1m 41s |
| | 1m 52s |
| | 6m 19s |
| | 1m 50s |
| | 4m 15s |
| | 1m 51s |
| | 5m 8s |
| | 6m 35s |
| | 17m 38s |
| | 1m 24s |
| | 6m 52s |
| | 2m 52s |
| | 5m 59s |
| | 2m 36s |
| | 4m 8s |
| | 2m 53s |
| | 13m 32s |
| | 3m 3s |
| | 2m 37s |
| | 15m 30s |
| | 3m 27s |
| | 2m 39s |
| | 2m 40s |
| | 2m 41s |
| | 5m 23s |
| | 0s |
| Total | 2h 9m 28s |