Model libraries are stored in the format:
`{model_name}/{model_name}-{quantization}-{metadata}-{platform}.{suffix}`
Metadata:

- `ctx`: context window size
- `sw`: sliding window size
- `cs`: prefill chunk size
Metadata fields left at their default values are omitted from the file name. The prefill chunk size is also omitted when it equals the context window size or the sliding window size (the default choice).
Model | Context Window Size | Sliding Window Size | Prefill Chunk Size |
---|---|---|---|
Llama-3-8b-Instruct | 8192 | N/A | 1024 |
Llama-3-70b-Instruct | 8192 | N/A | 1024 |
Llama-2-7b-chat-hf | 4096 | N/A | 4096 |
Llama-2-13b-chat-hf | 4096 | N/A | 4096 |
Llama-2-70b-chat-hf | 4096 | N/A | 4096 |
Mistral-7B-Instruct-v0.2 | N/A | 4096 | 4096 |
RedPajama-INCITE-Chat-3B-v1 | 2048 | N/A | 2048 |
phi-2 | 2048 | N/A | 2048 |
phi-1_5 | 2048 | N/A | 2048 |
gpt2 | 1024 | N/A | 1024 |
gpt2-medium | 1024 | N/A | 1024 |
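
The naming rule can be sketched in Python as follows. This is a minimal illustration, not the tooling that actually produces these files. It assumes the table rows are the per-model defaults, and the helper names `metadata_to_encode` and `library_path` are hypothetical. Because this section does not spell out how metadata values are written inside the file name, the metadata string is passed in as-is, and the quantization, platform, suffix, and `cs1k` values in the usage lines are only placeholders.

```python
from typing import Dict, Optional

# Assumed defaults, copied from three rows of the table above (None stands for N/A).
DEFAULTS: Dict[str, Dict[str, Optional[int]]] = {
    "Llama-3-8b-Instruct": {"ctx": 8192, "sw": None, "cs": 1024},
    "Llama-2-7b-chat-hf": {"ctx": 4096, "sw": None, "cs": 4096},
    "Mistral-7B-Instruct-v0.2": {"ctx": None, "sw": 4096, "cs": 4096},
}


def metadata_to_encode(model_name: str, **overrides: int) -> Dict[str, int]:
    """Return only the metadata fields that the file name needs to carry."""
    defaults = DEFAULTS[model_name]
    config = {**defaults, **overrides}
    # Keep fields that are set and differ from the model's default.
    fields = {k: v for k, v in config.items() if v is not None and v != defaults[k]}
    # The prefill chunk size is also dropped when it matches the window size in use.
    if fields.get("cs") in (config["ctx"], config["sw"]):
        fields.pop("cs", None)
    return fields


def library_path(model_name: str, quantization: str, platform: str,
                 suffix: str, metadata: str = "") -> str:
    """Assemble the path; an empty metadata segment is left out entirely."""
    parts = [model_name, quantization, metadata, platform]
    stem = "-".join(p for p in parts if p)
    return f"{model_name}/{stem}.{suffix}"


# Usage with placeholder values (hypothetical quantization, platform, and metadata string).
print(metadata_to_encode("Llama-2-7b-chat-hf", cs=1024))
print(library_path("Llama-2-7b-chat-hf", "q4f16_1", "webgpu", "wasm", "cs1k"))
```

Keeping the "which fields appear" decision separate from the string assembly means the value encoding can be swapped in later without touching the omission rule.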