Architecture | Models | Example HuggingFace Models |
---|---|---|
ChatGLMModel | ChatGLM | |
GemmaForCausalLM | Gemma | |
GPTNeoXForCausalLM | Dolly, RedPajama | |
LlamaForCausalLM | Llama 3, Llama 2, OpenLLaMA, TinyLlama | |
MistralForCausalLM | Mistral, Notus, Zephyr | |
PhiForCausalLM | Phi | |
QWenLMHeadModel | Qwen | |
The pipeline can work with other similar topologies produced by optimum-intel with the same model signature. After conversion, the model is required to have the following inputs:

- `input_ids` contains the tokens.
- `attention_mask` is filled with `1`.
- `beam_idx` selects beams.
- `position_ids` (optional) encodes the position of the currently generated token in the sequence.

It must also have a single `logits` output.
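The expected signature can be sketched as a simple check. This is a minimal illustration, not part of the OpenVINO GenAI API: `check_signature` is a hypothetical helper, and the input/output names come from the list above. With a real converted model, the names would be read from the IR (e.g. via `openvino.Core().read_model(...)`).

```python
# Required and optional tensor names, as listed above for converted models.
REQUIRED_INPUTS = {"input_ids", "attention_mask", "beam_idx"}
OPTIONAL_INPUTS = {"position_ids"}
REQUIRED_OUTPUTS = {"logits"}


def check_signature(input_names, output_names):
    """Return True if a converted model matches the expected signature.

    Hypothetical helper for illustration: all required inputs must be
    present, no unknown inputs are allowed, and `logits` must be the
    single output.
    """
    inputs = set(input_names)
    outputs = set(output_names)
    unknown = inputs - REQUIRED_INPUTS - OPTIONAL_INPUTS
    return REQUIRED_INPUTS <= inputs and not unknown and outputs == REQUIRED_OUTPUTS


# A model exposing all four inputs and a single logits output passes:
print(check_signature(
    ["input_ids", "attention_mask", "beam_idx", "position_ids"],
    ["logits"],
))  # True
```

With a real model the names could be collected as `[inp.get_any_name() for inp in model.inputs]` after `read_model`.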
Note: Models should belong to the same family and use the same tokenizer.
Architecture | Example HuggingFace Models |
---|---|
Latent Consistency Model | |
Stable Diffusion | |
Stable Diffusion XL | |
Architecture | Models | Example HuggingFace Models |
---|---|---|
LLaVA | LLaVA-v1.5 | |
MiniCPMV | MiniCPM-V-2_6 | |
Some models may require submitting an access request on their Hugging Face page before they can be downloaded. If https://huggingface.co/ is unavailable, the conversion step will not be able to download the models.