Better inference for M1 (text to semantic) model #5

rasenganai · 2023-11-22T10:13:03Z

M1 is a decoder-only model built on GPT, so we can leverage existing work on LLMs to speed up inference.
KV caching and an efficient attention implementation would be a good start.
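
To make the KV caching idea concrete, here is a minimal sketch in plain Python. It is not M1's actual code: the single attention head, the toy weight matrices, and every name in it (`KVCache`, `decode_step_cached`, `decode_step_full`, `run_both`) are assumptions made for illustration. It only shows that storing each token's key and value once gives the same attention output as re-projecting the whole prefix at every decoding step.

```python
import math


def project(x, w):
    """Multiply vector x (length d_in) by matrix w (d_in rows, d_out columns)."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]


def attend(q, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total for j in range(dim)]


class KVCache:
    """Hypothetical cache holding the keys and values of all tokens seen so far."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)


def decode_step_cached(x, wq, wk, wv, cache):
    """Project only the newest token, then attend over the cached prefix."""
    cache.append(project(x, wk), project(x, wv))
    return attend(project(x, wq), cache.keys, cache.values)


def decode_step_full(xs, wq, wk, wv):
    """Baseline without a cache: re-project every token in the prefix."""
    keys = [project(x, wk) for x in xs]
    values = [project(x, wv) for x in xs]
    return attend(project(xs[-1], wq), keys, values)


# Toy 2-dimensional weights and token embeddings (illustrative values only).
WQ = [[0.5, -0.2], [0.1, 0.3]]
WK = [[0.4, 0.1], [-0.3, 0.2]]
WV = [[1.0, 0.0], [0.5, -1.0]]
TOKENS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]


def run_both(tokens):
    """Decode token by token with and without the cache; return both output lists."""
    cache = KVCache()
    cached, full = [], []
    for t in range(len(tokens)):
        cached.append(decode_step_cached(tokens[t], WQ, WK, WV, cache))
        full.append(decode_step_full(tokens[: t + 1], WQ, WK, WV))
    return cached, full


if __name__ == "__main__":
    cached_out, full_out = run_both(TOKENS)
    for c, f in zip(cached_out, full_out):
        print(c, f)
```

In the baseline, step t re-projects all t tokens, so projection work grows quadratically over a generation; with the cache, each step projects a single token and only the attention over the stored keys and values grows with sequence length. A real implementation would keep one such cache per layer and per head, typically as preallocated tensors.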

rasenganai added the enhancement (New feature or request) label on Nov 22, 2023
