[ICML 2024] CLLMs: Consistency Large Language Models
Updated Nov 16, 2024 - Python
PyTorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24)
[NeurIPS 2024] "AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment" by Yonggan Fu, Zhongzhi Yu, Junwei Li, Jiayi Qian, Yongan Zhang, Xiangchi Yuan, Dachuan Shi, Roman Yakunin, and Yingyan (Celine) Lin.