- In-Context Learning with Long-Context Models: An In-Depth Exploration, arXiv, 2405.00200, arxiv, pdf, citation: -1
  Amanda Bertsch, Maor Ivgi, Uri Alon, Jonathan Berant, Matthew R. Gormley, Graham Neubig
- Length Extrapolation of Transformers: A Survey from the Perspective of Position Encoding, arXiv, 2312.17044, arxiv, pdf, citation: -1
  Liang Zhao, Xiaocheng Feng, Xiachong Feng, Bing Qin, Ting Liu · (jiqizhixin)
- Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey, arXiv, 2311.12351, arxiv, pdf, citation: -1
  Yunpeng Huang, Jingwei Xu, Zixu Jiang, Junyu Lai, Zenan Li, Yuan Yao, Taolue Chen, Lijuan Yang, Zhou Xin, Xiaoxing Ma · (long-llms-learning - Strivin0311) · (mp.weixin.qq)
- NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?, arXiv, 2407.11963, arxiv, pdf, citation: -1
  Mo Li, Songyang Zhang, Yunxin Liu, Kai Chen · (opencompass - open-compass)
- Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems, arXiv, 2407.01370, arxiv, pdf, citation: -1
  Philippe Laban, Alexander R. Fabbri, Caiming Xiong, Chien-Sheng Wu
- Counting-Stars: A Simple, Efficient, and Reasonable Strategy for Evaluating Long-Context Large Language Models, arXiv, 2403.11802, arxiv, pdf, citation: -1
  Mingyang Song, Mao Zheng, Xuan Luo · (Counting-Stars - nick7nlp) · (mp.weixin.qq)
- Evaluating Very Long-Term Conversational Memory of LLM Agents, arXiv, 2402.17753, arxiv, pdf, citation: -1
  Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, Yuwei Fang
- LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs, arXiv, 2408.07055, arxiv, pdf, citation: -1
  Yushi Bai, Jiajie Zhang, Xin Lv, Linzhi Zheng, Siqi Zhu, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li · (LongWriter - THUDM)
- ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities, arXiv, 2407.14482, arxiv, pdf, citation: -1
  Peng Xu, Wei Ping, Xianchao Wu, Zihan Liu, Mohammad Shoeybi, Bryan Catanzaro
- Human-like Episodic Memory for Infinite Context LLMs, arXiv, 2407.09450, arxiv, pdf, citation: -1
  Zafeirios Fountas, Martin A Benfeghoul, Adnan Oomerjee, Fenia Christopoulou, Gerasimos Lampouras, Haitham Bou-Ammar, Jun Wang
- Associative Recurrent Memory Transformer, arXiv, 2407.04841, arxiv, pdf, citation: -1
  Ivan Rodkin, Yuri Kuratov, Aydar Bulatov, Mikhail Burtsev
- Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP, arXiv, 2407.00402, arxiv, pdf, citation: -1
  Omer Goldman, Alon Jacovi, Aviv Slobodkin, Aviya Maimon, Ido Dagan, Reut Tsarfaty
- Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers, arXiv, 2406.16747, arxiv, pdf, citation: -1
  Chao Lou, Zixia Jia, Zilong Zheng, Kewei Tu
- Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon, arXiv, 2406.17746, arxiv, pdf, citation: -1
  USVSN Sai Prashanth, Alvin Deng, Kyle O'Brien, Jyothir S V, Mohammad Aflah Khan, Jaydeep Borkar, Christopher A. Choquette-Choo, Jacob Ray Fuehne, Stella Biderman, Tracy Ke
- Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?, arXiv, 2406.13121, arxiv, pdf, citation: -1
  Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia · (loft - google-deepmind)
- THEANINE: Revisiting Memory Management in Long-term Conversations with Timeline-augmented Response Generation, arXiv, 2406.10996, arxiv, pdf, citation: -1
  Seo Hyun Kim, Kai Tzu-iunn Ong, Taeyoon Kwon, Namyoung Kim, Keummin Ka, SeongHyeon Bae, Yohan Jo, Seung-won Hwang, Dongha Lee, Jinyoung Yeo · (theanine-693b0.web)
- Recurrent Context Compression: Efficiently Expanding the Context Window of LLM, arXiv, 2406.06110, arxiv, pdf, citation: -1
  Chensen Huang, Guibo Zhu, Xuepeng Wang, Yifei Luo, Guojing Ge, Haoran Chen, Dong Yi, Jinqiao Wang · (RCC_Transformer - WUHU-G)
- Contextual Position Encoding: Learning to Count What's Important, arXiv, 2405.18719, arxiv, pdf, citation: -1
  Olga Golovneva, Tianlu Wang, Jason Weston, Sainbayar Sukhbaatar
- Are Long-LLMs A Necessity For Long-Context Tasks?, arXiv, 2405.15318, arxiv, pdf, citation: -1
  Hongjin Qian, Zheng Liu, Peitian Zhang, Kelong Mao, Yujia Zhou, Xu Chen, Zhicheng Dou
- Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis, arXiv, 2405.08944, arxiv, pdf, citation: -1
  Yao Fu
- Extending Llama-3's Context Ten-Fold Overnight, arXiv, 2404.19553, arxiv, pdf, citation: -1
  Peitian Zhang, Ninglu Shao, Zheng Liu, Shitao Xiao, Hongjin Qian, Qiwei Ye, Zhicheng Dou · (FlagEmbedding - FlagOpen)
- Make Your LLM Fully Utilize the Context, arXiv, 2404.16811, arxiv, pdf, citation: -1
  Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou · (FILM - microsoft)
- SnapKV: LLM Knows What You are Looking for Before Generation, arXiv, 2404.14469, arxiv, pdf, citation: -1
  Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen · (SnapKV - FasterDecoding)
- LongEmbed: Extending Embedding Models for Long Context Retrieval, arXiv, 2404.12096, arxiv, pdf, citation: -1
  Dawei Zhu, Liang Wang, Nan Yang, Yifan Song, Wenhao Wu, Furu Wei, Sujian Li · (LongEmbed - dwzhu-pku)
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length, arXiv, 2404.08801, arxiv, pdf, citation: -1
  Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou · (megalodon - XuezheMax)
- TransformerFAM: Feedback attention is working memory, arXiv, 2404.09173, arxiv, pdf, citation: -1
  Dongseong Hwang, Weiran Wang, Zhuoyuan Huo, Khe Chai Sim, Pedro Moreno Mengibar
- LLoCO: Learning Long Contexts Offline, arXiv, 2404.07979, arxiv, pdf, citation: -1
  Sijun Tan, Xiuyu Li, Shishir Patil, Ziyang Wu, Tianjun Zhang, Kurt Keutzer, Joseph E. Gonzalez, Raluca Ada Popa · (lloco - jeffreysijuntan)
- RULER: What's the Real Context Size of Your Long-Context Language Models?, arXiv, 2404.06654, arxiv, pdf, citation: -1
  Cheng-Ping Hsieh, Simeng Sun, Samuel Kriman, Shantanu Acharya, Dima Rekesh, Fei Jia, Boris Ginsburg
- Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention, arXiv, 2404.07143, arxiv, pdf, citation: -1
  Tsendsuren Munkhdalai, Manaal Faruqui, Siddharth Gopal
- Long-context LLMs Struggle with Long In-context Learning, arXiv, 2404.02060, arxiv, pdf, citation: -1
  Tianle Li, Ge Zhang, Quy Duc Do, Xiang Yue, Wenhu Chen
  LLMs perform reasonably well up to about 20K tokens of context, but once the context window exceeds 20K the performance of most models, with the notable exception of GPT-4, drops sharply; a rough evaluation sketch follows below.
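  For orientation, here is a minimal sketch of the kind of probe behind this finding: classification accuracy is measured as the number of in-context demonstrations, and hence the prompt length, grows. The prompt template, the toy data, and the `predict` stand-in are illustrative assumptions, not the benchmark's own code.

  ```python
  # Hypothetical probe: accuracy vs. number of in-context demonstrations (prompt length).
  # `predict` is a placeholder for whatever long-context model call you use (API or local).
  from typing import Callable, List, Tuple

  def many_shot_accuracy(
      demos: List[Tuple[str, str]],      # (text, label) pairs used as in-context examples
      test_set: List[Tuple[str, str]],   # held-out (text, label) pairs to score
      predict: Callable[[str], str],     # prompt -> predicted label
      shots: int,
  ) -> float:
      header = "Classify the text into one of the labels seen in the examples.\n\n"
      context = "".join(f"Text: {t}\nLabel: {l}\n\n" for t, l in demos[:shots])
      correct = 0
      for text, gold in test_set:
          prompt = f"{header}{context}Text: {text}\nLabel:"
          correct += int(predict(prompt).strip() == gold)
      return correct / len(test_set)

  # Trivial stand-in model so the loop runs end to end; in practice `predict` calls an LLM
  # and `shots` is swept until the prompt passes ~20K tokens to look for the reported drop.
  demos = [("good movie", "A"), ("bad movie", "B")] * 100
  tests = [("great film", "A"), ("terrible film", "B")]
  for shots in (2, 20, 200):
      print(shots, many_shot_accuracy(demos, tests, lambda p: "A", shots))
  ```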
- BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences, arXiv, 2403.09347, arxiv, pdf, citation: -1
  Sun Ao, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun, Shengnan Wang, Teng Su
  Optimizes distributed attention over long sequences in Transformer-based models, cutting communication overhead by 40% and doubling processing speed on GPUs; a conceptual sketch of the underlying block-wise attention idea follows below.
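  The paper's actual partitioning and communication scheme is described in the reference above; as a minimal single-process illustration of the building block such distributed methods share, the sketch below computes attention over key/value blocks with an online softmax, so the full attention matrix is never materialized. The function name and block size are illustrative, not from the paper.

  ```python
  # Block-wise attention with an online softmax (conceptual sketch, not BurstAttention itself).
  import numpy as np

  def blockwise_attention(q, k, v, block_size=256):
      """q: (Lq, d); k, v: (Lk, d). Returns softmax(q k^T / sqrt(d)) v, one key block at a time."""
      scale = 1.0 / np.sqrt(q.shape[-1])
      out = np.zeros_like(q)                      # running weighted sum of values
      row_max = np.full(q.shape[0], -np.inf)      # running max logit per query (numerical stability)
      row_sum = np.zeros(q.shape[0])              # running softmax denominator per query
      for start in range(0, k.shape[0], block_size):
          kb, vb = k[start:start + block_size], v[start:start + block_size]
          logits = (q @ kb.T) * scale             # scores against this key block only
          new_max = np.maximum(row_max, logits.max(axis=-1))
          correction = np.exp(row_max - new_max)  # rescale previously accumulated statistics
          probs = np.exp(logits - new_max[:, None])
          out = out * correction[:, None] + probs @ vb
          row_sum = row_sum * correction + probs.sum(axis=-1)
          row_max = new_max
      return out / row_sum[:, None]

  # Sanity check against the naive full-matrix computation.
  rng = np.random.default_rng(0)
  q, k, v = rng.normal(size=(128, 64)), rng.normal(size=(1024, 64)), rng.normal(size=(1024, 64))
  scores = (q @ k.T) / np.sqrt(64)
  weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
  naive = (weights / weights.sum(axis=-1, keepdims=True)) @ v
  assert np.allclose(blockwise_attention(q, k, v), naive)
  ```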
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context, arXiv, 2403.05530, arxiv, pdf, citation: -1
  Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, Jean-baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser
- Resonance RoPE: Improving Context Length Generalization of Large Language Models, arXiv, 2403.00071, arxiv, pdf, citation: -1
  Suyuchen Wang, Ivan Kobyzev, Peng Lu, Mehdi Rezagholizadeh, Bang Liu
- Long-Context Language Modeling with Parallel Context Encoding, arXiv, 2402.16617, arxiv, pdf, citation: -1
  Howard Yen, Tianyu Gao, Danqi Chen · (qbitai) · (cepe - princeton-nlp)
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts, arXiv, 2402.09727, arxiv, pdf, citation: -1
  Kuang-Huei Lee, Xinyun Chen, Hiroki Furuta, John Canny, Ian Fischer · (mp.weixin.qq)
- Training-Free Long-Context Scaling of Large Language Models, arXiv, 2402.17463, arxiv, pdf, citation: -1
  Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, Lingpeng Kong · (ChunkLlama - HKUNLP)
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens, arXiv, 2402.13753, arxiv, pdf, citation: -1
  Yiran Ding, Li Lyna Zhang, Chengruidong Zhang, Yuanyuan Xu, Ning Shang, Jiahang Xu, Fan Yang, Mao Yang
- $\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens, arXiv, 2402.13718, arxiv, pdf, citation: -1
  Xinrong Zhang, Yingfa Chen, Shengding Hu, Zihang Xu, Junhao Chen, Moo Khai Hao, Xu Han, Zhen Leng Thai, Shuo Wang, Zhiyuan Liu
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration, arXiv, 2402.11550, arxiv, pdf, citation: -1
  Jun Zhao, Can Zu, Hao Xu, Yi Lu, Wei He, Yiwen Ding, Tao Gui, Qi Zhang, Xuanjing Huang
- InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory, arXiv, 2402.04617, arxiv, pdf, citation: -1
  Chaojun Xiao, Pengle Zhang, Xu Han, Guangxuan Xiao, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Song Han, Maosong Sun · (InfLLM - thunlp) · (mp.weixin.qq)
- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss, arXiv, 2402.10790, arxiv, pdf, citation: -1
  Yuri Kuratov, Aydar Bulatov, Petr Anokhin, Dmitry Sorokin, Artyom Sorokin, Mikhail Burtsev
- Data Engineering for Scaling Language Models to 128K Context, arXiv, 2402.10171, arxiv, pdf, citation: -1
  Yao Fu, Rameswar Panda, Xinyao Niu, Xiang Yue, Hannaneh Hajishirzi, Yoon Kim, Hao Peng · (long-context-data-engineering - franxyao)
- Transformers Can Achieve Length Generalization But Not Robustly, arXiv, 2402.09371, arxiv, pdf, citation: -1
  Yongchao Zhou, Uri Alon, Xinyun Chen, Xuezhi Wang, Rishabh Agarwal, Denny Zhou
- KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization, arXiv, 2401.18079, arxiv, pdf, citation: -1
  Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami
- Long-Context-Data-Engineering - FranxYao
  Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context".
- LongAlign: A Recipe for Long Context Alignment of Large Language Models, arXiv, 2401.18058, arxiv, pdf, citation: -1
  Yushi Bai, Xin Lv, Jiajie Zhang, Yuze He, Ji Qi, Lei Hou, Jie Tang, Yuxiao Dong, Juanzi Li · (LongAlign - THUDM)
- With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation, arXiv, 2401.11504, arxiv, pdf, citation: -1
  Y. Wang, D. Ma, D. Cai · (zhuanlan.zhihu)
- E^2-LLM: Efficient and Extreme Length Extension of Large Language Models, arXiv, 2401.06951, arxiv, pdf, citation: -1
  Jiaheng Liu, Zhiqi Bai, Yuanxing Zhang, Chenchen Zhang, Yu Zhang, Ge Zhang, Jiakai Wang, Haoran Que, Yukang Chen, Wenbo Su
- Extending LLMs' Context Window with 100 Samples, arXiv, 2401.07004, arxiv, pdf, citation: -1
  Yikai Zhang, Junlong Li, Pengfei Liu · (Entropy-ABF - GAIR-NLP)
- Transformers are Multi-State RNNs, arXiv, 2401.06104, arxiv, pdf, citation: -1
  Matanel Oren, Michael Hassid, Yossi Adi, Roy Schwartz · (TOVA - schwartz-lab-NLP)
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models, arXiv, 2401.04658, arxiv, pdf, citation: -1
  Zhen Qin, Weigao Sun, Dong Li, Xuyang Shen, Weixuan Sun, Yiran Zhong · (lightning-attention - OpenNLPLab)
- Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache, arXiv, 2401.02669, arxiv, pdf, citation: -1
  Bin Lin, Tao Peng, Chen Zhang, Minmin Sun, Lanbo Li, Hanyu Zhao, Wencong Xiao, Qi Xu, Xiafei Qiu, Shen Li
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning, arXiv, 2401.01325, arxiv, pdf, citation: -1
  Hongye Jin, Xiaotian Han, Jingfeng Yang, Zhimeng Jiang, Zirui Liu, Chia-Yuan Chang, Huiyuan Chen, Xia Hu · (qbitai)
- Cached Transformers: Improving Transformers with Differentiable Memory Cache, arXiv, 2312.12742, arxiv, pdf, citation: -1
  Zhaoyang Zhang, Wenqi Shao, Yixiao Ge, Xiaogang Wang, Jinwei Gu, Ping Luo
- Extending Context Window of Large Language Models via Semantic Compression, arXiv, 2312.09571, arxiv, pdf, citation: -1
  Weizhi Fei, Xueyan Niu, Pingyi Zhou, Lu Hou, Bo Bai, Lei Deng, Wei Han
- Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention, arXiv, 2312.08618, arxiv, pdf, citation: -1
  Kaiqiang Song, Xiaoyang Wang, Sangwoo Cho, Xiaoman Pan, Dong Yu
- Ultra-Long Sequence Distributed Transformer, arXiv, 2311.02382, arxiv, pdf, citation: -1
  Xiao Wang, Isaac Lyngaas, Aristeidis Tsaris, Peng Chen, Sajal Dash, Mayanka Chandra Shekar, Tao Luo, Hong-Jun Yoon, Mohamed Wahib, John Gouley
- HyperAttention: Long-context Attention in Near-Linear Time, arXiv, 2310.05869, arxiv, pdf, citation: 2
  Insu Han, Rajesh Jayaram, Amin Karbasi, Vahab Mirrokni, David P. Woodruff, Amir Zandieh
- CLEX: Continuous Length Extrapolation for Large Language Models, arXiv, 2310.16450, arxiv, pdf, citation: -1
  Guanzheng Chen, Xin Li, Zaiqiao Meng, Shangsong Liang, Lidong Bing
- TRAMS: Training-free Memory Selection for Long-range Language Modeling, arXiv, 2310.15494, arxiv, pdf, citation: -1
  Haofei Yu, Cunxiang Wang, Yue Zhang, Wei Bi
- Ring Attention with Blockwise Transformers for Near-Infinite Context, arXiv, 2310.01889, arxiv, pdf, citation: 6
  Hao Liu, Matei Zaharia, Pieter Abbeel · (RingAttention - lhao499)
- Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading, arXiv, 2310.05029, arxiv, pdf, citation: -1
  Howard Chen, Ramakanth Pasunuru, Jason Weston, Asli Celikyilmaz · (mp.weixin.qq)
- Scaling Laws of RoPE-based Extrapolation, arXiv, 2310.05209, arxiv, pdf, citation: -1
  Xiaoran Liu, Hang Yan, Shuo Zhang, Chenxin An, Xipeng Qiu, Dahua Lin · (qbitai)
- EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation, arXiv, 2310.08185, arxiv, pdf, citation: -1
  Wang You, Wenshan Wu, Yaobo Liang, Shaoguang Mao, Chenfei Wu, Maosong Cao, Yuzhe Cai, Yiduo Guo, Yan Xia, Furu Wei
- CoCA: Fusing position embedding with Collinear Constrained Attention for fine-tuning free context window extending, arXiv, 2309.08646, arxiv, pdf, citation: -1
  Shiyi Zhu, Jing Ye, Wei Jiang, Qi Zhang, Yifan Wu, Jianguo Li · (Collinear-Constrained-Attention - codefuse-ai) · (jiqizhixin)
- PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training, arXiv, 2309.10400, arxiv, pdf, citation: -1
  Dawei Zhu, Nan Yang, Liang Wang, Yifan Song, Wenhao Wu, Furu Wei, Sujian Li · (PoSE - dwzhu-pku)
- Effective Long-Context Scaling of Foundation Models, arXiv, 2309.16039, arxiv, pdf, citation: 1
  Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz · (qbitai)
- LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models, arXiv, 2308.16137, arxiv, pdf, citation: 3
  Chi Han, Qifan Wang, Wenhan Xiong, Yu Chen, Heng Ji, Sinong Wang
- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models, arXiv, 2309.14509, arxiv, pdf, citation: -1
  Sam Ade Jacobs, Masahiro Tanaka, Chengming Zhang, Minjia Zhang, Shuaiwen Leon Song, Samyam Rajbhandari, Yuxiong He
- YaRN: Efficient Context Window Extension of Large Language Models, arXiv, 2309.00071, arxiv, pdf, citation: 9
  Bowen Peng, Jeffrey Quesnelle, Honglu Fan, Enrico Shippole · (yarn - jquesnelle) · (jiqizhixin)
- In-context Autoencoder for Context Compression in a Large Language Model, arXiv, 2307.06945, arxiv, pdf, citation: 4
  Tao Ge, Jing Hu, Lei Wang, Xun Wang, Si-Qing Chen, Furu Wei
- Focused Transformer: Contrastive Training for Context Scaling, arXiv, 2307.03170, arxiv, pdf, citation: 12
  Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Miłoś
- Lost in the Middle: How Language Models Use Long Contexts, arXiv, 2307.03172, arxiv, pdf, citation: 64
  Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang
- LongNet: Scaling Transformers to 1,000,000,000 Tokens, arXiv, 2307.02486, arxiv, pdf, citation: 15
  Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei
- Extending Context Window of Large Language Models via Positional Interpolation, arXiv, 2306.15595, arxiv, pdf, citation: 36
  Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian · (qbitai)
- The Impact of Positional Encoding on Length Generalization in Transformers, arXiv, 2305.19466, arxiv, pdf, citation: 5
  Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- Long-range Language Modeling with Self-retrieval, arXiv, 2306.13421, arxiv, pdf, citation: 3
  Ohad Rubin, Jonathan Berant
- Block-State Transformers, arXiv, 2306.09539, arxiv, pdf, citation: 2
  Mahan Fathi, Jonathan Pilault, Orhan Firat, Christopher Pal, Pierre-Luc Bacon, Ross Goroshin
- LeanDojo: Theorem Proving with Retrieval-Augmented Language Models, arXiv, 2306.15626, arxiv, pdf, citation: 14
  Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar
- GLIMMER: generalized late-interaction memory reranker, arXiv, 2306.10231, arxiv, pdf, citation: 1
  Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Sumit Sanghai, William W. Cohen, Joshua Ainslie
- Augmenting Language Models with Long-Term Memory, arXiv, 2306.07174, arxiv, pdf, citation: 7
  Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei · (aka)
- Sequence Parallelism: Long Sequence Training from System Perspective, arXiv, 2105.13120, arxiv, pdf, citation: 2
  Shenggui Li, Fuzhao Xue, Chaitanya Baranwal, Yongbin Li, Yang You
- EasyContext - jzhang38
  Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. · (twitter)
- LLMLingua - microsoft
  Speeds up LLM inference and helps LLMs perceive key information by compressing the prompt and KV cache, achieving up to 20x compression with minimal performance loss; a brief usage sketch follows below.
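  A minimal usage sketch, assuming the pip-installable llmlingua package and its PromptCompressor interface; the file name, question, and token budget are illustrative, and argument details should be checked against the repository's README.

  ```python
  # Compress a long context before sending it to a downstream LLM (illustrative values).
  from llmlingua import PromptCompressor

  compressor = PromptCompressor()  # loads the language model LLMLingua uses to score token importance
  long_context = open("meeting_transcript.txt").read()  # hypothetical long document
  result = compressor.compress_prompt(
      long_context,
      instruction="Answer based on the transcript.",
      question="What action items were agreed on?",
      target_token=500,  # rough budget for the compressed prompt
  )
  print(result["compressed_prompt"])  # pass this shorter prompt to the downstream LLM
  ```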
- long-context - abacusai
  This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and benchmark tasks that evaluate a model's information retrieval capabilities with context expansion, as well as key experimental results and instructions for reproducing and building on them.
- long_llama - cstankonrad
  LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
- A failed experiment: Infini-Attention, and why we should keep trying?
- Long Context RAG Performance of LLMs | Databricks Blog · (x)
- Influential papers from the recent past, exploring efficient context-window increase of LLMs
- Unlocking Longer Generation with Key-Value Cache Quantization
- Unsloth - 4x longer context windows & 1.7x larger batch sizes
- How Do Language Models put Attention Weights over Long Context?
- Understanding data influence on context scaling: a close look at baseline solution