- In-Context Learning with Long-Context Models: An In-Depth Exploration, arXiv, 2405.00200, arxiv, pdf, citation: -1
  Amanda Bertsch, Maor Ivgi, Uri Alon, Jonathan Berant, Matthew R. Gormley, Graham Neubig
- Length Extrapolation of Transformers: A Survey from the Perspective of Position Encoding, arXiv, 2312.17044, arxiv, pdf, citation: -1
  Liang Zhao, Xiaocheng Feng, Xiachong Feng, Bing Qin, Ting Liu · (jiqizhixin)
- Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey, arXiv, 2311.12351, arxiv, pdf, citation: -1
  Yunpeng Huang, Jingwei Xu, Zixu Jiang, Junyu Lai, Zenan Li, Yuan Yao, Taolue Chen, Lijuan Yang, Zhou Xin, Xiaoxing Ma · (long-llms-learning - Strivin0311) · (mp.weixin.qq)
- NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?, arXiv, 2407.11963, arxiv, pdf, citation: -1
  Mo Li, Songyang Zhang, Yunxin Liu, Kai Chen · (opencompass - open-compass)
- Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems, arXiv, 2407.01370, arxiv, pdf, citation: -1
  Philippe Laban, Alexander R. Fabbri, Caiming Xiong, Chien-Sheng Wu
- Counting-Stars: A Simple, Efficient, and Reasonable Strategy for Evaluating Long-Context Large Language Models, arXiv, 2403.11802, arxiv, pdf, citation: -1
  Mingyang Song, Mao Zheng, Xuan Luo · (Counting-Stars - nick7nlp) · (mp.weixin.qq)
- Evaluating Very Long-Term Conversational Memory of LLM Agents, arXiv, 2402.17753, arxiv, pdf, citation: -1
  Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, Yuwei Fang
- LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs, arXiv, 2408.07055, arxiv, pdf, citation: -1
  Yushi Bai, Jiajie Zhang, Xin Lv, Linzhi Zheng, Siqi Zhu, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li · (LongWriter - THUDM)
- ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities, arXiv, 2407.14482, arxiv, pdf, citation: -1
  Peng Xu, Wei Ping, Xianchao Wu, Zihan Liu, Mohammad Shoeybi, Bryan Catanzaro
- Human-like Episodic Memory for Infinite Context LLMs, arXiv, 2407.09450, arxiv, pdf, citation: -1
  Zafeirios Fountas, Martin A Benfeghoul, Adnan Oomerjee, Fenia Christopoulou, Gerasimos Lampouras, Haitham Bou-Ammar, Jun Wang
- Associative Recurrent Memory Transformer, arXiv, 2407.04841, arxiv, pdf, citation: -1
  Ivan Rodkin, Yuri Kuratov, Aydar Bulatov, Mikhail Burtsev
- Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP, arXiv, 2407.00402, arxiv, pdf, citation: -1
  Omer Goldman, Alon Jacovi, Aviv Slobodkin, Aviya Maimon, Ido Dagan, Reut Tsarfaty
- Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers, arXiv, 2406.16747, arxiv, pdf, citation: -1
  Chao Lou, Zixia Jia, Zilong Zheng, Kewei Tu
- Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon, arXiv, 2406.17746, arxiv, pdf, citation: -1
  USVSN Sai Prashanth, Alvin Deng, Kyle O'Brien, Jyothir S V, Mohammad Aflah Khan, Jaydeep Borkar, Christopher A. Choquette-Choo, Jacob Ray Fuehne, Stella Biderman, Tracy Ke
- Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?, arXiv, 2406.13121, arxiv, pdf, citation: -1
  Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia · (loft - google-deepmind)
- THEANINE: Revisiting Memory Management in Long-term Conversations with Timeline-augmented Response Generation, arXiv, 2406.10996, arxiv, pdf, citation: -1
  Seo Hyun Kim, Kai Tzu-iunn Ong, Taeyoon Kwon, Namyoung Kim, Keummin Ka, SeongHyeon Bae, Yohan Jo, Seung-won Hwang, Dongha Lee, Jinyoung Yeo · (theanine-693b0.web)
- Recurrent Context Compression: Efficiently Expanding the Context Window of LLM, arXiv, 2406.06110, arxiv, pdf, citation: -1
  Chensen Huang, Guibo Zhu, Xuepeng Wang, Yifei Luo, Guojing Ge, Haoran Chen, Dong Yi, Jinqiao Wang · (RCC_Transformer - WUHU-G)
- Contextual Position Encoding: Learning to Count What's Important, arXiv, 2405.18719, arxiv, pdf, citation: -1
  Olga Golovneva, Tianlu Wang, Jason Weston, Sainbayar Sukhbaatar
- Are Long-LLMs A Necessity For Long-Context Tasks?, arXiv, 2405.15318, arxiv, pdf, citation: -1
  Hongjin Qian, Zheng Liu, Peitian Zhang, Kelong Mao, Yujia Zhou, Xu Chen, Zhicheng Dou
- Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis, arXiv, 2405.08944, arxiv, pdf, citation: -1
  Yao Fu
- Extending Llama-3's Context Ten-Fold Overnight, arXiv, 2404.19553, arxiv, pdf, citation: -1
  Peitian Zhang, Ninglu Shao, Zheng Liu, Shitao Xiao, Hongjin Qian, Qiwei Ye, Zhicheng Dou · (FlagEmbedding - FlagOpen)
- Make Your LLM Fully Utilize the Context, arXiv, 2404.16811, arxiv, pdf, citation: -1
  Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou · (FILM - microsoft)
- SnapKV: LLM Knows What You are Looking for Before Generation, arXiv, 2404.14469, arxiv, pdf, citation: -1
  Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen · (SnapKV - FasterDecoding)
- LongEmbed: Extending Embedding Models for Long Context Retrieval, arXiv, 2404.12096, arxiv, pdf, citation: -1
  Dawei Zhu, Liang Wang, Nan Yang, Yifan Song, Wenhao Wu, Furu Wei, Sujian Li · (LongEmbed - dwzhu-pku)
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length, arXiv, 2404.08801, arxiv, pdf, citation: -1
  Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou · (megalodon - XuezheMax)
- TransformerFAM: Feedback attention is working memory, arXiv, 2404.09173, arxiv, pdf, citation: -1
  Dongseong Hwang, Weiran Wang, Zhuoyuan Huo, Khe Chai Sim, Pedro Moreno Mengibar
- LLoCO: Learning Long Contexts Offline, arXiv, 2404.07979, arxiv, pdf, citation: -1
  Sijun Tan, Xiuyu Li, Shishir Patil, Ziyang Wu, Tianjun Zhang, Kurt Keutzer, Joseph E. Gonzalez, Raluca Ada Popa · (lloco - jeffreysijuntan)
- RULER: What's the Real Context Size of Your Long-Context Language Models?, arXiv, 2404.06654, arxiv, pdf, citation: -1
  Cheng-Ping Hsieh, Simeng Sun, Samuel Kriman, Shantanu Acharya, Dima Rekesh, Fei Jia, Boris Ginsburg
- Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention, arXiv, 2404.07143, arxiv, pdf, citation: -1
  Tsendsuren Munkhdalai, Manaal Faruqui, Siddharth Gopal
- Long-context LLMs Struggle with Long In-context Learning, arXiv, 2404.02060, arxiv, pdf, citation: -1
  Tianle Li, Ge Zhang, Quy Duc Do, Xiang Yue, Wenhu Chen
  LLMs perform reasonably well up to about 20K tokens of context, but once the context window exceeds 20K the performance of most models, with the notable exception of GPT-4, drops sharply; a rough evaluation sketch follows below.
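  For orientation, here is a minimal sketch of the kind of probe behind this finding: classification accuracy is measured as the number of in-context demonstrations, and hence the prompt length, grows. The prompt template, the toy data, and the `predict` stand-in are illustrative assumptions, not the benchmark's own code.

  ```python
  # Hypothetical probe: accuracy vs. number of in-context demonstrations (prompt length).
  # `predict` is a placeholder for whatever long-context model call you use (API or local).
  from typing import Callable, List, Tuple

  def many_shot_accuracy(
      demos: List[Tuple[str, str]],      # (text, label) pairs used as in-context examples
      test_set: List[Tuple[str, str]],   # held-out (text, label) pairs to score
      predict: Callable[[str], str],     # prompt -> predicted label
      shots: int,
  ) -> float:
      header = "Classify the text into one of the labels seen in the examples.\n\n"
      context = "".join(f"Text: {t}\nLabel: {l}\n\n" for t, l in demos[:shots])
      correct = 0
      for text, gold in test_set:
          prompt = f"{header}{context}Text: {text}\nLabel:"
          correct += int(predict(prompt).strip() == gold)
      return correct / len(test_set)

  # Trivial stand-in model so the loop runs end to end; in practice `predict` calls an LLM
  # and `shots` is swept until the prompt passes ~20K tokens to look for the reported drop.
  demos = [("good movie", "A"), ("bad movie", "B")] * 100
  tests = [("great film", "A"), ("terrible film", "B")]
  for shots in (2, 20, 200):
      print(shots, many_shot_accuracy(demos, tests, lambda p: "A", shots))
  ```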
- BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences, arXiv, 2403.09347, arxiv, pdf, citation: -1
  Sun Ao, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun, Shengnan Wang, Teng Su
  Optimizes distributed attention over long sequences in Transformer-based models, cutting communication overhead by 40% and doubling processing speed on GPUs; a conceptual sketch of the underlying block-wise attention idea follows below.
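  The paper's actual partitioning and communication scheme is described in the reference above; as a minimal single-process illustration of the building block such distributed methods share, the sketch below computes attention over key/value blocks with an online softmax, so the full attention matrix is never materialized. The function name and block size are illustrative, not from the paper.

  ```python
  # Block-wise attention with an online softmax (conceptual sketch, not BurstAttention itself).
  import numpy as np

  def blockwise_attention(q, k, v, block_size=256):
      """q: (Lq, d); k, v: (Lk, d). Returns softmax(q k^T / sqrt(d)) v, one key block at a time."""
      scale = 1.0 / np.sqrt(q.shape[-1])
      out = np.zeros_like(q)                      # running weighted sum of values
      row_max = np.full(q.shape[0], -np.inf)      # running max logit per query (numerical stability)
      row_sum = np.zeros(q.shape[0])              # running softmax denominator per query
      for start in range(0, k.shape[0], block_size):
          kb, vb = k[start:start + block_size], v[start:start + block_size]
          logits = (q @ kb.T) * scale             # scores against this key block only
          new_max = np.maximum(row_max, logits.max(axis=-1))
          correction = np.exp(row_max - new_max)  # rescale previously accumulated statistics
          probs = np.exp(logits - new_max[:, None])
          out = out * correction[:, None] + probs @ vb
          row_sum = row_sum * correction + probs.sum(axis=-1)
          row_max = new_max
      return out / row_sum[:, None]

  # Sanity check against the naive full-matrix computation.
  rng = np.random.default_rng(0)
  q, k, v = rng.normal(size=(128, 64)), rng.normal(size=(1024, 64)), rng.normal(size=(1024, 64))
  scores = (q @ k.T) / np.sqrt(64)
  weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
  naive = (weights / weights.sum(axis=-1, keepdims=True)) @ v
  assert np.allclose(blockwise_attention(q, k, v), naive)
  ```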
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context, arXiv, 2403.05530, arxiv, pdf, citation: -1
  Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, Jean-baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser
- Resonance RoPE: Improving Context Length Generalization of Large Language Models, arXiv, 2403.00071, arxiv, pdf, citation: -1
  Suyuchen Wang, Ivan Kobyzev, Peng Lu, Mehdi Rezagholizadeh, Bang Liu
- Long-Context Language Modeling with Parallel Context Encoding, arXiv, 2402.16617, arxiv, pdf, citation: -1
  Howard Yen, Tianyu Gao, Danqi Chen · (qbitai) · (cepe - princeton-nlp)
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts, arXiv, 2402.09727, arxiv, pdf, citation: -1
  Kuang-Huei Lee, Xinyun Chen, Hiroki Furuta, John Canny, Ian Fischer · (mp.weixin.qq)
- Training-Free Long-Context Scaling of Large Language Models, arXiv, 2402.17463, arxiv, pdf, citation: -1
  Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, Lingpeng Kong · (ChunkLlama - HKUNLP)
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens, arXiv, 2402.13753, arxiv, pdf, citation: -1
  Yiran Ding, Li Lyna Zhang, Chengruidong Zhang, Yuanyuan Xu, Ning Shang, Jiahang Xu, Fan Yang, Mao Yang
- $\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens, arXiv, 2402.13718, arxiv, pdf, citation: -1
  Xinrong Zhang, Yingfa Chen, Shengding Hu, Zihang Xu, Junhao Chen, Moo Khai Hao, Xu Han, Zhen Leng Thai, Shuo Wang, Zhiyuan Liu
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration, arXiv, 2402.11550, arxiv, pdf, citation: -1
  Jun Zhao, Can Zu, Hao Xu, Yi Lu, Wei He, Yiwen Ding, Tao Gui, Qi Zhang, Xuanjing Huang
- InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory, arXiv, 2402.04617, arxiv, pdf, citation: -1
  Chaojun Xiao, Pengle Zhang, Xu Han, Guangxuan Xiao, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Song Han, Maosong Sun · (InfLLM - thunlp) · (mp.weixin.qq)
- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss, arXiv, 2402.10790, arxiv, pdf, citation: -1
  Yuri Kuratov, Aydar Bulatov, Petr Anokhin, Dmitry Sorokin, Artyom Sorokin, Mikhail Burtsev
- Data Engineering for Scaling Language Models to 128K Context, arXiv, 2402.10171, arxiv, pdf, citation: -1
  Yao Fu, Rameswar Panda, Xinyao Niu, Xiang Yue, Hannaneh Hajishirzi, Yoon Kim, Hao Peng · (long-context-data-engineering - franxyao)
- Transformers Can Achieve Length Generalization But Not Robustly, arXiv, 2402.09371, arxiv, pdf, citation: -1
  Yongchao Zhou, Uri Alon, Xinyun Chen, Xuezhi Wang, Rishabh Agarwal, Denny Zhou
- KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization, arXiv, 2401.18079, arxiv, pdf, citation: -1
  Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami
- Long-Context-Data-Engineering - FranxYao
  Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context".
- LongAlign: A Recipe for Long Context Alignment of Large Language Models, arXiv, 2401.18058, arxiv, pdf, citation: -1
  Yushi Bai, Xin Lv, Jiajie Zhang, Yuze He, Ji Qi, Lei Hou, Jie Tang, Yuxiao Dong, Juanzi Li · (LongAlign - THUDM)
- With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation, arXiv, 2401.11504, arxiv, pdf, citation: -1
  Y. Wang, D. Ma, D. Cai · (zhuanlan.zhihu)
- E^2-LLM: Efficient and Extreme Length Extension of Large Language Models, arXiv, 2401.06951, arxiv, pdf, citation: -1
  Jiaheng Liu, Zhiqi Bai, Yuanxing Zhang, Chenchen Zhang, Yu Zhang, Ge Zhang, Jiakai Wang, Haoran Que, Yukang Chen, Wenbo Su
- Extending LLMs' Context Window with 100 Samples, arXiv, 2401.07004, arxiv, pdf, citation: -1
  Yikai Zhang, Junlong Li, Pengfei Liu · (Entropy-ABF - GAIR-NLP)
- Transformers are Multi-State RNNs, arXiv, 2401.06104, arxiv, pdf, citation: -1
  Matanel Oren, Michael Hassid, Yossi Adi, Roy Schwartz · (TOVA - schwartz-lab-NLP)
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models, arXiv, 2401.04658, arxiv, pdf, citation: -1
  Zhen Qin, Weigao Sun, Dong Li, Xuyang Shen, Weixuan Sun, Yiran Zhong · (lightning-attention - OpenNLPLab)
- Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache, arXiv, 2401.02669, arxiv, pdf, citation: -1
  Bin Lin, Tao Peng, Chen Zhang, Minmin Sun, Lanbo Li, Hanyu Zhao, Wencong Xiao, Qi Xu, Xiafei Qiu, Shen Li
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning, arXiv, 2401.01325, arxiv, pdf, citation: -1
  Hongye Jin, Xiaotian Han, Jingfeng Yang, Zhimeng Jiang, Zirui Liu, Chia-Yuan Chang, Huiyuan Chen, Xia Hu · (qbitai)
- Cached Transformers: Improving Transformers with Differentiable Memory Cache, arXiv, 2312.12742, arxiv, pdf, citation: -1
  Zhaoyang Zhang, Wenqi Shao, Yixiao Ge, Xiaogang Wang, Jinwei Gu, Ping Luo
- Extending Context Window of Large Language Models via Semantic Compression, arXiv, 2312.09571, arxiv, pdf, citation: -1
  Weizhi Fei, Xueyan Niu, Pingyi Zhou, Lu Hou, Bo Bai, Lei Deng, Wei Han
- Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention, arXiv, 2312.08618, arxiv, pdf, citation: -1
  Kaiqiang Song, Xiaoyang Wang, Sangwoo Cho, Xiaoman Pan, Dong Yu
- Ultra-Long Sequence Distributed Transformer, arXiv, 2311.02382, arxiv, pdf, citation: -1
  Xiao Wang, Isaac Lyngaas, Aristeidis Tsaris, Peng Chen, Sajal Dash, Mayanka Chandra Shekar, Tao Luo, Hong-Jun Yoon, Mohamed Wahib, John Gouley
- HyperAttention: Long-context Attention in Near-Linear Time, arXiv, 2310.05869, arxiv, pdf, citation: 2
  Insu Han, Rajesh Jayaram, Amin Karbasi, Vahab Mirrokni, David P. Woodruff, Amir Zandieh
- CLEX: Continuous Length Extrapolation for Large Language Models, arXiv, 2310.16450, arxiv, pdf, citation: -1
  Guanzheng Chen, Xin Li, Zaiqiao Meng, Shangsong Liang, Lidong Bing
- TRAMS: Training-free Memory Selection for Long-range Language Modeling, arXiv, 2310.15494, arxiv, pdf, citation: -1
  Haofei Yu, Cunxiang Wang, Yue Zhang, Wei Bi
- Ring Attention with Blockwise Transformers for Near-Infinite Context, arXiv, 2310.01889, arxiv, pdf, citation: 6
  Hao Liu, Matei Zaharia, Pieter Abbeel · (RingAttention - lhao499)
- Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading, arXiv, 2310.05029, arxiv, pdf, citation: -1
  Howard Chen, Ramakanth Pasunuru, Jason Weston, Asli Celikyilmaz · (mp.weixin.qq)
- Scaling Laws of RoPE-based Extrapolation, arXiv, 2310.05209, arxiv, pdf, citation: -1
  Xiaoran Liu, Hang Yan, Shuo Zhang, Chenxin An, Xipeng Qiu, Dahua Lin · (qbitai)
- EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation, arXiv, 2310.08185, arxiv, pdf, citation: -1
  Wang You, Wenshan Wu, Yaobo Liang, Shaoguang Mao, Chenfei Wu, Maosong Cao, Yuzhe Cai, Yiduo Guo, Yan Xia, Furu Wei
- CoCA: Fusing position embedding with Collinear Constrained Attention for fine-tuning free context window extending, arXiv, 2309.08646, arxiv, pdf, citation: -1
  Shiyi Zhu, Jing Ye, Wei Jiang, Qi Zhang, Yifan Wu, Jianguo Li · (Collinear-Constrained-Attention - codefuse-ai) · (jiqizhixin)
- PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training, arXiv, 2309.10400, arxiv, pdf, citation: -1
  Dawei Zhu, Nan Yang, Liang Wang, Yifan Song, Wenhao Wu, Furu Wei, Sujian Li · (PoSE - dwzhu-pku)
- Effective Long-Context Scaling of Foundation Models, arXiv, 2309.16039, arxiv, pdf, citation: 1
  Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz · (qbitai)
- LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models, arXiv, 2308.16137, arxiv, pdf, citation: 3
  Chi Han, Qifan Wang, Wenhan Xiong, Yu Chen, Heng Ji, Sinong Wang
- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models, arXiv, 2309.14509, arxiv, pdf, citation: -1
  Sam Ade Jacobs, Masahiro Tanaka, Chengming Zhang, Minjia Zhang, Shuaiwen Leon Song, Samyam Rajbhandari, Yuxiong He
- YaRN: Efficient Context Window Extension of Large Language Models, arXiv, 2309.00071, arxiv, pdf, citation: 9
  Bowen Peng, Jeffrey Quesnelle, Honglu Fan, Enrico Shippole · (yarn - jquesnelle) · (jiqizhixin)
- In-context Autoencoder for Context Compression in a Large Language Model, arXiv, 2307.06945, arxiv, pdf, citation: 4
  Tao Ge, Jing Hu, Lei Wang, Xun Wang, Si-Qing Chen, Furu Wei
- Focused Transformer: Contrastive Training for Context Scaling, arXiv, 2307.03170, arxiv, pdf, citation: 12
  Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Miłoś
- Lost in the Middle: How Language Models Use Long Contexts, arXiv, 2307.03172, arxiv, pdf, citation: 64
  Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang
- LongNet: Scaling Transformers to 1,000,000,000 Tokens, arXiv, 2307.02486, arxiv, pdf, citation: 15
  Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei
- Extending Context Window of Large Language Models via Positional Interpolation, arXiv, 2306.15595, arxiv, pdf, citation: 36
  Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian · (qbitai)
- The Impact of Positional Encoding on Length Generalization in Transformers, arXiv, 2305.19466, arxiv, pdf, citation: 5
  Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- Long-range Language Modeling with Self-retrieval, arXiv, 2306.13421, arxiv, pdf, citation: 3
  Ohad Rubin, Jonathan Berant
- Block-State Transformers, arXiv, 2306.09539, arxiv, pdf, citation: 2
  Mahan Fathi, Jonathan Pilault, Orhan Firat, Christopher Pal, Pierre-Luc Bacon, Ross Goroshin
- LeanDojo: Theorem Proving with Retrieval-Augmented Language Models, arXiv, 2306.15626, arxiv, pdf, citation: 14
  Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar
- GLIMMER: generalized late-interaction memory reranker, arXiv, 2306.10231, arxiv, pdf, citation: 1
  Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Sumit Sanghai, William W. Cohen, Joshua Ainslie
- Augmenting Language Models with Long-Term Memory, arXiv, 2306.07174, arxiv, pdf, citation: 7
  Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei · (aka)
- Sequence Parallelism: Long Sequence Training from System Perspective, arXiv, 2105.13120, arxiv, pdf, citation: 2
  Shenggui Li, Fuzhao Xue, Chaitanya Baranwal, Yongbin Li, Yang You
- EasyContext - jzhang38
  Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. · (twitter)
- LLMLingua - microsoft
  Speeds up LLM inference and helps LLMs perceive key information by compressing the prompt and KV cache, achieving up to 20x compression with minimal performance loss; a brief usage sketch follows below.
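  A minimal usage sketch, assuming the pip-installable llmlingua package and its PromptCompressor interface; the file name, question, and token budget are illustrative, and argument details should be checked against the repository's README.

  ```python
  # Compress a long context before sending it to a downstream LLM (illustrative values).
  from llmlingua import PromptCompressor

  compressor = PromptCompressor()  # loads the language model LLMLingua uses to score token importance
  long_context = open("meeting_transcript.txt").read()  # hypothetical long document
  result = compressor.compress_prompt(
      long_context,
      instruction="Answer based on the transcript.",
      question="What action items were agreed on?",
      target_token=500,  # rough budget for the compressed prompt
  )
  print(result["compressed_prompt"])  # pass this shorter prompt to the downstream LLM
  ```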
- long-context - abacusai
  This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and benchmark tasks that evaluate a model's information retrieval capabilities with context expansion, as well as key experimental results and instructions for reproducing and building on them.
- long_llama - cstankonrad
  LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
- A failed experiment: Infini-Attention, and why we should keep trying?
- Long Context RAG Performance of LLMs | Databricks Blog · (x)
- Influential papers from the recent past, exploring efficient context-window increase of LLMs
- Unlocking Longer Generation with Key-Value Cache Quantization
- Unsloth - 4x longer context windows & 1.7x larger batch sizes
- How Do Language Models put Attention Weights over Long Context?
- Understanding data influence on context scaling: a close look at baseline solution