Paper List for In-context Learning

Introduction

This is a paper list (work in progress) about in-context learning (ICL).

Keywords Convention

abbreviation

section in our survey

main feature

conference

Papers

Survey

  1. A Survey for In-context Learning.

    Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, Lei Li, Zhifang Sui. [pdf], 2022.12,

Model Training for ICL

This section contains the pilot works that might contribute to the training strategies of ICL.

Pre-training

  1. MEND: meta demonstration distillation for efficient and effective in-context learning. Yichuan Li, Xiyao Ma, Sixing Lu, Kyumin Lee, Xiaohu Liu, Chenlei Guo. [pdf], [project], 2024.3,
  2. Pre-training to learn in context. Yuxian Gu, Li Dong, Furu Wei, Minlie Huang. [pdf], [project], 2023.7,
  3. In-context pretraining: Language modeling beyond document boundaries. Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Gergely Szilvasy, Rich James, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, Mike Lewis. [pdf], [project], 2023.7,

Warmup

  1. MetaICL: Learning to Learn In Context. (NAACL 2022; a pretrained language model is tuned to do in-context learning on a large set of training tasks.)

    Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi. [pdf], [project], 2021.10,

  2. Improving In-Context Few-Shot Learning via Self-Supervised Training.

    Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, Zornitsa Kozareva. [pdf], [project], 2022.5,

  3. Calibrate Before Use: Improving Few-shot Performance of Language Models.

    Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh. [pdf], [project], 2021.2,

    • Using N/A string to calibrate LMs away from common token bias
  4. Symbol tuning improves in-context learning in language models. Jerry Wei, Le Hou, Andrew Lampinen, Xiangning Chen, Da Huang, Yi Tay, Xinyun Chen, Yifeng Lu, Denny Zhou, Tengyu Ma, Quoc V. Le. [pdf], [project], 2023.5,

  5. Fine-tune language models to approximate unbiased in-context learning. Timothy Chu, Zhao Song, Chiwun Yang. [pdf], 2023.10,

  6. ICL Markup: Structuring In-Context Learning using Soft-Token Tags. Marc-Etienne Brunet, Ashton Anderson, Richard Zemel. [pdf], 2023.12,

  7. Cross-task generalization via natural language crowdsourcing instructions. Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hannaneh Hajishirzi. [pdf], [project], 2022.5,

  8. Finetuned language models are zero-shot learners. Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le. [pdf], 2021.9,

  9. Scaling instruction-finetuned language models. Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Yunxuan Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Alex Castro-Ros, Marie Pellat, Kevin Robinson, Dasha Valter, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, Jason Wei [pdf], [project], 2022.10,

  10. Super-naturalinstructions: Generalization via declarative instructions on 1600+ nlp tasks. Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Atharva Naik, Arjun Ashok, Arut Selvan Dhanasekaran, Anjana Arunkumar, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Kuntal Kumar Pal, Maitreya Patel, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Savan Doshi, Shailaja Keyur Sampat, Siddhartha Mishra, Sujan Reddy A, Sumanta Patro, Tanay Dixit, Xudong Shen [pdf], [project], 2022.4,

Prompt Tuning for ICL

This section contains the pilot works that might contribute to the prompt selection and prompt formulation strategies of ICL.

  1. On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model.

    Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, Woomyoung Park, Jung-Woo Ha, Nako Sung. [pdf], 2022.04,

    • Investigates how in-context learning performance changes as the pretraining corpus varies, examining the effects of the source and size of the pretraining corpus.
  2. Chain of Thought Prompting Elicits Reasoning in Large Language Models.

    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou. [pdf], 2022.01,

  3. Least-to-Most Prompting Enables Complex Reasoning in Large Language Models.

    Denny Zhou, Nathanael Schärli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc Le, Ed Chi. [pdf], 2022.05,

  4. Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator.

    Hyuhng Joon Kim, Hyunsoo Cho, Junyeob Kim, Taeuk Kim, Kang Min Yoo, Sang-goo Lee. [pdf], 2022.06,

  5. Iteratively Prompt Pre-trained Language Models for Chain of Thought.

    Boshi Wang, Xiang Deng, Huan Sun. [pdf], [project], 2022.03,

  6. Automatic Chain of Thought Prompting in Large Language Models.

    Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola. [pdf], [project], 2022.10,

  7. Learning To Retrieve Prompts for In-Context Learning. (NAACL 2022; learns an example retriever via contrastive learning.)

    Ohad Rubin, Jonathan Herzig, Jonathan Berant. [pdf], [project], 2022.12,

  8. Finetuned Language Models Are Zero-Shot Learners. (instruction tuning)

    Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le. [pdf], [project], 2021.09,

    • finetuning language models on a collection of tasks described via instructions
    • substantially improves zero-shot performance on unseen tasks
  9. Active Example Selection for In-Context Learning.

    Yiming Zhang, Shi Feng, Chenhao Tan. [pdf], [project], 2022.11,

  10. Prompting GPT-3 To Be Reliable

    Chenglei Si, Zhe Gan, Zhengyuan Yang, Shuohang Wang, Jianfeng Wang, Jordan Boyd-Graber, Lijuan Wang. [pdf], [project], 2022.10,

  11. An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels

    Taylor Sorensen, Joshua Robinson, Christopher Rytting, Alexander Shaw, Kyle Rogers, Alexia Delorey, Mahmoud Khalil, Nancy Fulda, David Wingate. [pdf], 2022.5,

  12. Self-adaptive In-context Learning

    Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong. [pdf], [project], 2022.12,

  13. Demystifying Prompts in Language Models via Perplexity Estimation

    Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer. [pdf], [project], 2022.12,

  14. Structured Prompting: Scaling In-Context Learning to 1,000 Examples.

    Yaru Hao, Yutao Sun, Li Dong, Zhixiong Han, Yuxian Gu, Furu Wei. [pdf], [project], 2022.12.

  15. Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity.

    Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp. [pdf], 2021.04,

  16. On the Relation between Sensitivity and Accuracy in In-context Learning.

    Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He. [pdf], 2022.09,

  17. Can language models learn from explanations in context?

    Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Matthewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill. [pdf], 2022.04

  18. Prototypical Calibration for Few-shot Learning of Language Models.

    Zhixiong Han, Yaru Hao, Li Dong, Furu Wei. [pdf], [project], 2022.05.

  19. Cross-Task Generalization via Natural Language Crowdsourcing Instructions.

    Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hannaneh Hajishirzi. [pdf], [project], 2022.03

  20. Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning.

    Xinyi Wang, Wanrong Zhu, William Yang Wang. [pdf], [project], 2023.01

Analysis of ICL

This section contains the pilot works that might contribute to the analysis of the influence factors and working mechanism of ICL.

Influence Factors for ICL

We discuss relevant research addressing what influences ICL performance, including factors both in the pretraining stage and in the inference stage.

Pre-training Stage
  1. On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model.

    Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, Woo-Myoung Park, Jung-Woo Ha, Nako Sung. [pdf], 2022.08,

  2. Data Distributional Properties Drive Emergent In-Context Learning in Transformers.

    Stephanie C.Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill. [pdf], [project], 2022.05,

  3. The learnability of in-context learning. Noam Wies, Yoav Levine, Amnon Shashua. [pdf], 2023.3,

  4. Understanding in-context learning via supportive pre-training data. Xiaochuang Han, Daniel Simig, Todor Mihaylov, Yulia Tsvetkov, Asli Celikyilmaz, Tianlu Wang. [pdf], 2023.6,

  5. Pretraining data mixtures enable narrow model selection capabilities in transformer models. Steve Yadlowsky, Lyric Doshi, Nilesh Tripuraneni. [pdf], 2023.11,

  6. Pretraining task diversity and the emergence of non-bayesian in-context learning for regression. Allan Raventós, Mansheej Paul, Feng Chen, Surya Ganguli. [pdf], [project], 2023.6,

  7. Causallm is not optimal for in-context learning. Nan Ding, Tomer Levinboim, Jialin Wu, Sebastian Goodman, Radu Soricut. [pdf], 2023.8,

  8. Emergent Abilities of Large Language Models.

    Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus. [pdf], 2022.07,

  9. Language models are few-shot learners. Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei. [pdf], 2020.5,

Inference Stage
  1. What Makes Good In-Context Examples for GPT-3?

    Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen. [pdf], 2022.08,

  2. Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters.

    Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke Zettlemoyer, Huan Sun. [pdf], [project], 2022.12,

  3. Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale.

    Hritik Bansal, Karthik Gopalakrishnan, Saket Dingliwal, Sravan Bodapati, Katrin Kirchhoff, Dan Roth. [pdf], [project], 2022.12,

  4. Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations.

    Junyeob Kim, Hyuhng Joon Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-Woo Lee, Sang-goo Lee, Kang Min Yoo, Taeuk Kim. [pdf], [project], 2022.05,

  5. What in-context learning "learns" in-context: Disentangling task recognition and task learning. Jane Pan, Tianyu Gao, Howard Chen, Danqi Chen. [pdf], [project], 2023.5,

  6. Large language models can be lazy learners: Analyze shortcuts in in-context learning. Ruixiang Tang, Dehan Kong, Longtao Huang, Hui Xue. [pdf], 2023.5,

  7. Larger language models do in-context learning differently. Jerry W. Wei, Jason Wei, Yi Tay, Dustin Tran, Albert Webson, Yifeng Lu, Xinyun Chen, Hanxiao Liu, Da Huang, Denny Zhou, Tengyu Ma. [pdf], 2023.3,

  8. Diverse Demonstrations Improve In-context Compositional Generalization. Itay Levy, Ben Bogin, Jonathan Berant. [pdf], [project], 2022.12,

  9. Fantastically ordered prompts and where to find them: Overcoming few-shot prompt order sensitivity. Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp. [pdf], 2021.4,

  10. Active example selection for in-context learning. Yiming Zhang, Shi Feng, Chenhao Tan. [pdf], [project],

  11. Lost in the middle: How language models use long contexts. Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang. [pdf], [project], 2023.7,

  12. Measuring inductive biases of in-context learning with underspecified demonstrations. Chenglei Si, Dan Friedman, Nitish Joshi, Shi Feng, Danqi Chen, He He. [pdf], [project], 2023.5,

  13. In-context learning in large language models learns label relationships but is not conventional learning. Jannik Kossen, Tom Rainforth, Yarin Gal. [pdf], 2023.7,

  14. How Does In-Context Learning Help Prompt Tuning?

    Simeng Sun, Yang Liu, Dan Iter, Chenguang Zhu, Mohit Iyyer. [pdf], 2023.02,

Working Mechanism of ICL

  1. In-context Learning and Induction Heads.

    Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova DasSarma, Tom Henighan, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Dawn Drain, Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Scott Johnston, Andy Jones, Jackson Kernion, Liane Lovitt, Kamal Ndousse, Dario Amodei, Tom Brown, Jack Clark, Jared Kaplan, Sam McCandlish, Chris Olah. [pdf], 2022.10,

  2. Birth of a transformer: A memory viewpoint. Alberto Bietti, Vivien Cabannes, Diane Bouchacourt, Hervé Jégou, Léon Bottou. [pdf], 2023.6,

  3. Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers.

    Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Zhifang Sui, Furu Wei. [pdf], [project], 2022.12

  4. The dual form of neural networks revisited: Connecting test time predictions to training patterns via spotlights of attention. Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber. [pdf], [project], 2022.2,

  5. The closeness of in-context learning and weight shifting for softmax regression. Shuai Li, Zhao Song, Yu Xia, Tong Yu, Tianyi Zhou. [pdf], 2023.4,

  6. In context learning for attention scheme: from single softmax regression to multiple softmax regression via a tensor trick. Yeqi Gao, Zhao Song, Shenghao Xie. [pdf], 2023.7,

  7. What and how does in-context learning learn? bayesian model averaging,parameterization, and generalization. Yufeng Zhang, Fengzhuo Zhang, Zhuoran Yang, Zhaoran Wang. [pdf], 2023.5,

  8. An information flow perspective for understanding in-context learning. Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun. [pdf], [project], 2023.5,

  9. An Explanation of In-context Learning as Implicit Bayesian Inference.

    Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma. [pdf], [project], 2022.08,

  10. Data Distributional Properties Drive Emergent In-Context Learning in Transformers.

    Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill. [pdf], [project], 2022.05,

  11. Transformers learn to implement preconditioned gradient descent for in-context learning. Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, Suvrit Sra. [pdf], 2023.6,

  12. In-context learning through the bayesian prism. Kabir Ahuja, Madhur Panwar, Navin Goyal. [pdf], 2023.6,

  13. What learning algorithm is in-context learning? Investigations with linear models. Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, Denny Zhou. [pdf], 2022.11,

  14. Transformers as statisticians: Provable in-context learning with in-context algorithm selection. Yu Bai, Fan Chen, Huan Wang, Caiming Xiong, Song Mei. [pdf], [project], 2023.6,

  15. Transformers learn higher-order optimization methods for in-context learning: A study with linear models. Deqing Fu, Tian-Qi Chen, Robin Jia, Vatsal Sharan. [pdf], [project], 2023.10,

  16. What Can Transformers Learn In-Context? A Case Study of Simple Function Classes. Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant. [pdf], 2022.08,

  17. A theory of emergent in-context learning as implicit structure induction. Michael Hahn, Navin Goyal. [pdf], 2023.3,

  18. Explaining emergent in-context learning as kernel regression. Chi Han, Ziqi Wang, Han Zhao, Heng Ji. [pdf], 2023.5,

  19. A latent space theory for emergent abilities in large language models. Hui Jiang. [pdf], 2023.4,

  20. Transformers as Algorithms: Generalization and Implicit Model Selection in In-context Learning. Yingcong Li, M. Emrullah Ildiz, Dimitris S. Papailiopoulos, Samet Oymak. [pdf], 2023.1

  21. One step of gradient descent is provably the optimal in-context learner with one layer of linear self-attention. Arvind Mahankali, Tatsunori B. Hashimoto, Tengyu Ma. [pdf], 2023.7,

  22. What in-context learning "learns" in-context: Disentangling task recognition and task learning. Jane Pan, Tianyu Gao, Howard Chen, Danqi Chen. [pdf], [project], 2023.5,

  23. Transformers learn in-context by gradient descent. Johannes von Oswald, Eyvind Niklasson, Ettore Randazzo, João Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, Max Vladymyrov. [pdf], 2022.12,

  24. Do pretrained transformers learn in-context by gradient descent? Lingfeng Shen, Aayush Mishra, Daniel Khashabi. [pdf], 2023.10,

  25. Large language models are implicitly topic models: Explaining and finding good demonstrations for in-context learning. Xinyi Wang, Wanrong Zhu, William Yang Wang. [pdf], [project], 2023.1,

  26. Iterative Forward Tuning Boosts In-context Learning in Language Models. Jiaxi Yang, Binyuan Hui, Min Yang, Binhua Li, Fei Huang, Yongbin Li. [pdf], [project], [demo], 2023.05

  27. Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers. Siyu Chen, Heejune Sheen, Tianhao Wang, Zhuoran Yang. [pdf], 2024.09

Evaluation and Resources

This section contains the pilot works that might contribute to the evaluation or resources of ICL.

  1. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.

    Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt et al. [pdf], [project], 2022.06,

  2. SUPER-NATURALINSTRUCTIONS: Generalization via Declarative Instructions on 1600+ NLP Tasks.

    Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit et al. [pdf], [project], 2022.04,

  3. Language Models are Multilingual Chain-of-Thought Reasoners.

    Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei. [pdf], 2022.10,

    • Evaluates the reasoning abilities of large language models in multilingual settings and introduces the Multilingual Grade School Math (MGSM) benchmark, built by manually translating 250 grade-school math problems from the GSM8K dataset into ten typologically diverse languages.
  4. Instruction Induction: From Few Examples to Natural Language Task Descriptions.

    Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy. [pdf], [project], 2022.05,

    • How to learn task instructions from input-output demonstrations.
  5. Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought. 2022.10.3

  6. What is Not in the Context? Evaluation of Few-shot Learners with Informative Demonstrations. arXiv:2212.01692

  7. Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor.

    Or Honovich, Thomas Scialom, Omer Levy, Timo Schick. [pdf], [project], 2022.12,

  8. Self-Instruct: Aligning Language Model with Self Generated Instructions.

    Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi. [pdf], [project], 2022.12,

  9. The Flan Collection: Designing Data and Methods for Effective Instruction Tuning.

    Shayne Longpre, Le Hou, Tu Vu, Albert Webson, Hyung Won Chung, Yi Tay, Denny Zhou, Quoc V. Le, Barret Zoph, Jason Wei, Adam Roberts. [pdf], [project], 2023.1,

Application

This section contains the pilot works that expand the application of ICL.

  1. Meta-learning via Language Model In-context Tuning.

    Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He. [pdf], [project], 2021.10,

  2. Does GPT-3 Generate Empathetic Dialogues? A Novel In-Context Example Selection Method and Automatic Evaluation Metric for Empathetic Dialogue Generation.

    Young-Jun Lee, Chae-Gyun Lim, Ho-Jin Choi. [pdf], 2022.10,

  3. In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models.

    Yukun Huang, Yanda Chen, Zhou Yu, Kathleen McKeown. [pdf], 2022.12,

  4. Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions

  5. Prompt-Augmented Linear Probing: Scaling Beyond the Limit of Few-shot In-Context Learner.

    Hyunsoo Cho, Hyuhng Joon Kim, Junyeob Kim, Sang-Woo Lee, Sang-goo Lee, Kang Min Yoo, Taeuk Kim. [pdf], 2022.12,

  6. In-Context Learning Unlocked for Diffusion Models. Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou. [pdf], [project], 2023.5,

  7. Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning. Songtao Liu, Zhengkai Tu, Minkai Xu, Zuobai Zhang, Lu Lin, Rex Ying, Jian Tang, Peilin Zhao, Dinghao Wu. [pdf], [project], 2023.5,

This section contains the pilot works that point out the problems of ICL.

  1. The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design.

    Yoav Levine, Noam Wies, Daniel Jannai, Dan Navon, Yedid Hoshen, Amnon Shashua. [pdf], 2021.10,

Challenges and Future Directions

This section contains the pilot works that might contribute to the challenges and future directions of ICL.

Blogs

SEO is Dead, Long Live LLMO

How does in-context learning work? A framework for understanding the differences from traditional supervised learning

Extrapolating to Unnatural Language Processing with GPT-3's In-context Learning: The Good, the Bad, and the Mysterious

More Efficient In-Context Learning with GLaM

Open-source Toolkits

OpenICL [pdf], [project], 2022.03

OpenICL provides an easy interface for in-context learning, with many state-of-the-art retrieval and inference methods built in to facilitate systematic comparison of LMs and fast research prototyping. Users can easily incorporate different retrieval and inference methods, as well as different prompt instructions into their workflow.

Contribution

Please feel free to contribute and promote your own work or other related work here! If you recommend related work on ICL or contribute to this repo, please provide your information (name, homepage) and we will add you to the contributor list 😊.

Contributor list

We thank Damai Dai, Qingxiu Dong, Lei Li, Ce Zheng, Shihao Liang, Li Dong, Siyin Wang, and Po-Chuan Chen for their repo contributions and paper recommendations.

Reference

Some papers are discussed in the following paper:

@misc{dong2022survey,
      title={A Survey for In-context Learning}, 
      author={Qingxiu Dong and Lei Li and Damai Dai and Ce Zheng and Zhiyong Wu and Baobao Chang and Xu Sun and Jingjing Xu and Lei Li and Zhifang Sui},
      year={2022},
      eprint={2301.00234},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}