-
🌟 Towards Large Reasoning Models: A Survey on Scaling LLM Reasoning Capabilities,
arXiv, 2501.09686
, arxiv, pdf, citation: -1 · Fengli Xu, Qianyue Hao, Zefang Zong, ..., Chen Gao, Yong Li
-
🌟 Test-time Computing: from System-1 Thinking to System-2 Thinking,
arXiv, 2501.02497
, arxiv, pdf, citation: -1 · Yixin Ji, Juntao Li, Hai Ye, ..., Linjian Mo, Min Zhang · (Awesome_Test_Time_LLMs - Dereck0602)
-
A Survey on LLM Inference-Time Self-Improvement,
arXiv, 2412.14352
, arxiv, pdf, citation: -1 · Xiangjue Dong, Maria Teleki, James Caverlee
-
Iterate to Accelerate: A Unified Framework for Iterative Reasoning and Feedback Convergence,
arXiv, 2502.03787
, arxiv, pdf, citation: -1 · Jacob Fein-Ashley
· (reddit)
-
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning,
arXiv, 2502.03275
, arxiv, pdf, citation: -1 · DiJia Su, Hanlin Zhu, Yingchen Xu, ..., Yuandong Tian, Qinqing Zheng
-
🌟 LIMO: Less is More for Reasoning,
arXiv, 2502.03387
, arxiv, pdf, citation: -1 · Yixin Ye, Zhen Huang, Yang Xiao, ..., Shijie Xia, Pengfei Liu · (LIMO - GAIR-NLP)
-
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate,
arXiv, 2501.17703
, arxiv, pdf, citation: -1 · Yubo Wang, Xiang Yue, Wenhu Chen · (CritiqueFineTuning - TIGER-AI-Lab)
-
Reasoning Language Models: A Blueprint,
arXiv, 2501.11223
, arxiv, pdf, citation: -1 · Maciej Besta, Julia Barth, Eric Schreiber, ..., Hubert Niewiadomski, Torsten Hoefler
-
PokerBench: Training Large Language Models to become Professional Poker Players,
arXiv, 2501.08328
, arxiv, pdf, citation: -1 · Richard Zhuang, Akshat Gupta, Richard Yang, ..., Zhengyu Li, Gopala Anumanchipalli · (pokerbench - pokerllm)
-
🌟 OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking,
arXiv, 2501.09751
, arxiv, pdf, citation: -1 · Zekun Xi, Wenbiao Yin, Jizhan Fang, ..., Fei Huang, Huajun Chen · (zjunlp.github)
-
🌟 Evolving Deeper LLM Thinking,
arXiv, 2501.09891
, arxiv, pdf, citation: -1 · Kuang-Huei Lee, Ian Fischer, Yueh-Hua Wu, ..., Dale Schuurmans, Xinyun Chen
-
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong,
arXiv, 2501.09775
, arxiv, pdf, citation: -1 · Tairan Fu, Javier Conde, Gonzalo Martínez, ..., María Grandury, Pedro Reviriego
-
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains,
arXiv, 2501.05707
, arxiv, pdf, citation: -1 · Vighnesh Subramaniam, Yilun Du, Joshua B. Tenenbaum, ..., Shuang Li, Igor Mordatch · (llm-multiagent-ft.github)
-
Towards AI Superhuman Reasoning for Math and beyond
· (youtu)
-
Aligning with Logic: Measuring, Evaluating and Improving Logical Consistency in Large Language Models,
arXiv, 2410.02205
, arxiv, pdf, citation: -1 · Yinhong Liu, Zhijiang Guo, Tianya Liang, ..., Ivan Vulić, Nigel Collier
-
🌟 Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking,
arXiv, 2403.09629
, arxiv, pdf, citation: -1 · Eric Zelikman, Georges Harik, Yijia Shao, ..., Nick Haber, Noah D. Goodman
-
🌟 Token-Budget-Aware LLM Reasoning,
arXiv, 2412.18547
, arxiv, pdf, citation: -1 · Tingxu Han, Chunrong Fang, Shiyu Zhao, ..., Zhenyu Chen, Zhenting Wang · (TALE - GeniusHTX)
· (𝕏)
-
🌟 B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners,
arXiv, 2412.17256
, arxiv, pdf, citation: -1 · Weihao Zeng, Yuzhen Huang, Lulu Zhao, ..., Zifei Shan, Junxian He
-
Deliberation in Latent Space via Differentiable Cache Augmentation,
arXiv, 2412.17747
, arxiv, pdf, citation: -1 · Luyang Liu, Jonas Pfeiffer, Jiaxing Wu, ..., Jun Xie, Arthur Szlam
-
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning,
arXiv, 2412.15797
, arxiv, pdf, citation: -1 · Sungjin Park, Xiao Liu, Yeyun Gong, ..., Edward Choi
-
🌟 Chain-of-Thought Reasoning Without Prompting,
arXiv, 2402.10200
, arxiv, pdf, citation: -1 · Xuezhi Wang, Denny Zhou · (𝕏)
-
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models,
arXiv, 2412.11605
, arxiv, pdf, citation: -1 · Jiale Cheng, Xiao Liu, Cunxiang Wang, ..., Hongning Wang, Minlie Huang · (SPaR - thu-coai)
-
🌟 Are Your LLMs Capable of Stable Reasoning?,
arXiv, 2412.13147
, arxiv, pdf, citation: -1 · Junnan Liu, Hongwei Liu, Linchen Xiao, ..., Songyang Zhang, Kai Chen · (GPassK. - open-compass)
-
Compressed Chain of Thought: Efficient Reasoning Through Dense Representations,
arXiv, 2412.13171
, arxiv, pdf, citation: -1 · Jeffrey Cheng, Benjamin Van Durme
-
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks,
arXiv, 2412.15204
, arxiv, pdf, citation: -1 · Yushi Bai, Shangqing Tu, Jiajie Zhang, ..., Jie Tang, Juanzi Li · (longbench2.github)
-
RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models,
arXiv, 2412.02830
, arxiv, pdf, citation: -1 · Hieu Tran, Zonghai Yao, Junda Wang, ..., Zhichao Yang, Hong Yu
-
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models,
arXiv, 2412.02674
, arxiv, pdf, citation: -1 · Yuda Song, Hanlin Zhang, Carson Eisenach, ..., Dean Foster, Udaya Ghai · (𝕏)
-
Frontier Models are Capable of In-context Scheming,
arXiv, 2412.04984
, arxiv, pdf, citation: -1 · Alexander Meinke, Bronson Schoen, Jérémy Scheurer, ..., Rusheb Shah, Marius Hobbhahn
-
🌟 Training Large Language Models to Reason in a Continuous Latent Space,
arXiv, 2412.06769
, arxiv, pdf, citation: -1 · Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, ..., Jason Weston, Yuandong Tian · (𝕏)
-
MALT: Improving Reasoning with Multi-Agent LLM Training,
arXiv, 2412.01928
, arxiv, pdf, citation: -1 · Sumeet Ramesh Motwani, Chandler Smith, Rocktim Jyoti Das, ..., Ronald Clark, Christian Schroeder de Witt
-
Reverse Thinking Makes LLMs Stronger Reasoners,
arXiv, 2411.19865
, arxiv, pdf, citation: -1 · Justin Chih-Yao Chen, Zifeng Wang, Hamid Palangi, ..., Chen-Yu Lee, Tomas Pfister
-
🌟 Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS,
arXiv, 2411.18478
, arxiv, pdf, citation: -1 · Jinyang Wu, Mingkuan Feng, Shuai Zhang, ..., Zengqi Wen, Jianhua Tao · (arxiv) · (jinyangwu.github)
-
DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power 𝕏
-
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games,
arXiv, 2411.13543
, arxiv, pdf, citation: -1 · Davide Paglieri, Bartłomiej Cupiał, Samuel Coward, ..., Jack Parker-Holder, Tim Rocktäschel
-
🌟 Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding,
arXiv, 2411.04282
, arxiv, pdf, citation: -1 · Haolin Chen, Yihao Feng, Zuxin Liu, ..., Caiming Xiong, Huan Wang · (LaTRO - SalesforceAIResearch)
-
Large Language Models Can Self-Improve in Long-context Reasoning,
arXiv, 2411.08147
, arxiv, pdf, citation: -1 · Siheng Li, Cheng Yang, Zesen Cheng, ..., Yujiu Yang, Wai Lam
-
🌟 Combining Induction and Transduction for Abstract Reasoning,
arXiv, 2411.02272
, arxiv, pdf, citation: -1 · Wen-Ding Li, Keya Hu, Carter Larsen, ..., Yewen Pu, Kevin Ellis · (𝕏)
-
🌟 The Surprising Effectiveness of Test-Time Training for Abstract Reasoning
-
Can Language Models Learn to Skip Steps?,
arXiv, 2411.01855
, arxiv, pdf, citation: -1 · Tengxiao Liu, Qipeng Guo, Xiangkun Hu, ..., Xipeng Qiu, Zheng Zhang
-
SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization,
arXiv, 2410.21411
, arxiv, pdf, citation: -1 · Wanhua Li, Zibin Meng, Jiawei Zhou, ..., Chuang Gan, Hanspeter Pfister · (SocialGPT - Mengzibin)
-
A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents,
arXiv, 2410.22476
, arxiv, pdf, citation: -1 · Ankan Mullick, Sombit Bose, Abhilash Nandy, ..., Gajula Sai Chaitanya, Pawan Goyal
-
Improve Vision Language Model Chain-of-thought Reasoning,
arXiv, 2410.16198
, arxiv, pdf, citation: -1 · Ruohong Zhang, Bowen Zhang, Yanghao Li, ..., Ruoming Pang, Yiming Yang
· (LLaVA-Reasoner-DPO - RifleZhang)
-
MathArena: Evaluating LLMs on Uncontaminated Math Competitions
-
🌟 Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2,
arXiv, 2502.03544
, arxiv, pdf, citation: -1 · Yuri Chervonyi, Trieu H. Trinh, Miroslav Olšák, ..., Quoc V. Le, Thang Luong · (𝕏)
-
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback,
arXiv, 2501.10799
, arxiv, pdf, citation: -1 · Yen-Ting Lin, Di Jin, Tengyu Xu, ..., Hao Ma, Han Fang
-
🌟 The Lessons of Developing Process Reward Models in Mathematical Reasoning,
arXiv, 2501.07301
, arxiv, pdf, citation: -1 · Zhenru Zhang, Chujie Zheng, Yangzhen Wu, ..., Jingren Zhou, Junyang Lin
-
🌟 BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning,
arXiv, 2501.03226
, arxiv, pdf, citation: -1 · Beichen Zhang, Yuhong Liu, Xiaoyi Dong, ..., Dahua Lin, Jiaqi Wang · (BoostStep - beichenzbc)
-
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics,
arXiv, 2501.04686
, arxiv, pdf, citation: -1 · Ruilin Luo, Zhuofan Zheng, Yifan Wang, ..., Jin Zeng, Yujiu Yang · (ursa-math.github)
-
🌟 DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models,
arXiv, 2402.03300
, arxiv, pdf, citation: 155 · Zhihong Shao, Peiyi Wang, Qihao Zhu, ..., Y. Wu, Daya Guo · (𝕏)
-
Slow Perception: Let's Perceive Geometric Figures Step-by-step,
arXiv, 2412.20631
, arxiv, pdf, citation: -1 · Haoran Wei, Youyang Yin, Yumeng Li, ..., Zheng Ge, Xiangyu Zhang · (Slow-Perception - Ucas-HaoranWei)
-
HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving,
arXiv, 2412.20735
, arxiv, pdf, citation: -1 · Yang Li, Dong Du, Linfeng Song, ..., Tao Yang, Haitao Mi
-
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling,
arXiv, 2412.15084
, arxiv, pdf, citation: -1 · Zihan Liu, Yang Chen, Mohammad Shoeybi, ..., Bryan Catanzaro, Wei Ping · (research.nvidia)
-
Formal Mathematical Reasoning: A New Frontier in AI,
arXiv, 2412.16075
, arxiv, pdf, citation: -1 · Kaiyu Yang, Gabriel Poesia, Jingxuan He, ..., Swarat Chaudhuri, Dawn Song
· (𝕏)
-
U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs,
arXiv, 2412.03205
, arxiv, pdf, citation: -1 · Konstantin Chernyshev, Vitaliy Polshkov, Ekaterina Artemova, ..., Alexei Miasnikov, Sergei Tilga · (u-math - Toloka)
-
ProcessBench: Identifying Process Errors in Mathematical Reasoning,
arXiv, 2412.06559
, arxiv, pdf, citation: -1 · Chujie Zheng, Zhenru Zhang, Beichen Zhang, ..., Jingren Zhou, Junyang Lin
-
FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI,
arXiv, 2411.04872
, arxiv, pdf, citation: -1 · Elliot Glazer, Ege Erdil, Tamay Besiroglu, ..., Tetiana Grechuk, Shreepranav Varma Enugandla · (epochai) · (𝕏)
-
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning,
arXiv, 2410.22304
, arxiv, pdf, citation: -1 · Yihe Deng, Paul Mineiro
-
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics,
arXiv, 2410.21272
, arxiv, pdf, citation: -1 · Yaniv Nikankin, Anja Reusch, Aaron Mueller, ..., Yonatan Belinkov · (x)
-
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models,
arXiv, 2410.07985
, arxiv, pdf, citation: -1 · Bofei Gao, Feifan Song, Zhe Yang, ..., Tianyu Liu, Baobao Chang · (Omni-MATH - KbsdJames)
· (omni-math.github) · (huggingface) · (huggingface)
-
There May Not be Aha Moment in R1-Zero-like Training — A Pilot Study
-
DeepScaleR: Surpassing O1-Preview with a 1.5B Model by Scaling RL
· (𝕏) · (deepscaler - agentica-project)
-
🌟 s1: Simple test-time scaling,
arXiv, 2501.19393
, arxiv, pdf, citation: -1 · Niklas Muennighoff, Zitong Yang, Weijia Shi, ..., Emmanuel Candès, Tatsunori Hashimoto · (s1 - simplescaling)
-
a fine-tuned version of Qwen/Qwen2.5-32B-Instruct on the Bespoke-Stratos-17k dataset. 🤗
-
🌟 Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs,
arXiv, 2501.18585
, arxiv, pdf, citation: -1 · Yue Wang, Qiuzhi Liu, Jiahao Xu, ..., Haitao Mi, Dong Yu
-
🌟 simpleRL-reason - hkust-nlp
Emerging Reasoning with Reinforcement Learning is Both Effective and Efficient · (hkust-nlp.notion)
-
🌟 open-r1 - huggingface
-
TinyZero - Jiayi-Pan
-
With R1, a lot of people have been asking “how come we didn't discover this 2 years ago?” 𝕏
-
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning,
arXiv, 2501.12570
, arxiv, pdf, citation: -1 · Haotian Luo, Li Shen, Haiying He, ..., Xiaochun Cao, Dacheng Tao · (O1-Pruner - StarDewXXX)
-
DeepSeek R1's recipe to replicate o1 and the future of reasoning LMs
-
🌟 Meta Chain-of-Thought: Unlocking System 2 Reasoning in LLMs
· (𝕏)
-
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning,
arXiv, 2501.06458
, arxiv, pdf, citation: -1 · Zhongzhen Huang, Gui Geng, Shengyi Hua, ..., Pengfei Liu, Xiaofan Zhang
-
🌟 DeepSeek-R1 - deepseek-ai
-
Kimi-k1.5 - MoonshotAI
Scaling Reinforcement Learning with LLMs
-
🌟 Sky-T1: Train your own O1 preview model within $450
· (SkyThought - NovaSky-AI)
-
🌟 rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking,
arXiv, 2501.04519
, arxiv, pdf, citation: -1 · Xinyu Guan, Li Lyna Zhang, Yifei Liu, ..., Fan Yang, Mao Yang · (rStar - microsoft)
-
🌟 Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought,
arXiv, 2501.04682
, arxiv, pdf, citation: -1 · Violet Xiang, Charlie Snell, Kanishk Gandhi, ..., Nick Haber, Chelsea Finn
-
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models,
arXiv, 2501.03124
, arxiv, pdf, citation: -1 · Mingyang Song, Zhaochen Su, Xiaoye Qu, ..., Jiawei Zhou, Yu Cheng · (prmbench.github) · (PRMBench - ssmisya)
· (arxiv) · (huggingface)
-
Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback,
arXiv, 2501.03916
, arxiv, pdf, citation: -1 · Jiakang Yuan, Xiangchao Yan, Botian Shi, ..., Yu Qiao, Bowen Zhou
-
Search-o1: Agentic Search-Enhanced Large Reasoning Models,
arXiv, 2501.05366
, arxiv, pdf, citation: -1 · Xiaoxi Li, Guanting Dong, Jiajie Jin, ..., Peitian Zhang, Zhicheng Dou · (Search-o1 - sunnynexus)
-
Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models,
arXiv, 2412.15287
, arxiv, pdf, citation: 1 · Yinlam Chow, Guy Tennenholtz, Izzeddin Gur, ..., Aviral Kumar, Aleksandra Faust · (𝕏)
-
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs,
arXiv, 2412.21187
, arxiv, pdf, citation: -1 · Xingyu Chen, Jiahao Xu, Tian Liang, ..., Haitao Mi, Dong Yu
-
PRIME - PRIME-RL
· (curvy-check-498.notion) · (𝕏)
-
SmallThinker-3B-preview, a new model fine-tuned from the Qwen2.5-3b-Instruct model. 🤗
-
distill its thinking capacities into a smaller model, enhancing their reasoning performances 𝕏
· (t)
-
OpenAI o1 System Card,
arXiv, 2412.16720
, arxiv, pdf, citation: -1 · OpenAI: Aaron Jaech, ..., Zheng Shao, Zhuohan Li
-
Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems,
arXiv, 2412.09413
, arxiv, pdf, citation: -1 · Yingqian Min, Zhipeng Chen, Jinhao Jiang, ..., Zhongyuan Wang, Ji-Rong Wen · (Slow_Thinking_with_LLMs - RUCAIBox)
-
🌟 search-and-learn - huggingface
· (huggingface) · (𝕏)
-
Beyond Decoding: Meta-Generation Algorithms for Large Language Models
· (cmu-l3.github)
-
uncensored version of Qwen/QwQ-32B-Preview created with abliteration 🤗
· (remove-refusals-with-transformers - Sumandora)
-
Free Process Rewards without Process Labels,
arXiv, 2412.01981
, arxiv, pdf, citation: -1 · Lifan Yuan, Wendi Li, Huayu Chen, ..., Zhiyuan Liu, Hao Peng
-
Natural Language Reinforcement Learning,
arXiv, 2411.14251
, arxiv, pdf, citation: -1 · Xidong Feng, Ziyu Wan, Haotian Fu, ..., Ying Wen, Jun Wang · (arxiv) · (Natural-language-RL - waterhorse1)
· (mp.weixin.qq)
-
Can we make any smaller opensource LLM models smarter than human?
· (bilibili)
-
Inference Scaling fLaws: The Limits of LLM Resampling with Imperfect Verifiers,
arXiv, 2411.17501
, arxiv, pdf, citation: -1 · Benedikt Stroebl, Sayash Kapoor, Arvind Narayanan · (𝕏) · (𝕏)
-
Open-O1 - Open-Source-O1
A Model Matching Proprietary Power with Open-Source Innovation
-
Patience Is The Key to Large Language Model Reasoning,
arXiv, 2411.13082
, arxiv, pdf, citation: -1 · Yijiong Yu
· (huggingface)
-
🌟 O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?,
arXiv, 2411.16489
, arxiv, pdf, citation: -1 · Zhen Huang, Haoyang Zou, Xuefeng Li, ..., Weizhe Yuan, Pengfei Liu · (O1-Journey - GAIR-NLP)
-
Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision,
arXiv, 2411.16579
, arxiv, pdf, citation: -1 · Zhiheng Xi, Dingwen Yang, Jixuan Huang, ..., Xuanjing Huang, Yu-Gang Jiang · (mathcritique.github)
-
QwQ: Reflect Deeply on the Boundaries of the Unknown
· (huggingface)
-
🌟 O1-Journey - GAIR-NLP
-
Beyond Decoding: Meta-Generation Algorithms for Large Language Models
· (simons.berkeley)
-
🌟 From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models,
arXiv, 2406.16838
, arxiv, pdf, citation: -1 · Sean Welleck, Amanda Bertsch, Matthew Finlayson, ..., Ilia Kulikov, Zaid Harchaoui · (cmu-l3.github)
-
DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power! 𝕏
· (t)
-
🌟 Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions,
arXiv, 2411.14405
, arxiv, pdf, citation: -1 · Yu Zhao, Huifeng Yin, Bo Zeng, ..., Weihua Luo, Kaifu Zhang · (Marco-o1 - AIDC-AI)
-
🌟 entropix - xjdr-alt
-
Thinking-Claude - richards199999
-
LLaMA-O1 - SimpleBerry
Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace · (qbitai)
-
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model,
arXiv, 2410.13639
, arxiv, pdf, citation: -1 · Siwei Wu, Zhongyuan Peng, Xinrun Du, ..., Chenghua Lin, J. H. Liu
-
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces,
arXiv, 2410.09918
, arxiv, pdf, citation: -1 · DiJia Su, Sainbayar Sukhbaatar, Michael Rabbat, ..., Yuandong Tian, Qinqing Zheng
-
O1-Journey - GAIR-NLP
A Strategic Progress Report
-
Disentangling Memory and Reasoning Ability in Large Language Models,
arXiv, 2411.13504
, arxiv, pdf, citation: -1 · Mingyu Jin, Weidi Luo, Sitao Cheng, ..., William Yang Wang, Yongfeng Zhang · (Disentangling-Memory-and-Reasoning - MingyuJ666)
-
ProgCo: Program Helps Self-Correction of Large Language Models,
arXiv, 2501.01264
, arxiv, pdf, citation: -1 · Xiaoshuai Song, Yanan Wu, Weixun Wang, ..., Wenbo Su, Bo Zheng
-
🌟 Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization,
arXiv, 2412.18525
, arxiv, pdf, citation: -1 · Yang Shen, Xiu-Shen Wei, Yifan Sun, ..., Yazhou Yao, Errui Ding
-
The broader spectrum of in-context learning,
arXiv, 2412.03782
, arxiv, pdf, citation: -1 · Andrew Kyle Lampinen, Stephanie C. Y. Chan, Aaditya K. Singh, ..., Murray Shanahan · (𝕏)
-
🌟 Demystifying Long Chain-of-Thought Reasoning in LLMs,
arXiv, 2502.03373
, arxiv, pdf, citation: -1 · Edward Yeo, Yuxuan Tong, Morry Niu, ..., Graham Neubig, Xiang Yue · (𝕏) · (demystify-long-cot - eddycmu)
-
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step,
arXiv, 2501.13926
, arxiv, pdf, citation: -1 · Ziyu Guo, Renrui Zhang, Chengzhuo Tong, ..., Hongsheng Li, Pheng-Ann Heng · (Image-Generation-CoT - ZiyuGuo99)
-
Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model,
arXiv, 2501.07246
, arxiv, pdf, citation: -1 · Ziyang Ma, Zhuo Chen, Yuping Wang, ..., Eng Siong Chng, Xie Chen
-
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning,
arXiv, 2409.12183
, arxiv, pdf, citation: 24 · Zayne Sprague, Fangcong Yin, Juan Diego Rodriguez, ..., Kyle Mahowald, Greg Durrett · (To-CoT-or-not-to-CoT - Zayne-sprague)
· (𝕏)
-
Internalize_CoT_Step_by_Step - da03
· (huggingface)
-
LLMs Do Not Think Step-by-step In Implicit Reasoning,
arXiv, 2411.15862
, arxiv, pdf, citation: -1 · Yijiong Yu
· (𝕏)
-
A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration,
arXiv, 2410.16540
, arxiv, pdf, citation: -1 · Yingqian Cui, Pengfei He, Xianfeng Tang, ..., Jiliang Tang, Yue Xing · (𝕏)
-
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse,
arXiv, 2410.21333
, arxiv, pdf, citation: -1 · Ryan Liu, Jiayi Geng, Addison J. Wu, ..., Tania Lombrozo, Thomas L. Griffiths
-
Does Prompt Formatting Have Any Impact on LLM Performance?,
arXiv, 2411.10541
, arxiv, pdf, citation: -1 · Jia He, Mukund Rungta, David Koleczek, ..., Franklin X Wang, Sadid Hasan
-
ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs,
arXiv, 2410.12405
, arxiv, pdf, citation: -1 · Jingming Zhuo, Songyang Zhang, Xinyu Fang, ..., Dahua Lin, Kai Chen · (ProSA - open-compass)
-
Open-Reasoning-Tasks - NousResearch
-
prompt-poet - character-ai
-
V0-system-prompt - 2-fly-4-ai
· (reddit)
-
Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge,
arXiv, 2501.18099
, arxiv, pdf, citation: -1 · Swarnadeep Saha, Xian Li, Marjan Ghazvininejad, ..., Jason Weston, Tianlu Wang · (𝕏)
-
Dynamic Planning with a LLM,
arXiv, 2308.06391
, arxiv, pdf, citation: -1 · Gautier Dagan, Frank Keller, Alex Lascarides
-
On the Planning Abilities of Large Language Models (A Critical Investigation with a Proposed Benchmark),
arXiv, 2302.06706
, arxiv, pdf, citation: -1 · Karthik Valmeekam, Sarath Sreedharan, Matthew Marquez, ..., Alberto Olmo, Subbarao Kambhampati
-
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement,
arXiv, 2310.08559
, arxiv, pdf, citation: -1 · Linlu Qiu, Liwei Jiang, Ximing Lu, ..., Nouha Dziri, Xiang Ren
-
Revealing the Barriers of Language Agents in Planning,
arXiv, 2410.12409
, arxiv, pdf, citation: 1 · Jian Xie, Kexun Zhang, Jiangjie Chen, ..., Lei Li, Yanghua Xiao