Yuchen Yan1,2,*,
Yongliang Shen1,†,
Yang Liu2,
Jin Jiang2,3,
Mengdi Zhang2,
Jian Shao1,†,
Yueting Zhuang1
1Zhejiang University
2Meituan Group
3Peking University
Preprint. Under review.
*Contribution during internship at Meituan Group, †Corresponding Author
- 2025.07.12: We release our re-implemented dataset.
- 2025.05.24: We release our homepage and code examples.
- 2025.03.09: We release our paper.
In this paper, we propose a fundamentally different approach to long-context reasoning. Rather than viewing reasoning as a single extended process, we introduce InftyThink, a novel paradigm that divides complex reasoning into multiple interrelated short reasoning segments. Each segment remains within a computationally efficient context length while maintaining the coherent flow of thought across iterations. This approach draws inspiration from human cognitive processes, where complex problem-solving frequently involves breaking problems into manageable parts and summarizing intermediate progress.
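A minimal sketch of this loop is shown below. It is illustrative only: the `generate` callable stands in for any LLM call, and the prompt layout and the `<summary>` / `<conclusion>` markers are assumptions for exposition rather than the exact format used in our released data.

```python
# Illustrative sketch of InftyThink-style iterative reasoning (not the released implementation).
# `generate` is any text-in/text-out LLM call; the markers below are assumed for exposition.

def infty_think(question: str, generate, max_iterations: int = 8) -> str:
    summary = ""  # condensed record of reasoning so far, carried across iterations
    for _ in range(max_iterations):
        prompt = question if not summary else f"{question}\n\nProgress so far:\n{summary}"
        segment = generate(prompt)  # one short reasoning segment within a bounded context
        if "<conclusion>" in segment:
            # the model signals that it has reached a final answer
            return segment.split("<conclusion>", 1)[1].strip()
        # otherwise keep only the updated summary and discard the detailed reasoning
        summary = segment.split("<summary>", 1)[-1].strip()
    return summary  # fall back to the latest summary if no conclusion was produced
```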
Our contributions can be summarized as follows:
- We introduce InftyThink, which transforms monolithic long-form reasoning into iterative reasoning with summarization, mimicking human working memory patterns and reducing the quadratic computational complexity of transformer-based models to a more manageable form (see the back-of-the-envelope comparison after this list).
- We develop a technique to reconstruct existing long-context reasoning datasets (demonstrated on OpenR1-Math) into our iterative format, preserving reasoning quality while enabling more efficient computation without specialized architectures.
- Across multiple model architectures, our approach achieves significant improvements while substantially reducing computational costs, challenging the assumed trade-off between reasoning depth and efficiency.
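To make the efficiency claim in the first contribution concrete, here is a back-of-the-envelope comparison (an illustration, not a measured result from the paper): a monolithic trace of L tokens incurs roughly quadratic attention cost, while n ≈ L/η segments of at most η tokens each cost on the order of L·η, ignoring the carried-over summary tokens.

```latex
% Illustrative attention-cost comparison (not a measured result).
% L: total reasoning tokens, \eta: per-segment token budget, n \approx L/\eta segments.
\underbrace{O(L^{2})}_{\text{monolithic reasoning}}
\quad\text{vs.}\quad
\underbrace{\sum_{i=1}^{n} O(\eta^{2}) \;=\; O\!\left(\frac{L}{\eta}\,\eta^{2}\right) \;=\; O(L\,\eta)}_{\text{InftyThink-style iterations}}
```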
```
InftyThink
├── data_preprocess   # Generate InftyThink-style data
├── inference         # An example of using InftyThink-style models
├── docs
└── readme.md
```
```bash
cd data_preprocess
# Step 1: segment long reasoning traces, keeping each segment within the --eta token budget
python3 segmentation.py --dataset_name open-r1/OpenR1-Math-220k \
    --tokenizer Qwen/Qwen2.5-Math-7B \
    --eta 4096
```

```bash
cd data_preprocess
# Step 2: generate the InftyThink-style training data with an instruction-tuned model
python3 generate_data.py --model meta-llama/Llama-3.3-70B-Instruct
```

After the scripts finish, the InftyThink-style data is available.
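For intuition, the segmentation step can be pictured as cutting the token stream of a long reasoning trace into chunks of at most eta tokens. The sketch below is a simplification for illustration only; the released `segmentation.py` may choose segment boundaries differently, and the second script then produces the summaries that connect segments.

```python
# Illustrative only: a simplified view of eta-bounded segmentation of a reasoning trace.
# The released segmentation.py may use different boundary rules.
from transformers import AutoTokenizer

def split_reasoning(reasoning: str,
                    tokenizer_name: str = "Qwen/Qwen2.5-Math-7B",
                    eta: int = 4096) -> list[str]:
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
    token_ids = tokenizer.encode(reasoning, add_special_tokens=False)
    # Cut the token stream into consecutive chunks of at most eta tokens each.
    chunks = [token_ids[i:i + eta] for i in range(0, len(token_ids), eta)]
    return [tokenizer.decode(chunk) for chunk in chunks]
```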
We provide an example of InftyThink-style reasoning. After SFT on the InftyThink-style data, feel free to try it!
```bash
cd inference
python3 infer_single.py
```

If you find our work helpful, feel free to cite us:
```bibtex
@misc{yan2025inftythink,
      title={InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models},
      author={Yuchen Yan and Yongliang Shen and Yang Liu and Jin Jiang and Mengdi Zhang and Jian Shao and Yueting Zhuang},
      year={2025},
      eprint={2503.06692},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.06692},
}
```
If you have any questions, please contact us by email: yanyuchen@zju.edu.cn


