All the papers listed in this project come from my regular reading. If you come across new and interesting papers, I would appreciate it if you let me know!
- Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision: https://arxiv.org/abs/2312.09390
- Weak-to-Strong Reasoning: https://arxiv.org/abs/2407.13647
- Debating with More Persuasive LLMs Leads to More Truthful Answers: https://arxiv.org/abs/2402.06782
- CriticGPT: https://openai.com/index/finding-gpt4s-mistakes-with-gpt-4/
- Aligner: Efficient Alignment by Learning to Correct: https://arxiv.org/abs/2402.02416
- The Unreasonable Effectiveness of Easy Training Data for Hard Tasks: https://arxiv.org/abs/2401.06751
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision: https://arxiv.org/abs/2403.09472
- Self-playing Adversarial Language Game Enhances LLM Reasoning: https://arxiv.org/abs/2404.10642
- Theoretical Analysis of Weak-to-Strong Generalization: https://arxiv.org/abs/2405.16043
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models: https://arxiv.org/abs/2402.03749
- Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts: https://arxiv.org/abs/2402.15505
- Quantifying the Gain in Weak-to-Strong Generalization: https://arxiv.org/abs/2405.15116
- Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge: https://arxiv.org/abs/2407.19594
- Optimizing Language Model's Reasoning Abilities with Weak Supervision: https://arxiv.org/abs/2405.04086
- Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment: https://arxiv.org/abs/2405.17888
- Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization: https://arxiv.org/abs/2406.11431
- LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement: https://arxiv.org/abs/2407.00497
- Bayesian WeakS-to-Strong from Text Classification to Generation: https://arxiv.org/abs/2406.03199
- Transcendence: Generative Models Can Outperform The Experts That Train Them: https://arxiv.org/abs/2406.11741
- Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models: https://arxiv.org/abs/2405.19262
- On Scalable Oversight with Weak LLMs Judging Strong LLMs: https://arxiv.org/abs/2407.04622