Awesome papers in machine learning theory & deep learning theory.
-
Understanding Machine Learning: From Theory to Algorithms.
- Year 2014.
- Shai Shalev-Shwartz, Shai Ben-David.
- book
-
Foundations of Machine Learning.
- Year 2018.
- Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar.
- book
-
Learning Theory from First Principles.
- Year 2021.
- Francis Bach.
- book
-
Learnability and the Vapnik-Chervonenkis Dimension.
- Journal of the Association for Computing Machinery 1989.
- Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, Manfred M. Warmuth.
- paper
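- note: for quick reference, the standard characterization this line of work established (stated here in its modern textbook form, not quoted from the paper): a class $\mathcal{H} \subseteq \{0,1\}^{\mathcal{X}}$ is PAC learnable if and only if $\mathrm{VCdim}(\mathcal{H}) < \infty$, and in the agnostic setting the sample complexity is
  $$ m(\epsilon, \delta) = \Theta\!\left(\frac{\mathrm{VCdim}(\mathcal{H}) + \log(1/\delta)}{\epsilon^{2}}\right). $$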
-
Sample Compression, Learnability, and the Vapnik-Chervonenkis Dimension.
- Machine Learning 1995.
- Sally Floyd, Manfred Warmuth.
- paper
-
Characterizations of Learnability for Classes of $\{0, \dots, n\}$-valued Functions.
- Journal of Computer and System Sciences 1995.
- Shai Ben-David, Nicolo Cesa-Bianchi, David Haussler, Philip M. Long.
- paper
-
Scale-Sensitive Dimensions, Uniform Convergence, and Learnability.
- JACM 1997.
- Noga Alon, Shai Ben-David, Nicolo Cesa-Bianchi, David Haussler.
- paper
-
Regret Bounds for Prediction Problems.
- COLT 1999.
- Geoffrey J. Gordon.
- paper
-
A Study About Algorithmic Stability and Their Relation to Generalization Performances.
- Technical Report 2000.
- Andre Elisseeff.
- paper
-
Algorithmic Stability and Generalization Performance.
- NIPS 2001.
- Olivier Bousquet, Andre Elisseeff.
- paper
-
A Generalized Representer Theorem.
- COLT 2001.
- Bernhard Schölkopf, Ralf Herbrich, Alex J. Smola.
- paper
-
Concentration Inequalities and Empirical Processes Theory Applied to the Analysis of Learning Algorithms.
- PhD Thesis 2002.
- Olivier Bousquet.
- paper
-
Rademacher and Gaussian Complexities: Risk Bounds and Structural Results.
- JMLR 2002.
- Peter L. Bartlett, Shahar Mendelson.
- paper
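- note: for quick reference, the empirical Rademacher complexity of a class $\mathcal{F}$ on a sample $S = (x_1, \dots, x_m)$, with i.i.d. signs $\sigma_i \in \{\pm 1\}$, together with a bound of the type proved in the paper (stated here in a standard form for functions valued in $[0,1]$): with probability at least $1-\delta$, uniformly over $f \in \mathcal{F}$,
  $$ \hat{\mathfrak{R}}_S(\mathcal{F}) = \mathbb{E}_{\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{m} \sum_{i=1}^{m} \sigma_i f(x_i)\right], \qquad \mathbb{E}[f(x)] \le \frac{1}{m} \sum_{i=1}^{m} f(x_i) + 2\,\hat{\mathfrak{R}}_S(\mathcal{F}) + 3\sqrt{\frac{\log(2/\delta)}{2m}}. $$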
-
Stability and Generalization.
- JMLR 2002.
- Olivier Bousquet, Andre Elisseeff.
- paper
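- note: for quick reference, the paper's notion of uniform stability: an algorithm $A$ is $\beta$-uniformly stable if replacing any single training example changes the loss by at most $\beta$ at every test point, $\sup_{z} |\ell(A(S), z) - \ell(A(S^{i}), z)| \le \beta$, where $S^{i}$ is $S$ with the $i$-th example replaced. For a loss bounded by $M$, the paper's bound reads: with probability at least $1-\delta$,
  $$ R(A(S)) \le \hat{R}_S(A(S)) + 2\beta + (4m\beta + M)\sqrt{\frac{\log(1/\delta)}{2m}}. $$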
-
Almost-Everywhere Algorithmic Stability and Generalization Error.
- UAI 2002.
- Samuel Kutin, Partha Niyogi.
- paper
-
PAC-Bayes & Margins.
- NIPS 2003.
- John Langford, John Shawe-Taylor.
- paper
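- note: for quick reference, a common form of the PAC-Bayes bound underlying this line of work (a standard statement, not the paper's sharpest result): for any prior $P$ fixed before seeing the sample, with probability at least $1-\delta$, simultaneously for all posteriors $Q$,
  $$ \mathbb{E}_{h \sim Q}[R(h)] \le \mathbb{E}_{h \sim Q}[\hat{R}_S(h)] + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \log(m/\delta)}{2(m-1)}}. $$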
-
Statistical Behavior and Consistency of Classification Methods based on Convex Risk Minimization.
- Annals of Statistics 2004.
- Tong Zhang.
- paper
-
Theory of Classification: A Survey of Some Recent Advances.
- ESAIM: Probability and Statistics 2005.
- Stéphane Boucheron, Olivier Bousquet, Gábor Lugosi.
- paper
-
Learning Theory: Stability is Sufficient for Generalization and Necessary and Sufficient for Consistency of Empirical Risk Minimization.
- Advances in Computational Mathematics 2006.
- Sayan Mukherjee, Partha Niyogi, Tomaso Poggio, Ryan Rifkin.
- paper
-
Tutorial on Practical Prediction Theory for Classification.
- JMLR 2006.
- John Langford.
- paper
-
Rademacher Complexity Bounds for Non-I.I.D. Processes.
- NIPS 2008.
- Mehryar Mohri, Afshin Rostamizadeh.
- paper
-
On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization.
- NIPS 2008.
- Sham M. Kakade, Karthik Sridharan, Ambuj Tewari.
- paper
-
Agnostic Online Learning.
- COLT 2009.
- Shai Ben-David, Dávid Pál, Shai Shalev-Shwartz.
- paper
-
Learnability, Stability and Uniform Convergence.
- JMLR 2010.
- Shai Shalev-Shwartz, Ohad Shamir, Nathan Srebro, Karthik Sridharan.
- paper
-
Multiclass Learnability and the ERM Principle.
- COLT 2011.
- Amit Daniely, Sivan Sabato, Shai Ben-David, Shai Shalev-Shwartz.
- paper
-
Algorithmic Stability and Hypothesis Complexity.
- ICML 2017.
- Tongliang Liu, Gábor Lugosi, Gergely Neu, Dacheng Tao.
- paper
-
Stability and Generalization of Learning Algorithms that Converge to Global Optima.
- ICML 2018.
- Zachary Charles, Dimitris Papailiopoulos.
- paper
-
Generalization Bounds for Uniformly Stable Algorithms.
- NeurIPS 2018.
- Vitaly Feldman, Jan Vondrak.
- paper
-
Reconciling Modern Machine Learning Practice and the Bias-Variance Trade-Off.
- arXiv 2019.
- Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal.
- paper
-
Sharper Bounds for Uniformly Stable Algorithms.
- COLT 2020.
- Olivier Bousquet, Yegor Klochkov, Nikita Zhivotovskiy.
- paper
-
Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures.
- NeurIPS 2021.
- Yuan Cao, Quanquan Gu, Mikhail Belkin.
- paper
-
The Modern Mathematics of Deep Learning.
- arXiv 2021.
- Julius Berner, Philipp Grohs, Gitta Kutyniok, Philipp Petersen.
- book
-
Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective.
- arXiv 2019.
- Guan-Horng Liu, Evangelos A. Theodorou.
- paper
-
Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation.
- arXiv 2019.
- Greg Yang.
- paper
-
Regularization Algorithms for Learning that are Equivalent to Multilayer Networks.
- Science 1990.
- Tomaso Poggio, Federico Girosi.
- paper
-
Strong Universal Consistency of Neural Network Classifiers.
- IEEE Transactions on Information Theory 1993.
- András Faragó, Gábor Lugosi.
- paper
-
For Valid Generalization, the Size of the Weights is More Important Than the Size of the Network.
- NIPS 1996.
- Peter L. Bartlett.
- paper
-
Benefits of Depth in Neural Networks.
- COLT 2016.
- Matus Telgarsky.
- paper
-
Understanding Deep Learning Requires Rethinking Generalization.
- ICLR 2017.
- Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals.
- paper
-
Convergence Analysis of Two-layer Neural Networks with ReLU Activation.
- arXiv 2017.
- Yuanzhi Li, Yang Yuan.
- paper
-
Neural Tangent Kernel: Convergence and Generalization in Neural Networks.
- NeurIPS 2018.
- Arthur Jacot, Franck Gabriel, Clément Hongler.
- paper
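- note: below is a minimal sketch (an illustration, not code from the paper) of the empirical neural tangent kernel $K(x, x') = \langle \nabla_\theta f_\theta(x), \nabla_\theta f_\theta(x') \rangle$ for a small scalar-output MLP in JAX; the architecture and initialization are arbitrary choices, not the paper's infinite-width setup.

```python
import jax
import jax.numpy as jnp

def init_params(key, sizes=(2, 64, 1)):
    # One (weight, bias) pair per layer, with 1/sqrt(fan_in) scaling.
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def f(params, x):
    # Scalar-output MLP with ReLU hidden layers.
    for W, b in params[:-1]:
        x = jax.nn.relu(x @ W + b)
    W, b = params[-1]
    return (x @ W + b).squeeze()

def empirical_ntk(params, x1, x2):
    # Per-example gradients of the output w.r.t. all parameters,
    # flattened into rows; the NTK is their Gram matrix.
    grads = lambda xs: jax.vmap(lambda x: jax.grad(f)(params, x))(xs)
    flat = lambda g: jnp.concatenate(
        [leaf.reshape(leaf.shape[0], -1) for leaf in jax.tree_util.tree_leaves(g)],
        axis=1)
    return flat(grads(x1)) @ flat(grads(x2)).T  # shape (len(x1), len(x2))

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jax.random.normal(key, (8, 2))
K = empirical_ntk(params, x, x)  # 8x8 positive semi-definite kernel matrix
```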
-
PAC-Bayesian Margin Bounds for Convolutional Neural Networks.
- arXiv 2017.
- Konstantinos Pitas, Mike Davies, Pierre Vandergheynst.
- paper
-
To Understand Deep Learning We Need to Understand Kernel Learning.
- ICML 2018.
- Mikhail Belkin, Siyuan Ma, Soumik Mandal.
- paper
-
The Vapnik–Chervonenkis Dimension of Graph and Recursive Neural Networks.
- Neural Networks 2018.
- Franco Scarselli, Ah Chung Tsoi, Markus Hagenbuchner.
- paper
-
Explicitizing an Implicit Bias of the Frequency Principle in Two-layer Neural Networks.
- arXiv 2019.
- Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma.
- paper
-
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers.
- NeurIPS 2019.
- Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang.
- paper
-
Deep Learning Generalizes Because the Parameter-Function Map is Biased Towards Simple Functions.
- ICLR 2019.
- Guillermo Valle Pérez, Chico Q. Camargo, Ard A. Louis.
- paper
-
Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent.
- arXiv 2020.
- David Holzmüller, Ingo Steinwart.
- paper
-
On the Distance Between Two Neural Networks and the Stability of Learning.
- NeurIPS 2020.
- Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu.
- paper
-
Generalization Performance of Empirical Risk Minimization on Over-parameterized Deep ReLU Nets.
- arXiv 2021.
- Shao-Bo Lin, Yao Wang, Ding-Xuan Zhou.
- paper
-
Learnability of Convolutional Neural Networks for Infinite Dimensional Input via Mixed and Anisotropic Smoothness.
- ICLR 2022.
- Sho Okumoto, Taiji Suzuki.
- paper