Advanced Deep Learning @ KAIST

Course Information

Instructor: Sung Ju Hwang (sjhwang82@kaist.ac.kr)
TAs: Seul Lee (animecult@kaist.ac.kr) and Jaehyeong Jo (harryjo97@kaist.ac.kr)

Office: This is an on/offline hybrid course. Building Number 9, Room 9201 (Instructor), 2nd floor (TAs)
Office hours: By appointment only.

Grading Policy

Absolute Grading
Paper Presentation: 20%
Attendance and Participation: 20%
Project: 60%

Tentative Schedule

Dates	Topic
8/29	Course Introduction
9/1	Review of Deep Learning Basics (Video Lecture)
9/6	Vision Transformers (Lecture)
9/8	Vision Transformers / Self-Supervised Learning (Lecture)
9/13	Self-Supervised Learning (Lecture)
9/15	Self-Supervised Learning (Presentation)
9/20	Bayesian Deep Learning - Bayesian ML Basics, Bayesian Neural Networks (Lecture)
9/22	Bayesian Deep Learning - Bayesian Approximations, Uncertainties in Prediction (Lecture)
9/27	Bayesian Deep Learning - MCMC Sampling for Bayesian Inference, Neural Processes (Lecture)
9/29	Bayesian Deep Learning (Presentation)
10/4	Deep Generative Models - Advanced GANs (Lecture)
10/6	Deep Generative Models - Advanced GANs (Presentation); Initial Proposal Due
10/11	Deep Generative Models - Diffusion Models (Lecture)
10/13	Deep Generative Models - Diffusion Models (Lecture)
10/18	Deep Generative Models - Diffusion Models (Presentation)
10/20	Mid-term Presentation
10/25	Large Language Models (Lecture)
10/27	Multimodal Generative Models (Lecture)
11/1	Large Language Models and Multimodal Generative Models (Presentation)
11/3	Deep Reinforcement Learning - Deep RL Basics (Lecture)
11/8	Deep Reinforcement Learning - Policy-based RL, Model-based RL (Lecture)
11/10	Deep Reinforcement Learning - Offline RL, Exploration, RL via Sequence Modeling (Lecture)
11/15	Deep Reinforcement Learning (Presentation)
11/17	Meta Learning (Lecture)
11/22	Meta Learning (Presentation)
11/24	Continual Learning (Lecture)
11/29	Continual Learning (Presentation)
12/1	Robust Deep Learning (Lecture)
12/6	Robust Deep Learning (Presentation)
12/8	Deep Graph Learning (Lecture)
12/13	Deep Graph Learning (Presentation)
12/15	Final Presentation

Reading List

Vision Transformers

[Dosovitskiy et al. 21] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, ICLR 2021.
[Touvron et al. 21] Training Data-efficient Image transformers & Distillation through Attention, ICML 2021.
[Liu et al. 21] Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, ICCV 2021.
[Wu et al. 21] CvT: Introducing Convolutions to Vision Transformers, ICCV 2021.
[Dai et al. 21] CoAtNet: Marrying Convolution and Attention for All Data Sizes, NeurIPS 2021.
[Yang et al. 21] Focal Attention for Long-Range Interactions in Vision Transformers, NeurIPS 2021.
[El-Nouby et al. 21] XCiT: Cross-Covariance Image Transformers, NeurIPS 2021.
[Li et al. 22] MViTv2: Improved Multiscale Vision Transformers for Classification and Detection, CVPR 2022.
[Lee et al. 22] MPViT: Multi-Path Vision Transformer for Dense Prediction, CVPR 2022.
[Liu et al. 22] A ConvNet for the 2020s, CVPR 2022.

Self-Supervised Learning

[Dosovitskiy et al. 14] Discriminative Unsupervised Feature Learning with Convolutional Neural Networks, NIPS 2014.
[Pathak et al. 16] Context Encoders: Feature Learning by Inpainting, CVPR 2016.
[Noroozi and Favaro 16] Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles, ECCV 2016.
[Gidaris et al. 18] Unsupervised Representation Learning by Predicting Image Rotations, ICLR 2018.
[He et al. 20] Momentum Contrast for Unsupervised Visual Representation Learning, CVPR 2020.
[Chen et al. 20] A Simple Framework for Contrastive Learning of Visual Representations, ICML 2020.
[Mikolov et al. 13] Efficient Estimation of Word Representations in Vector Space, ICLR 2013.
[Devlin et al. 19] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, NAACL 2019.
[Clark et al. 20] ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, ICLR 2020.
[Hu et al. 20] Strategies for Pre-training Graph Neural Networks, ICLR 2020.
[Chen et al. 20] Generative Pretraining from Pixels, ICML 2020.
[Laskin et al. 20] CURL: Contrastive Unsupervised Representations for Reinforcement Learning, ICML 2020.
[Grill et al. 20] Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, NeurIPS 2020.
[Chen et al. 20] Big Self-Supervised Models are Strong Semi-Supervised Learners, NeurIPS 2020.
[Chen and He 21] Exploring Simple Siamese Representation Learning, CVPR 2021.
[Tian et al. 21] Understanding Self-Supervised Learning Dynamics without Contrastive Pairs, ICML 2021.
[Caron et al. 21] Emerging Properties in Self-Supervised Vision Transformers, ICCV 2021.

[Liu et al. 22] Self-supervised Learning is More Robust to Dataset Imbalance, ICLR 2022.
[Bao et al. 22] BEiT: BERT Pre-Training of Image Transformers, ICLR 2022.
[He et al. 22] Masked Autoencoders are Scalable Vision Learners, CVPR 2022.
[Liu et al. 22] Improving Contrastive Learning with Model Augmentation, arXiv preprint, 2022.
[Touvron et al. 22] DeiT III: Revenge of the ViT, arXiv preprint, 2022.

Bayesian Deep Learning

[Kingma and Welling 14] Auto-Encoding Variational Bayes, ICLR 2014.
[Kingma et al. 15] Variational Dropout and the Local Reparameterization Trick, NIPS 2015.
[Blundell et al. 15] Weight Uncertainty in Neural Networks, ICML 2015.
[Gal and Ghahramani 16] Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016.
[Liu et al. 16] Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm, NIPS 2016.
[Mandt et al. 17] Stochastic Gradient Descent as Approximate Bayesian Inference, JMLR 2017.
[Kendall and Gal 17] What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?, NIPS 2017.
[Gal et al. 17] Concrete Dropout, NIPS 2017.
[Gal et al. 17] Deep Bayesian Active Learning with Image Data, ICML 2017.
[Teye et al. 18] Bayesian Uncertainty Estimation for Batch Normalized Deep Networks, ICML 2018.
[Garnelo et al. 18] Conditional Neural Processes, ICML 2018.
[Kim et al. 19] Attentive Neural Processes, ICLR 2019.
[Sun et al. 19] Functional Variational Bayesian Neural Networks, ICLR 2019.
[Louizos et al. 19] The Functional Neural Process, NeurIPS 2019.
[Zhang et al. 20] Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning, ICLR 2020.
[Amersfoort et al. 20] Uncertainty Estimation Using a Single Deep Deterministic Neural Network, ICML 2020.
[Dusenberry et al. 20] Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors, ICML 2020.
[Wenzel et al. 20] How Good is the Bayes Posterior in Deep Neural Networks Really?, ICML 2020.
[Lee et al. 20] Bootstrapping Neural Processes, NeurIPS 2020.
[Wilson et al. 20] Bayesian Deep Learning and a Probabilistic Perspective of Generalization, NeurIPS 2020.
[Izmailov et al. 21] What Are Bayesian Neural Network Posteriors Really Like?, ICML 2021.
[Daxberger et al. 21] Bayesian Deep Learning via Subnetwork Inference, ICML 2021.

[Fortuin et al. 22] Bayesian Neural Network Priors Revisited, ICLR 2022.
[Muller et al. 22] Transformers Can Do Bayesian Inference, ICLR 2022.
[Nguyen and Grover 22] Transformer Neural Processes, ICML 2022.
[Nazaret and Blei 22] Variational Inference for Infinitely Deep Neural Networks, ICML 2022.
[Lotfi et al. 22] Bayesian Model Selection, the Marginal Likelihood, and Generalization, ICML 2022.
[Alexos et al. 22] Structured Stochastic Gradient MCMC, ICML 2022.

Deep Generative Models

VAEs, Autoregressive and Flow-Based Generative Models

[Rezende and Mohamed 15] Variational Inference with Normalizing Flows, ICML 2015.
[Germain et al. 15] MADE: Masked Autoencoder for Distribution Estimation, ICML 2015.
[Kingma et al. 16] Improved Variational Inference with Inverse Autoregressive Flow, NIPS 2016.
[Oord et al. 16] Pixel Recurrent Neural Networks, ICML 2016.
[Dinh et al. 17] Density Estimation Using Real NVP, ICLR 2017.
[Papamakarios et al. 17] Masked Autoregressive Flow for Density Estimation, NIPS 2017.
[Huang et al. 18] Neural Autoregressive Flows, ICML 2018.
[Kingma and Dhariwal 18] Glow: Generative Flow with Invertible 1x1 Convolutions, NeurIPS 2018.
[Ho et al. 19] Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design, ICML 2019.
[Chen et al. 19] Residual Flows for Invertible Generative Modeling, NeurIPS 2019.
[Tran et al. 19] Discrete Flows: Invertible Generative Models of Discrete Data, NeurIPS 2019.
[Ping et al. 20] WaveFlow: A Compact Flow-based Model for Raw Audio, ICML 2020.
[Vahdat and Kautz 20] NVAE: A Deep Hierarchical Variational Autoencoder, NeurIPS 2020.
[Ho et al. 20] Denoising Diffusion Probabilistic Models, NeurIPS 2020.
[Song et al. 21] Score-Based Generative Modeling through Stochastic Differential Equations, ICLR 2021.
[Kosiorek et al. 21] NeRF-VAE: A Geometry Aware 3D Scene Generative Model, ICML 2021.

Generative Adversarial Networks

[Goodfellow et al. 14] Generative Adversarial Nets, NIPS 2014.
[Radford et al. 16] Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR 2016.
[Chen et al. 16] InfoGAN: Interpreting Representation Learning by Information Maximizing Generative Adversarial Nets, NIPS 2016.
[Arjovsky et al. 17] Wasserstein Generative Adversarial Networks, ICML 2017.
[Zhu et al. 17] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV 2017.
[Zhang et al. 17] Adversarial Feature Matching for Text Generation, ICML 2017.
[Karras et al. 18] Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR 2018.
[Choi et al. 18] StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation, CVPR 2018.
[Brock et al. 19] Large Scale GAN Training for High-Fidelity Natural Image Synthesis, ICLR 2019.
[Karras et al. 19] A Style-Based Generator Architecture for Generative Adversarial Networks, CVPR 2019.
[Karras et al. 20] Analyzing and Improving the Image Quality of StyleGAN, CVPR 2020.
[Sinha et al. 20] Small-GAN: Speeding up GAN Training using Core-Sets, ICML 2020.
[Karras et al. 20] Training Generative Adversarial Networks with Limited Data, NeurIPS 2020.
[Liu et al. 21] Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis, ICLR 2021.
[Esser et al. 21] Taming Transformers for High-Resolution Image Synthesis, CVPR 2021.
[Hudson and Zitnick 21] Generative Adversarial Transformers, ICML 2021.
[Karras et al. 21] Alias-Free Generative Adversarial Networks, NeurIPS 2021.

[Skorokhodov et al. 22] StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2, CVPR 2022.
[Lin et al. 22] InfinityGAN: Towards Infinite-Pixel Image Synthesis, ICLR 2022.
[Lee et al. 22] ViTGAN: Training GANs with Vision Transformers, ICLR 2022.
[Yu et al. 22] Vector-Quantized Image Modeling with Improved VQGAN, ICLR 2022.
[Franceschi et al. 22] A Neural Tangent Kernel Perspective of GANs, ICML 2022.

Diffusion Models

[Song and Ermon 19] Generative Modeling by Estimating Gradients of the Data Distribution, NeurIPS 2019.
[Song and Ermon 20] Improved Techniques for Training Score-Based Generative Models, NeurIPS 2020.
[Ho et al. 20] Denoising Diffusion Probabilistic Models, NeurIPS 2020.
[Song et al. 21] Score-Based Generative Modeling through Stochastic Differential Equations, ICLR 2021.
[Nichol and Dhariwal 21] Improved Denoising Diffusion Probabilistic Models, ICML 2021.
[Vahdat et al. 21] Score-based Generative Modeling in Latent Space, NeurIPS 2021.
[Dhariwal and Nichol 21] Diffusion Models Beat GANs on Image Synthesis, NeurIPS 2021.
[De Bortoli et al. 21] Diffusion Schrodinger Bridge with Applications to Score-Based Generative Modeling, NeurIPS 2021.
[Ho and Salimans 22] Classifier-Free Diffusion Guidance, arXiv preprint, 2022.

[Dockhorn et al. 22] Score-Based Generative Modeling with Critically-Damped Langevin Diffusion, ICLR 2022.
[Salimans and Ho 22] Progressive Distillation for Fast Sampling of Diffusion Models, ICLR 2022.
[Chen et al. 22] Likelihood Training of Schrodinger Bridge using Forward-Backward SDEs Theory, ICLR 2022.

Deep Reinforcement Learning

[Mnih et al. 13] Playing Atari with Deep Reinforcement Learning, NIPS Deep Learning Workshop 2013.
[Silver et al. 14] Deterministic Policy Gradient Algorithms, ICML 2014.
[Schulman et al. 15] Trust Region Policy Optimization, ICML 2015.
[Lillicrap et al. 16] Continuous Control with Deep Reinforcement Learning, ICLR 2016.
[Schaul et al. 16] Prioritized Experience Replay, ICLR 2016.
[Wang et al. 16] Dueling Network Architectures for Deep Reinforcement Learning, ICML 2016.
[Mnih et al. 16] Asynchronous Methods for Deep Reinforcement Learning, ICML 2016.
[Schulman et al. 17] Proximal Policy Optimization Algorithms, arXiv preprint, 2017.
[Nachum et al. 18] Data-Efficient Hierarchical Reinforcement Learning, NeurIPS 2018.
[Ha et al. 18] Recurrent World Models Facilitate Policy Evolution, NeurIPS 2018.
[Burda et al. 19] Large-Scale Study of Curiosity-Driven Learning, ICLR 2019.
[Vinyals et al. 19] Grandmaster level in StarCraft II using multi-agent reinforcement learning, Nature, 2019.
[Bellemare et al. 19] A Geometric Perspective on Optimal Representations for Reinforcement Learning, NeurIPS 2019.
[Janner et al. 19] When to Trust Your Model: Model-Based Policy Optimization, NeurIPS 2019.
[Fellows et al. 19] VIREL: A Variational Inference Framework for Reinforcement Learning, NeurIPS 2019.
[Kumar et al. 19] Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, NeurIPS 2019.
[Kaiser et al. 20] Model Based Reinforcement Learning for Atari, ICLR 2020.
[Agarwal et al. 20] An Optimistic Perspective on Offline Reinforcement Learning, ICML 2020.
[Lee et al. 20] Batch Reinforcement Learning with Hyperparameter Gradients, ICML 2020.
[Kumar et al. 20] Conservative Q-Learning for Offline Reinforcement Learning, ICML 2020.
[Yarats et al. 21] Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels, ICLR 2021.
[Chen et al. 21] Decision Transformer: Reinforcement Learning via Sequence Modeling, NeurIPS 2021.

[Mai et al. 22] Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation, ICLR 2022.
[Furuta et al. 22] Generalized Decision Transformer for Offline Hindsight Information Matching, ICLR 2022.
[Oh et al. 22] Model-augmented Prioritized Experience Replay, ICLR 2022.
[Rengarajan et al. 22] Reinforcement Learning with Sparse Rewards Using Guidance from Offline Demonstration, ICLR 2022.
[Patil et al. 22] Align-RUDDER: Learning from Few Demonstrations by Reward Redistribution, ICML 2022.
[Goyal et al. 22] Retrieval Augmented Reinforcement Learning, ICML 2022.
[Reed et al. 22] A Generalist Agent, arXiv preprint, 2022.

Memory and Computation-Efficient Deep Learning

[Han et al. 15] Learning both Weights and Connections for Efficient Neural Networks, NIPS 2015.
[Wen et al. 16] Learning Structured Sparsity in Deep Neural Networks, NIPS 2016.
[Han et al. 16] Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, ICLR 2016.
[Molchanov et al. 17] Variational Dropout Sparsifies Deep Neural Networks, ICML 2017.
[Louizos et al. 17] Bayesian Compression for Deep Learning, NIPS 2017.
[Louizos et al. 18] Learning Sparse Neural Networks Through L0 Regularization, ICLR 2018.
[Howard et al. 18] MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, CVPR 2018.
[Frankle and Carbin 19] The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, ICLR 2019.
[Lee et al. 19] SNIP: Single-Shot Network Pruning Based On Connection Sensitivity, ICLR 2019.
[Liu et al. 19] Rethinking the Value of Network Pruning, ICLR 2019.
[Jung et al. 19] Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss, CVPR 2019.
[Morcos et al. 19] One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers, NeurIPS 2019.
[Renda et al. 20] Comparing Rewinding and Fine-tuning in Neural Network Pruning, ICLR 2020.
[Frankle et al. 20] Linear Mode Connectivity and the Lottery Ticket Hypothesis, ICML 2020.
[Tanaka et al. 20] Pruning Neural Networks without Any Data by Iteratively Conserving Synaptic Flow, NeurIPS 2020.
[van Baalen et al. 20] Bayesian Bits: Unifying Quantization and Pruning, NeurIPS 2020.
[de Jorge et al. 21] Progressive Skeletonization: Trimming more fat from a network at initialization, ICLR 2021.
[Stock et al. 21] Training with Quantization Noise for Extreme Model Compression, ICLR 2021.
[Lee et al. 21] Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bit-wise Regularization, ICCV 2021.

Meta Learning

[Santoro et al. 16] Meta-Learning with Memory-Augmented Neural Networks, ICML 2016.
[Vinyals et al. 16] Matching Networks for One Shot Learning, NIPS 2016.
[Edwards and Storkey 17] Towards a Neural Statistician, ICLR 2017.
[Finn et al. 17] Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML 2017.
[Snell et al. 17] Prototypical Networks for Few-shot Learning, NIPS 2017.
[Nichol et al. 18] On First-Order Meta-learning Algorithms, arXiv preprint, 2018.
[Lee and Choi 18] Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace, ICML 2018.
[Liu et al. 19] Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning, ICLR 2019.
[Gordon et al. 19] Meta-Learning Probabilistic Inference for Prediction, ICLR 2019.
[Ravi and Beatson 19] Amortized Bayesian Meta-Learning, ICLR 2019.
[Rakelly et al. 19] Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, ICML 2019.
[Shu et al. 19] Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting, NeurIPS 2019.
[Finn et al. 19] Online Meta-Learning, ICML 2019.
[Lee et al. 20] Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks, ICLR 2020.
[Yin et al. 20] Meta-Learning without Memorization, ICLR 2020.
[Raghu et al. 20] Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML, ICLR 2020.
[Iakovleva et al. 20] Meta-Learning with Shared Amortized Variational Inference, ICML 2020.
[Bronskill et al. 20] TaskNorm: Rethinking Batch Normalization for Meta-Learning, ICML 2020.
[Rajendran et al. 20] Meta-Learning Requires Meta-Augmentation, NeurIPS 2020.
[Lee et al. 21] Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning, ICLR 2021.
[Shin et al. 21] Large-Scale Meta-Learning with Continual Trajectory Shifting, ICML 2021.
[Acar et al. 21] Memory Efficient Online Meta Learning, ICML 2021.

[Lee et al. 22] Online Hyperparameter Meta-Learning with Hypergradient Distillation, ICLR 2022.
[Flennerhag et al. 22] Bootstrapped Meta-Learning, ICLR 2022.
[Yao et al. 22] Meta-Learning with Fewer Tasks through Task Interpolation, ICLR 2022.
[Guan and Lu 22] Task Relatedness-Based Generalization Bounds for Meta Learning, ICLR 2022.

Continual Learning

[Rusu et al. 16] Progressive Neural Networks, arXiv preprint, 2016.
[Kirkpatrick et al. 17] Overcoming catastrophic forgetting in neural networks, PNAS 2017.
[Lee et al. 17] Overcoming Catastrophic Forgetting by Incremental Moment Matching, NIPS 2017.
[Shin et al. 17] Continual Learning with Deep Generative Replay, NIPS 2017.
[Lopez-Paz and Ranzato 17] Gradient Episodic Memory for Continual Learning, NIPS 2017.
[Yoon et al. 18] Lifelong Learning with Dynamically Expandable Networks, ICLR 2018.
[Nguyen et al. 18] Variational Continual Learning, ICLR 2018.
[Schwarz et al. 18] Progress & Compress: A Scalable Framework for Continual Learning, ICML 2018.
[Chaudhry et al. 19] Efficient Lifelong Learning with A-GEM, ICLR 2019.
[Rao et al. 19] Continual Unsupervised Representation Learning, NeurIPS 2019.
[Rolnick et al. 19] Experience Replay for Continual Learning, NeurIPS 2019.
[Jerfel et al. 19] Reconciling Meta-Learning and Continual Learning with Online Mixtures of Tasks, NeurIPS 2019.
[Yoon et al. 20] Scalable and Order-robust Continual Learning with Additive Parameter Decomposition, ICLR 2020.
[Ramasesh et al. 20] Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics, Continual Learning Workshop, ICML 2020.
[Borsos et al. 20] Coresets via Bilevel Optimization for Continual Learning and Streaming, NeurIPS 2020.
[Mirzadeh et al. 20] Understanding the Role of Training Regimes in Continual Learning, NeurIPS 2020.
[Saha et al. 21] Gradient Projection Memory for Continual Learning, ICLR 2021.
[Veniat et al. 21] Efficient Continual Learning with Modular Networks and Task-Driven Priors, ICLR 2021.

[Madaan et al. 22] Representational Continuity for Unsupervised Continual Learning, ICLR 2022.
[Yoon et al. 22] Online Coreset Selection for Rehearsal-based Continual Learning, ICLR 2022.
[Lin et al. 22] TRGP: Trust Region Gradient Projection for Continual Learning, ICLR 2022.
[Wang et al. 22] Improving Task-free Continual Learning by Distributionally Robust Memory Evolution, ICML 2022.
[Kang et al. 22] Forget-free Continual Learning with Winning Subnetworks, ICML 2022.

Interpretable Deep Learning

[Ribeiro et al. 16] "Why Should I Trust You?" Explaining the Predictions of Any Classifier, KDD 2016.
[Kim et al. 16] Examples are not Enough, Learn to Criticize! Criticism for Interpretability, NIPS 2016.
[Choi et al. 16] RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism, NIPS 2016.
[Koh et al. 17] Understanding Black-box Predictions via Influence Functions, ICML 2017.
[Bau et al. 17] Network Dissection: Quantifying Interpretability of Deep Visual Representations, CVPR 2017.
[Selvaraju et al. 17] Grad-CAM: Visual Explanation from Deep Networks via Gradient-based Localization, ICCV 2017.
[Kim et al. 18] Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV), ICML 2018.
[Heo et al. 18] Uncertainty-Aware Attention for Reliable Interpretation and Prediction, NeurIPS 2018.
[Bau et al. 19] GAN Dissection: Visualizing and Understanding Generative Adversarial Networks, ICLR 2019.
[Ghorbani et al. 19] Towards Automatic Concept-based Explanations, NeurIPS 2019.
[Coenen et al. 19] Visualizing and Measuring the Geometry of BERT, NeurIPS 2019.
[Heo et al. 20] Cost-Effective Interactive Attention Learning with Neural Attention Processes, ICML 2020.
[Agarwal et al. 20] Neural Additive Models: Interpretable Machine Learning with Neural Nets, arXiv preprint, 2020.

Reliable Deep Learning

[Guo et al. 17] On Calibration of Modern Neural Networks, ICML 2017.
[Lakshminarayanan et al. 17] Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017.
[Liang et al. 18] Enhancing the Reliability of Out-of-distribution Image Detection in Neural Networks, ICLR 2018.
[Lee et al. 18] Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples, ICLR 2018.
[Kuleshov et al. 18] Accurate Uncertainties for Deep Learning Using Calibrated Regression, ICML 2018.
[Jiang et al. 18] To Trust Or Not To Trust A Classifier, NeurIPS 2018.
[Madras et al. 18] Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer, NeurIPS 2018.
[Maddox et al. 19] A Simple Baseline for Bayesian Uncertainty in Deep Learning, NeurIPS 2019.
[Kull et al. 19] Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration, NeurIPS 2019.
[Thulasidasan et al. 19] On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks, NeurIPS 2019.
[Ovadia et al. 19] Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift, NeurIPS 2019.
[Hendrycks et al. 20] AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty, ICLR 2020.
[Filos et al. 20] Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?, ICML 2020.

Robust Deep Learning

[Szegedy et al. 14] Intriguing Properties of Neural Networks, ICLR 2014.
[Goodfellow et al. 15] Explaining and Harnessing Adversarial Examples, ICLR 2015.
[Kurakin et al. 17] Adversarial Machine Learning at Scale, ICLR 2017.
[Madry et al. 18] Towards Deep Learning Models Resistant to Adversarial Attacks, ICLR 2018.
[Eykholt et al. 18] Robust Physical-World Attacks on Deep Learning Visual Classification, CVPR 2018.
[Athalye et al. 18] Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples, ICML 2018.
[Zhang et al. 19] Theoretically Principled Trade-off between Robustness and Accuracy, ICML 2019.
[Carmon et al. 19] Unlabeled Data Improves Adversarial Robustness, NeurIPS 2019.
[Ilyas et al. 19] Adversarial Examples are not Bugs, They Are Features, NeurIPS 2019.
[Li et al. 19] Certified Adversarial Robustness with Additive Noise, NeurIPS 2019.
[Tramèr and Boneh 19] Adversarial Training and Robustness for Multiple Perturbations, NeurIPS 2019.
[Shafahi et al. 19] Adversarial Training for Free!, NeurIPS 2019.
[Wong et al. 20] Fast is Better Than Free: Revisiting Adversarial Training, ICLR 2020.
[Madaan et al. 20] Adversarial Neural Pruning with Latent Vulnerability Suppression, ICML 2020.
[Croce and Hein 20] Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks, ICML 2020.
[Maini et al. 20] Adversarial Robustness Against the Union of Multiple Perturbation Models, ICML 2020.
[Kim et al. 20] Adversarial Self-Supervised Contrastive Learning, NeurIPS 2020.
[Wu et al. 20] Adversarial Weight Perturbation Helps Robust Generalization, NeurIPS 2020.
[Laidlaw et al. 21] Perceptual Adversarial Robustness: Defense Against Unseen Threat Models, ICLR 2021.
[Pang et al. 21] Bag of Tricks for Adversarial Training, ICLR 2021.
[Madaan et al. 21] Learning to Generate Noise for Multi-Attack Robustness, ICML 2021.

[Mladenovic et al. 22] Online Adversarial Attacks, ICLR 2022.
[Zhang et al. 22] How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective, ICLR 2022.
[Carlini and Terzis 22] Poisoning and Backdooring Contrastive Learning, ICLR 2022.
[Croce et al. 22] Evaluating the Adversarial Robustness of Adaptive Test-time Defenses, ICML 2022.
[Zhou et al. 22] Understanding the Robustness in Vision Transformers, ICML 2022.

Graph Neural Networks

[Li et al. 16] Gated Graph Sequence Neural Networks, ICLR 2016.
[Hamilton et al. 17] Inductive Representation Learning on Large Graphs, NIPS 2017.
[Kipf and Welling 17] Semi-Supervised Classification with Graph Convolutional Networks, ICLR 2017.
[Velickovic et al. 18] Graph Attention Networks, ICLR 2018.
[Ying et al. 18] Hierarchical Graph Representation Learning with Differentiable Pooling, NeurIPS 2018.
[Xu et al. 19] How Powerful are Graph Neural Networks?, ICLR 2019.
[Maron et al. 19] Provably Powerful Graph Networks, NeurIPS 2019.
[Yun et al. 19] Graph Transformer Networks, NeurIPS 2019.
[Loukas 20] What Graph Neural Networks Cannot Learn: Depth vs Width, ICLR 2020.
[Bianchi et al. 20] Spectral Clustering with Graph Neural Networks for Graph Pooling, ICML 2020.
[Xhonneux et al. 20] Continuous Graph Neural Networks, ICML 2020.
[Garg et al. 20] Generalization and Representational Limits of Graph Neural Networks, ICML 2020.
[Baek et al. 21] Accurate Learning of Graph Representations with Graph Multiset Pooling, ICLR 2021.
[Liu et al. 21] Elastic Graph Neural Networks, ICML 2021.
[Li et al. 21] Training Graph Neural Networks with 1000 Layers, ICML 2021.
[Jo et al. 21] Edge Representation Learning with Hypergraphs, NeurIPS 2021.

[Guo et al. 22] Data-Efficient Graph Grammar Learning for Molecular Generation, ICLR 2022.
[Geerts et al. 22] Expressiveness and Approximation Properties of Graph Neural Networks, ICLR 2022.
[Bevilacqua et al. 22] Equivariant Subgraph Aggregation Networks, ICLR 2022.
[Jo et al. 22] Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations, ICML 2022.
[Hoogeboom et al. 22] Equivariant Diffusion for Molecule Generation in 3D, ICML 2022.

Federated Learning

[Konečný et al. 16] Federated Optimization: Distributed Machine Learning for On-Device Intelligence, arXiv Preprint, 2016.
[Konečný et al. 16] Federated Learning: Strategies for Improving Communication Efficiency, NIPS Workshop on Private Multi-Party Machine Learning 2016.
[McMahan et al. 17] Communication-Efficient Learning of Deep Networks from Decentralized Data, AISTATS 2017.
[Smith et al. 17] Federated Multi-Task Learning, NIPS 2017.
[Li et al. 20] Federated Optimization in Heterogeneous Networks, MLSys 2020.
[Yurochkin et al. 19] Bayesian Nonparametric Federated Learning of Neural Networks, ICML 2019.
[Bonawitz et al. 19] Towards Federated Learning at Scale: System Design, MLSys 2019.
[Wang et al. 20] Federated Learning with Matched Averaging, ICLR 2020.
[Li et al. 20] On the Convergence of FedAvg on Non-IID data, ICLR 2020.
[Karimireddy et al. 20] SCAFFOLD: Stochastic Controlled Averaging for Federated Learning, ICML 2020.
[Hamer et al. 20] FedBoost: Communication-Efficient Algorithms for Federated Learning, ICML 2020.
[Rothchild et al. 20] FetchSGD: Communication-Efficient Federated Learning with Sketching, ICML 2020.
[Fallah et al. 20] Personalized Federated Learning with Theoretical Guarantees: A Model-Agnostic Meta-Learning Approach, NeurIPS 2020.
[Reddi et al. 21] Adaptive Federated Optimization, ICLR 2021.
[Jeong et al. 21] Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning, ICLR 2021.
[Yoon et al. 21] Federated Continual Learning with Weighted Inter-client Transfer, ICML 2021.
[Li et al. 21] Ditto: Fair and Robust Federated Learning Through Personalization, ICML 2021.

Neural Architecture Search

[Zoph and Le 17] Neural Architecture Search with Reinforcement Learning, ICLR 2017.
[Baker et al. 17] Designing Neural Network Architectures using Reinforcement Learning, ICLR 2017.
[Real et al. 17] Large-Scale Evolution of Image Classifiers, ICML 2017.
[Liu et al. 18] Hierarchical Representations for Efficient Architecture Search, ICLR 2018.
[Pham et al. 18] Efficient Neural Architecture Search via Parameters Sharing, ICML 2018.
[Luo et al. 18] Neural Architecture Optimization, NeurIPS 2018.
[Liu et al. 19] DARTS: Differentiable Architecture Search, ICLR 2019.
[Tan et al. 19] MnasNet: Platform-Aware Neural Architecture Search for Mobile, CVPR 2019.
[Cai et al. 19] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware, ICLR 2019.
[Zhou et al. 19] BayesNAS: A Bayesian Approach for Neural Architecture Search, ICML 2019.
[Tan and Le 19] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, ICML 2019.
[Guo et al. 19] NAT: Neural Architecture Transformer for Accurate and Compact Architectures, NeurIPS 2019.
[Chen et al. 19] DetNAS: Backbone Search for Object Detection, NeurIPS 2019.
[Dong and Yang 20] NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search, ICLR 2020.
[Zela et al. 20] Understanding and Robustifying Differentiable Architecture Search, ICLR 2020.
[Cai et al. 20] Once-for-All: Train One Network and Specialize it for Efficient Deployment, ICLR 2020.
[Such et al. 20] Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data, ICML 2020.
[Liu et al. 20] Are Labels Necessary for Neural Architecture Search?, ECCV 2020.
[Dudziak et al. 20] BRP-NAS: Prediction-based NAS using GCNs, NeurIPS 2020.
[Li et al. 20] Neural Architecture Search in A Proxy Validation Loss Landscape, ICML 2020.
[Lee et al. 21] Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets, ICLR 2021.
[Mellor et al. 21] Neural Architecture Search without Training, ICML 2021.

Large Language Models

[Shoeybi et al. 19] Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism, arXiv preprint, 2019.
[Raffel et al. 20] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, JMLR 2020.
[Gururangan et al. 20] Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks, ACL 2020.
[Brown et al. 20] Language Models are Few-shot Learners, NeurIPS 2020.
[Rae et al. 21] Scaling Language Models: Methods, Analysis & Insights from Training Gopher, arXiv preprint, 2021.

[Thoppilan et al. 22] LaMDA: Language Models for Dialog Applications, arXiv preprint, 2022.
[Wei et al. 22] Finetuned Language Models Are Zero-Shot Learners, ICLR 2022.
[Wang et al. 22] Language Modeling via Stochastic Processes, ICLR 2022.
[Alayrac et al. 22] Flamingo: a Visual Language Model for Few-Shot Learning, arXiv preprint, 2022.
[Chowdhery et al. 22] PaLM: Scaling Language Modeling with Pathways, arXiv preprint, 2022.
[Wei et al. 22] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, NeurIPS 2022.

Multimodal Generative Models

[Li et al. 19] Controllable Text-to-Image Generation, NeurIPS 2019.
[Ramesh et al. 21] Zero-Shot Text-to-Image Generation, ICML 2021.
[Radford et al. 21] Learning Transferable Visual Models From Natural Language Supervision, ICML 2021.
[Ding et al. 21] CogView: Mastering Text-to-Image Generation via Transformers, NeurIPS 2021.
[Zou et al. 22] Towards Language-Free Training for Text-to-Image Generation, CVPR 2022.

[Rombach et al. 22] High-Resolution Image Synthesis with Latent Diffusion Models, CVPR 2022.
[Nichol et al. 22] GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models, ICML 2022.
[Saharia et al. 22] Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding, arXiv preprint, 2022.
[Yu et al. 22] Scaling Autoregressive Models for Content-Rich Text-to-Image Generation, arXiv preprint, 2022.
