Deep Learning
General Structures and Techniques in Deep Learning
1. Adversarial Training
- Adversarial risk via optimal transport and optimal couplings
- Certified Robustness to Adversarial Examples with Differential Privacy
- Certified Adversarial Robustness via Randomized Smoothing
- Certified Robustness to Label-Flipping Attacks via Randomized Smoothing
- Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers
- Robust Estimation and Generative Adversarial Networks
- Robust Descent using Smoothed Multiplicative Noise
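
Several of the certification papers above share one core construction: classify many Gaussian-noised copies of the input and take a majority vote. A minimal Monte Carlo sketch of that prediction rule, assuming a trained `base_classifier` mapping a batch to logits (the `sigma` and `n_samples` values here are illustrative, not from any paper):

```python
import torch

def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=100):
    """Majority-vote class of f(x + eps), eps ~ N(0, sigma^2 I)."""
    with torch.no_grad():
        # Replicate x, add i.i.d. Gaussian noise, classify every copy.
        noisy = x.unsqueeze(0) + sigma * torch.randn(n_samples, *x.shape)
        votes = base_classifier(noisy).argmax(dim=1)
        return torch.mode(votes).values.item()
```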
2. Architecture
- Dynamic Routing Between Capsules
- Matrix Capsules with EM Routing
- Deep Residual Learning for Image Recognition
- Neural Ordinary Differential Equations
- Augmented Neural ODEs
- Fully Convolutional Networks for Semantic Segmentation
- Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
- TabNet: Attentive Interpretable Tabular Learning (Arik and Pfister, AAAI 2021)
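
The residual connection that several of these architectures build on is a one-line idea: learn a correction F(x) and output F(x) + x. A minimal PyTorch block in the spirit of "Deep Residual Learning for Image Recognition" (channel counts are illustrative):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        # y = F(x) + x: the identity path lets gradients bypass F entirely.
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)
```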
3. Deep Causal Inference
- Causal Effect Inference with Deep Latent-Variable Models
- Causal Deep Information Bottleneck
- Learning Representations for Counterfactual Inference
- Estimating Individual Treatment Effect: Generalization Bounds and Algorithms
4. Deep Learning Theory
- Deep Neural Networks as Gaussian Processes
- Neural Tangent Kernel: Convergence and Generalization in Neural Networks
- Theoretical Guarantees for Sampling and Inference in Generative Models with Latent Diffusions
- Deep Double Descent: Where Bigger Models and More Data Hurt
- On Lazy Training in Differentiable Programming
5. Feature Interaction
- Deep & Cross Network for Ad Click Predictions (Wang et al., KDD 2017)
- xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems (Lian et al., KDD 2018)
- AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks (Song et al., CIKM 2019)
- S3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization (Zhou et al., CIKM 2020)
6. Generative Adversarial Network
- Generative Adversarial Nets
- Towards Principled Methods for Training Generative Adversarial Networks
- Wasserstein GAN
- Improved Techniques for Training GANs
- Training Generative Neural Networks via Maximum Mean Discrepancy Optimization
- Generative Moment Matching Networks
- MMD GAN: Towards Deeper Understanding of Moment Matching Network
- On Gradient Regularizers for MMD GANs
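
The original GAN recipe alternates a discriminator step with a non-saturating generator step. A toy sketch on 1-D Gaussian data, assuming nothing beyond PyTorch; all network sizes and hyperparameters are illustrative:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for _ in range(2000):
    real = 2.0 + 0.5 * torch.randn(64, 1)   # target distribution: N(2, 0.5^2)
    fake = G(torch.randn(64, 8))
    # Discriminator step: push real -> 1, fake -> 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Non-saturating generator step: make D label fakes as real.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

The MMD papers in this section replace the learned discriminator with a fixed kernel two-sample statistic, and Wasserstein GAN replaces the BCE objective with an estimate of the earth-mover distance.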
7. Generalization
- Dark Knowledge
- Distilling the Knowledge in a Neural Network
- Understanding Black-box Predictions via Influence Functions
- Understanding Deep Learning Requires Rethinking Generalization
- DeepFool: a simple and accurate method to fool deep neural networks
- On the Accuracy of Influence Functions for Measuring Group Effects
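
For the two distillation papers, the key object is the soft-target loss: match the student's temperature-softened distribution to the teacher's. A sketch, assuming precomputed logits (the `T` and `alpha` values are illustrative):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)   # T^2 keeps the soft-target gradients on the hard-label scale
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```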
8. Graph Models
- Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (Defferrard et al., NIPS 2016)
- Semi-Supervised Classification with Graph Convolutional Networks (Kipf and Welling, ICLR 2017)
- Simplifying Graph Convolutional Networks (Wu et al., ICML 2019)
- Inductive Representation Learning on Large Graphs (Hamilton et al., NIPS 2017)
- Graph Attention Networks (Veličković et al., ICLR 2018)
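
The propagation rule shared by the GCN line of work is H' = σ(D^{-1/2} (A + I) D^{-1/2} H W). A dense NumPy sketch of one layer (real implementations use sparse operations; the inputs here are stand-ins for a real graph):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN step: relu(normalized adjacency @ H @ W)."""
    A_hat = A + np.eye(A.shape[0])                     # add self-loops
    d_inv_sqrt = np.diag(A_hat.sum(axis=1) ** -0.5)    # D^{-1/2}
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ W, 0.0)
```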
9. Information Theory
- The Information Bottleneck Method
- Deep Variational Information Bottleneck
- Deep Learning and the Information Bottleneck Principle
- Opening the Black Box of Deep Neural Networks via Information
- On the Information Bottleneck Theory of Deep Learning
- Estimating Information Flow in Deep Neural Networks
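
All of these papers revolve around one objective, the IB Lagrangian: compress the input X into a representation T while keeping T informative about the label Y,

```latex
\min_{p(t \mid x)} \; I(X; T) - \beta \, I(T; Y)
```

where β trades compression against prediction.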
10. Measuring Uncertainty
- Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
- On Calibration of Modern Neural Networks
- Predicting Good Probabilities With Supervised Learning
- A Comprehensive Review of Neural Network-based Prediction Intervals and New Advances
- Lower Upper Bound Estimation Method for Construction of Neural Network-Based Prediction Intervals
- Estimating the Mean and Variance of the Target Probability Distribution
- Wind Power Interval Prediction Based on Improved PSO and BP Neural Network
- Prediction intervals for artificial neural networks
- Practical confidence and prediction intervals
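
The calibration papers above motivate a remarkably small fix worth seeing concretely: temperature scaling fits one scalar T on held-out logits and divides test logits by it. A sketch, assuming `val_logits` and `val_labels` come from an already trained model:

```python
import torch
import torch.nn.functional as F

def fit_temperature(val_logits, val_labels):
    """Fit a single temperature by minimizing NLL on validation logits."""
    log_t = torch.zeros(1, requires_grad=True)   # optimize log T so T stays positive
    opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)

    def closure():
        opt.zero_grad()
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().item()   # divide test logits by this T before softmax
```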
11. Meta Learning
- Learning to Learn by Gradient Descent by Gradient Descent
- Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (Finn et al., ICML 2017)
- Optimization as a Model for Few-shot Learning
- Neural Architecture Search: A Survey (Elsken et al., JMLR 2019)
- Learning Transferable Architectures for Scalable Image Recognition (Zoph et al., CVPR 2018)
- Neural Architecture Search with Reinforcement Learning (Zoph and Le, ICLR 2017)
- DARTS: Differentiable Architecture Search (Liu et al., ICLR 2019)
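
The MAML loop is easiest to see on a scalar problem: one inner gradient step per task, then an outer step that differentiates through the inner one. A toy sketch with explicit tensors (the task data and learning rates are illustrative):

```python
import torch

w = torch.randn(1, requires_grad=True)   # meta-learned initialization
inner_lr, outer_lr = 0.1, 0.01

# Hypothetical task: support/query pairs drawn from y = 3x.
xs, ys = torch.tensor([1.0]), torch.tensor([3.0])
xq, yq = torch.tensor([2.0]), torch.tensor([6.0])

# Inner step on the support set; create_graph=True keeps the graph so the
# outer update can differentiate through the adaptation.
inner_loss = ((w * xs - ys) ** 2).mean()
(g,) = torch.autograd.grad(inner_loss, w, create_graph=True)
w_adapted = w - inner_lr * g

# Outer step: evaluate adapted weights on the query set, update the init.
outer_loss = ((w_adapted * xq - yq) ** 2).mean()
outer_loss.backward()
with torch.no_grad():
    w -= outer_lr * w.grad
```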
12. Natural Language Processing
- Teaching Machines to Read and Comprehend
- Using the output embedding to improve language models
- Tying word vectors and word classifiers: A loss framework for language modeling
- Smart Reply: Automated Response Suggestion for Email
- An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling (Bai et al., 2018)
- Temporal Convolutional Networks for Action Segmentation and Detection
- Temporal Convolutional Attention-based Network For Sequence Modeling
- Long Short-Term Memory-Networks for Machine Reading
- Effective Approaches to Attention-based Neural Machine Translation
- Neural Machine Translation by Jointly Learning to Align and Translate
- Sequence to Sequence Learning with Neural Networks
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
- Attention Is All You Need
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Neural Turing Machines
- Pointer Networks
- ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information (Sun et al., ACL 2021)
- Chinese NER Using Lattice LSTM (Zhang and Yang, ACL 2018)
- An Encoding Strategy Based Word-Character LSTM for Chinese NER (Liu et al., ACL 2019)
- A Neural Multi-digraph Model for Chinese NER with Gazetteers (Ding et al., ACL 2019)
- Leverage Lexical Knowledge for Chinese Named Entity Recognition via Collaborative Graph Network (Sui et al., EMNLP 2019)
- A Lexicon-Based Graph Neural Network for Chinese NER (Gui et al., EMNLP 2019)
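
The common primitive behind the attention and Transformer papers above is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k) V. A minimal sketch (shapes are illustrative):

```python
import math
import torch

def attention(Q, K, V, mask=None):
    scores = Q @ K.transpose(-2, -1) / math.sqrt(Q.size(-1))  # (..., L_q, L_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ V

q = k = v = torch.randn(2, 5, 16)   # batch 2, length 5, d_k = 16 (self-attention)
out = attention(q, k, v)            # -> (2, 5, 16)
```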
13. Privacy
- Deep Leakage from Gradients
14. Reinforcement Learning
- Mastering the Game of Go with Deep Neural Networks and Tree Search
- Mastering the Game of Go without Human Knowledge
- Continuous control with deep reinforcement learning
- Deterministic Policy Gradient Algorithms
- Trust Region Policy Optimization
- Deep Reinforcement Learning with Double Q-learning
- Prioritized Experience Replay
- Taming the Noise in Reinforcement Learning via Soft Updates
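
Double Q-learning is simplest in its tabular form: one value table selects the next action, the other evaluates it, which damps the overestimation caused by a single max. A NumPy sketch (the transition arguments come from a hypothetical environment loop):

```python
import numpy as np

def double_q_update(Q1, Q2, s, a, r, s_next, alpha=0.1, gamma=0.99):
    if np.random.rand() < 0.5:
        a_star = int(np.argmax(Q1[s_next]))   # select with Q1 ...
        Q1[s, a] += alpha * (r + gamma * Q2[s_next, a_star] - Q1[s, a])  # ... evaluate with Q2
    else:
        a_star = int(np.argmax(Q2[s_next]))   # and symmetrically
        Q2[s, a] += alpha * (r + gamma * Q1[s_next, a_star] - Q2[s, a])
```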
15. Semi-supervised Learning
- Semi-supervised Learning with Deep Generative Models
- Semi-supervised Learning with Ladder Networks
- Auxiliary Deep Generative Models
- Semi-Supervised Learning with Generative Adversarial Networks
- Semi-supervised Learning with GANs: Revisiting Manifold Regularization
- Data-Efficient Image Recognition with Contrastive Predictive Coding
- Temporal Ensembling for Semi-Supervised Learning
- Good Semi-supervised Learning that Requires a Bad GAN
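
A recurring idea in this section is consistency regularization: two stochastic passes over the same unlabeled input should agree. A sketch in the spirit of the Pi-model from "Temporal Ensembling for Semi-Supervised Learning", assuming `model` uses dropout or similar per-pass noise (the consistency weight is illustrative):

```python
import torch.nn.functional as F

def pi_model_loss(model, x_labeled, y_labeled, x_unlabeled, weight=1.0):
    sup = F.cross_entropy(model(x_labeled), y_labeled)
    # The two passes differ because dropout samples independent masks.
    p1 = F.softmax(model(x_unlabeled), dim=1)
    p2 = F.softmax(model(x_unlabeled), dim=1)
    return sup + weight * F.mse_loss(p1, p2)
```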
16. Stochastic Optimization and Generalization
- On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
- Sharp Minima Can Generalize For Deep Nets
- Entropy-SGD: Biasing Gradient Descent into Wide Valleys
- Geometry of Optimization and Implicit Regularization in Deep Learning
- Path-SGD: Path-Normalized Optimization in Deep Neural Networks
- Norm-Based Capacity Control in Neural Networks
- An Empirical Analysis of the Optimization of Deep Network Loss Surfaces
- Theory of Deep Learning II: Landscape of the Empirical Risk in Deep Learning
- Theory of Deep Learning III: Generalization Properties of SGD
- Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes
- On Dropout and Nuclear Norm Regularization
- Neural Stochastic Differential Equations: Deep Latent Gaussian Models in the Diffusion Limit
- The Landscape of Empirical Risk for Non-convex Losses
17. Time Series
- Deep State Space Models for Time Series Forecasting
- Deep Factors for Forecasting
- DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks
- Structured Inference Networks for Nonlinear State Space Models
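
The DeepAR-style setup is an RNN that emits the parameters of a predictive distribution per step and trains on negative log-likelihood. A Gaussian-head sketch (network sizes are illustrative, not the paper's configuration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianRNN(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.mu = nn.Linear(hidden, 1)
        self.scale = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (batch, time, 1)
        h, _ = self.rnn(x)
        return self.mu(h), F.softplus(self.scale(h)) + 1e-6   # sigma > 0

model = GaussianRNN()
x, target = torch.randn(8, 20, 1), torch.randn(8, 20, 1)     # toy series
mu, sigma = model(x)
loss = nn.GaussianNLLLoss()(mu, target, sigma ** 2)          # takes variance
loss.backward()
```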
18. Training Technique
- Improving neural networks by preventing co-adaptation of feature detectors
- Dropout: A simple way to prevent neural networks from overfitting
- An empirical analysis of dropout in piecewise linear networks
- Understanding Dropout
- Asynchronous Stochastic Gradient Descent with Delay Compensation
- Understanding Synthetic Gradients and Decoupled Neural Interfaces
- Decoupled Neural Interfaces using Synthetic Gradients
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
- How Does Batch Normalization Help Optimization?
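
Of the techniques above, inverted dropout is the one worth writing out: zero units at train time and rescale by 1/keep_prob, so test-time activations need no correction. A pure NumPy sketch:

```python
import numpy as np

def dropout_forward(x, keep_prob=0.5, train=True):
    if not train:
        return x                     # identity at test time
    mask = (np.random.rand(*x.shape) < keep_prob) / keep_prob
    return x * mask                  # E[output] == x thanks to the rescale
```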
19. Transfer Learning
- How transferable are features in deep neural networks?
- The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
- Neural Style Transfer: A Review
- Image Style Transfer Using Convolutional Neural Networks
- Perceptual Losses for Real-Time Style Transfer and Super-Resolution
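
The standard recipe these papers motivate is feature reuse: freeze a pretrained backbone and retrain only a new head. A sketch assuming a recent torchvision; the weight tag and class count are illustrative:

```python
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # ImageNet-pretrained backbone
for p in model.parameters():
    p.requires_grad = False                        # freeze every layer
model.fc = nn.Linear(model.fc.in_features, 10)     # fresh head for 10 classes
# Only model.fc receives gradients during fine-tuning.
```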
20. Unsupervised Learning
- Representation Learning with Contrastive Predictive Coding
- A Simple Framework for Contrastive Learning of Visual Representations
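
Both papers optimize an InfoNCE-style objective: two views of the same example form a positive pair, and all other batch entries serve as negatives. A minimal NT-Xent sketch in the spirit of SimCLR, assuming `z1` and `z2` are the embeddings of the two views (`tau` is illustrative):

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.5):
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)       # (2N, d), unit norm
    sim = (z @ z.t()) / tau                                  # cosine similarities
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))
    # Row i's positive is its other view: i + n (first half) or i - n (second).
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)
```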