Deep Learning

General Structures and Techniques in Deep Learning

1. Adversarial Training

  1. Adversarial risk via optimal transport and optimal couplings
  2. Certified Robustness to Adversarial Examples with Differential Privacy
  3. Certified Adversarial Robustness via Randomized Smoothing
  4. Certified Robustness to Label-Flipping Attacks via Randomized Smoothing
  5. Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers
  6. Robust Estimation and Generative Adversarial Networks
  7. Robust Descent using Smoothed Multiplicative Noise
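
A minimal PyTorch sketch of the randomized-smoothing prediction rule behind entries 3–5; `f`, `sigma`, and `n` are illustrative, and a rigorous certificate would replace the plug-in estimate `p_hat` with a binomial confidence lower bound as in the papers' CERTIFY procedure:

```python
import torch

def smoothed_predict(f, x, sigma=0.25, n=1000, num_classes=10):
    """Monte-Carlo vote for the smoothed classifier g(x) = argmax_c P(f(x + e) = c),
    where e ~ N(0, sigma^2 I) and f is the base classifier returning logits."""
    with torch.no_grad():
        noise = sigma * torch.randn(n, *x.shape)          # n noisy copies of x
        votes = f(x.unsqueeze(0) + noise).argmax(dim=1)   # hard labels per copy
        counts = torch.bincount(votes, minlength=num_classes)
    top = counts.argmax().item()
    p_hat = min(counts[top].item() / n, 1.0 - 1e-6)       # plug-in top-class probability
    radius = 0.0
    if p_hat > 0.5:
        # certified L2 radius R = sigma * Phi^{-1}(p) from randomized smoothing
        radius = sigma * torch.distributions.Normal(0.0, 1.0).icdf(torch.tensor(p_hat)).item()
    return top, radius
```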

2. Architecture

  1. Dynamic Routing Between Capsules
  2. Matrix Capsules with EM Routing
  3. Deep Residual Learning for Image Recognition
  4. Neural Ordinary Differential Equations
  5. Augmented Neural ODEs
  6. Fully Convolutional Networks for Semantic Segmentation
  7. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
  8. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
  9. TabNet: Attentive Interpretable Tabular Learning (Arik and Pfister, AAAI 2021)
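
To make the residual idea of entry 3 concrete, a minimal basic block in PyTorch (channel counts and the identity shortcut are the simplest case; strided and projection shortcuts are omitted):

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic ResNet block: output = ReLU(F(x) + x), with F two conv-BN stages,
    so the layers only have to learn a residual correction to the identity."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)   # identity shortcut
```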

3. Deep Causal Inference

  1. Causal Effect Inference with Deep Latent-Variable Models
  2. Causal Deep Information Bottleneck
  3. Learning Representations for Counterfactual Inference
  4. Estimating Individual Treatment Effect: Generalization Bounds and Algorithms
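
A hedged sketch of the two-headed architecture behind entry 4 (TARNet-style); the full CFR objective adds an IPM penalty between treated and control representations, omitted here, and all layer sizes are illustrative:

```python
import torch
import torch.nn as nn

class TARNet(nn.Module):
    """Shared representation phi(x) with separate outcome heads for control (t=0)
    and treatment (t=1); the estimated ITE is h1(phi(x)) - h0(phi(x))."""
    def __init__(self, d_in, d_rep=64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(d_in, d_rep), nn.ReLU(),
                                 nn.Linear(d_rep, d_rep), nn.ReLU())
        self.head0 = nn.Linear(d_rep, 1)
        self.head1 = nn.Linear(d_rep, 1)

    def forward(self, x, t):
        r = self.phi(x)
        y0, y1 = self.head0(r), self.head1(r)
        y_factual = torch.where(t.bool().unsqueeze(1), y1, y0)  # predict the observed arm
        return y_factual, (y1 - y0).squeeze(1)                  # (factual outcome, ITE estimate)
```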

4. Deep Learning Theory

  1. Deep Neural Networks as Gaussian Processes
  2. Neural Tangent Kernel: Convergence and Generalization in Neural Networks
  3. Theoretical Guarantees for Sampling and Inference in Generative Models with Latent Diffusions
  4. Deep Double Descent: Where Bigger Models and More Data Hurt
  5. On Lazy Training in Differentiable Programming
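
Entry 2's kernel can be probed directly: a sketch of the empirical NTK for any scalar-output network (the example net in the comment is an assumption). In the wide, lazily trained regime studied by entries 2 and 5, this kernel stays nearly constant over training:

```python
import torch

def empirical_ntk(net, x1, x2):
    """Empirical neural tangent kernel K(x, x') = <grad_theta f(x), grad_theta f(x')>."""
    params = [p for p in net.parameters() if p.requires_grad]

    def grad_vec(x):
        out = net(x).squeeze()                     # scalar output f(x)
        grads = torch.autograd.grad(out, params)   # gradient w.r.t. all weights
        return torch.cat([g.reshape(-1) for g in grads])

    return torch.dot(grad_vec(x1), grad_vec(x2)).item()

# e.g. net = torch.nn.Sequential(torch.nn.Linear(2, 512), torch.nn.Tanh(), torch.nn.Linear(512, 1))
```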

5. Feature Interaction

  1. Deep & cross network for ad click predictions. (Wang et al., KDD 2017)
  2. xDeepFM: Combining explicit and implicit feature interactions for recommender systems. (Lian et al., KDD 2018)
  3. AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks (Song et al., CIKM 2019)
  4. S3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization. (Zhou et al., CIKM 2020)
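
The explicit-interaction operation at the heart of entry 1, as a minimal PyTorch layer (initialization and dimensions are illustrative):

```python
import torch
import torch.nn as nn

class CrossLayer(nn.Module):
    """One DCN cross layer: x_{l+1} = x_0 * (x_l^T w) + b + x_l, which raises the
    degree of explicit feature interaction by one per stacked layer."""
    def __init__(self, dim):
        super().__init__()
        self.w = nn.Parameter(0.01 * torch.randn(dim))
        self.b = nn.Parameter(torch.zeros(dim))

    def forward(self, x0, xl):
        # x0, xl: (batch, dim); x_l^T w is a per-example scalar
        return x0 * (xl @ self.w).unsqueeze(1) + self.b + xl
```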

6. Generative Adversarial Network

  1. Generative Adversarial Nets
  2. Towards Principled Methods for Training Generative Adversarial Networks
  3. Wasserstein GAN
  4. Improved Techniques for Training GANs
  5. Training Generative Neural Networks via Maximum Mean Discrepancy Optimization
  6. Generative Moment Matching Networks
  7. MMD GAN: Towards Deeper Understanding of Moment Matching Network
  8. On Gradient Regularizers for MMD GANs
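
Entries 5–7 train generators by matching kernel moments rather than with a discriminator; a minimal biased MMD^2 estimator with a single Gaussian kernel (the papers use kernel mixtures or adversarially learned kernels):

```python
import torch

def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of MMD^2 between samples x ~ P and y ~ Q: the generator
    is trained to drive this moment-matching distance to zero."""
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)                 # pairwise squared distances
        return torch.exp(-d2 / (2 * bandwidth ** 2))  # Gaussian kernel
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()
```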

7. Generalization

  1. Dark Knowledge
  2. Distilling the Knowledge in a Neural Network
  3. Understanding Black-box Predictions via Influence Functions
  4. Understanding Deep Learning Requires Rethinking Generalization
  5. DeepFool: a simple and accurate method to fool deep neural networks
  6. On the Accuracy of Influence Functions for Measuring Group Effects
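
The distillation objective of entries 1–2 in a few lines (`T` and `alpha` are illustrative hyperparameters):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Hinton et al.: match the teacher's softened probabilities (the 'dark
    knowledge' in its wrong-class ratios) plus ordinary cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T   # T^2 keeps gradients on scale
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```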

8. Graph Models

  1. Convolutional neural networks on graphs with fast localized spectral filtering (Defferrard et al., NIPS 2016)
  2. Semi-supervised classification with graph convolutional networks (Kipf and Welling, ICLR 2017)
  3. Simplifying graph convolutional networks (Wu et al., ICML 2019)
  4. Inductive representation learning on large graphs (Hamilton et al., NIPS 2017)
  5. Graph attention networks (Veličković et al., ICLR 2018)
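
A dense-matrix version of the propagation rule from entry 2 (production implementations use sparse ops; sizes are illustrative):

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """Kipf & Welling rule: H' = ReLU( D^{-1/2} (A + I) D^{-1/2} H W )."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out, bias=False)

    def forward(self, h, adj):
        a_hat = adj + torch.eye(adj.size(0), device=adj.device)  # add self-loops
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)                  # D^{-1/2} of A + I
        a_norm = d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)
        return torch.relu(a_norm @ self.lin(h))
```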

9. Information Theory

  1. The Information Bottleneck Method
  2. Deep Variational Information Bottleneck
  3. Deep Learning and the Information Bottleneck Principle
  4. Opening the Black Box of Deep Neural Networks via Information
  5. On the Information Bottleneck Theory of Deep Learning
  6. Estimating Information Flow in Deep Neural Networks
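
A sketch of the training objective from entry 2 (Deep VIB), assuming an encoder that outputs a Gaussian mean and log-variance for the bottleneck z and a decoder that classifies from sampled z:

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    """z = mu + sigma * eps keeps the stochastic bottleneck differentiable."""
    return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

def vib_loss(mu, logvar, logits, labels, beta=1e-3):
    """Cross-entropy plus beta * KL(q(z|x) || N(0, I)); the KL term is a
    variational upper bound on the compression term I(X; Z)."""
    kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=1).mean()
    return F.cross_entropy(logits, labels) + beta * kl

# usage: z = reparameterize(mu, logvar); logits = decoder(z)
```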

10. Measuring Uncertainty

  1. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
  2. On Calibration of Modern Neural Networks
  3. Predicting Good Probabilities With Supervised Learning
  4. A Comprehensive Review of Neural Network-based Prediction Intervals and New Advances
  5. Lower Upper Bound Estimation Method for Construction of Neural Network-Based Prediction Intervals
  6. Estimating the Mean and Variance of the Target Probability Distribution
  7. Wind Power Interval Prediction Based on Improved PSO and BP Neural Network
  8. Prediction intervals for artificial neural networks
  9. Practical confidence and prediction intervals
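
Entry 2's remedy for miscalibration is a single learned scalar; a sketch that fits it on held-out logits with Adam (the paper uses L-BFGS), leaving accuracy untouched since the argmax is unchanged:

```python
import torch
import torch.nn.functional as F

def fit_temperature(logits, labels, steps=200, lr=0.01):
    """Temperature scaling: learn T > 0 so softmax(logits / T) is calibrated."""
    log_t = torch.zeros(1, requires_grad=True)   # optimize log T to keep T positive
    opt = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        opt.step()
    return log_t.exp().item()
```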

11. Meta Learning

  1. Learning to Learn by Gradient Descent by Gradient Descent
  2. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (Finn et al., ICML 2017)
  3. Optimization as a Model for Few-shot Learning
  4. Neural Architecture Search: A Survey (Elsken et al., JMLR 2019)
  5. Learning Transferable Architectures for Scalable Image Recognition (Zoph et al., CVPR 2018)
  6. Neural Architecture Search with Reinforcement Learning (Zoph and Le, ICLR 2017)
  7. DARTS: Differentiable Architecture Search (Liu et al., ICLR 2019)
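
A compact second-order meta-step for entry 2, assuming PyTorch >= 2.0 for `torch.func.functional_call`; `support` and `query` are (x, y) batches from a single task:

```python
import torch
from torch.func import functional_call

def maml_meta_loss(model, loss_fn, support, query, inner_lr=0.01):
    """MAML: adapt on the support set with one gradient step, then score the
    adapted weights on the query set; create_graph=True lets the meta-optimizer
    backpropagate through the inner adaptation."""
    (xs, ys), (xq, yq) = support, query
    names, params = zip(*model.named_parameters())
    grads = torch.autograd.grad(loss_fn(model(xs), ys), params, create_graph=True)
    fast_weights = {n: p - inner_lr * g for n, p, g in zip(names, params, grads)}
    return loss_fn(functional_call(model, fast_weights, (xq,)), yq)

# meta-update: maml_meta_loss(...).backward(); meta_optimizer.step()
```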

12. Natural Language Processing

  1. Teaching Machines to Read and Comprehend
  2. Using the output embedding to improve language models
  3. Tying word vectors and word classifiers: A loss framework for language modeling
  4. Smart Reply: Automated Response Suggestion for Email
  5. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling (Bai et al., 2018)
  6. Temporal Convolutional Networks for Action Segmentation and Detection
  7. Temporal Convolutional Attention-based Network For Sequence Modeling
  8. Long Short-Term Memory-Networks for Machine Reading
  9. Effective Approaches to Attention-based Neural Machine Translation
  10. Neural Machine Translation by Jointly Learning to Align and Translate
  11. Sequence to Sequence Learning with Neural Networks
  12. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
  13. Attention Is All You Need
  14. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  15. Neural Turing Machines
  16. Pointer Networks
  17. ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information (Sun et al., ACL 2021)
  18. Chinese NER Using Lattice LSTM (Zhang and Yang, ACL 2018)
  19. An Encoding Strategy Based Word-Character LSTM for Chinese NER (Liu et al., ACL 2019)
  20. A Neural Multi-digraph Model for Chinese NER with Gazetteers (Ding et al., ACL 2019)
  21. Leverage Lexical Knowledge for Chinese Named Entity Recognition via Collaborative Graph Network (Sui et al., EMNLP 2019)
  22. A Lexicon-Based Graph Neural Network for Chinese NER (Gui et al., EMNLP 2019)
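
The operation at the center of entry 13, sketched without the multi-head projections:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """softmax(Q K^T / sqrt(d_k)) V: each query mixes the values, weighted by
    its scaled similarity to every key."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```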

13. Privacy

  1. Deep Leakage from Gradients
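
A hedged sketch of entry 1's attack: optimize dummy data until its gradients match the gradients a victim shared. It assumes a batch of one, a small model, and a `loss_fn` that accepts soft-label targets (e.g. `F.cross_entropy` on probabilities, PyTorch >= 1.10):

```python
import torch

def leak_from_gradients(model, loss_fn, true_grads, x_shape, num_classes, steps=300):
    """Deep Leakage from Gradients: recover (x, y) by gradient matching."""
    dummy_x = torch.randn(x_shape, requires_grad=True)           # e.g. (1, 3, 32, 32)
    dummy_y = torch.randn(1, num_classes, requires_grad=True)    # soft-label proxy
    opt = torch.optim.LBFGS([dummy_x, dummy_y])

    def closure():
        opt.zero_grad()
        loss = loss_fn(model(dummy_x), torch.softmax(dummy_y, dim=1))
        grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        diff = sum(((g - t) ** 2).sum() for g, t in zip(grads, true_grads))
        diff.backward()
        return diff

    for _ in range(steps):
        opt.step(closure)
    return dummy_x.detach(), dummy_y.detach()
```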

14. Reinforcement Learning

  1. Mastering the Game of Go with Deep Neural Networks and Tree Search
  2. Mastering the Game of Go without Human Knowledge
  3. Continuous control with deep reinforcement learning
  4. Deterministic Policy Gradient Algorithms
  5. Trust Region Policy Optimization
  6. Deep Reinforcement Learning with Double Q-learning
  7. Prioritized Experience Replay
  8. Taming the Noise in Reinforcement Learning via Soft Updates
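
The one-line idea of entry 6 as a target computation (the two network handles and `gamma` are illustrative):

```python
import torch

def double_q_target(reward, next_state, done, online_net, target_net, gamma=0.99):
    """Double DQN: the online network selects the action, the target network
    evaluates it, curbing the overestimation bias of max_a Q(s', a)."""
    with torch.no_grad():
        best_action = online_net(next_state).argmax(dim=1, keepdim=True)
        next_q = target_net(next_state).gather(1, best_action).squeeze(1)
        return reward + gamma * (1.0 - done.float()) * next_q
```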

15. Semi-supervised Learning

  1. Semi-supervised Learning with Deep Generative Models
  2. Semi-supervised Learning with Ladder Networks
  3. Auxiliary Deep Generative Models
  4. Semi-Supervised Learning with Generative Adversarial Networks
  5. Semi-supervised Learning with GANs: Revisiting Manifold Regularization
  6. Data-Efficient Image Recognition with Contrastive Predictive Coding
  7. Temporal Ensembling for Semi-Supervised Learning
  8. Good Semi-supervised Learning that Requires a Bad GAN
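
A minimal consistency loss in the spirit of entry 7's Pi-model: two stochastic forward passes (dropout and/or augmentation active) should agree on unlabeled data. The weight `w` is ramped up over training in the paper but is a constant here:

```python
import torch
import torch.nn.functional as F

def pi_model_loss(model, x_labeled, y, x_unlabeled, w=1.0):
    """Supervised cross-entropy plus an unsupervised agreement penalty; the
    model must be in train mode so the two passes use different dropout masks."""
    sup = F.cross_entropy(model(x_labeled), y)
    p1 = torch.softmax(model(x_unlabeled), dim=1)   # stochastic pass 1
    p2 = torch.softmax(model(x_unlabeled), dim=1)   # stochastic pass 2
    return sup + w * F.mse_loss(p1, p2)
```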

16. Stochastic Optimization and Generalization

  1. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
  2. Sharp Minima Can Generalize For Deep Nets
  3. Entropy-SGD: Biasing Gradient Descent into Wide Valleys
  4. Geometry of Optimization and Implicit Regularization in Deep Learning
  5. Path-SGD: Path-Normalized Optimization in Deep Neural Networks
  6. Norm-based capacity control in neural networks
  7. An Empirical Analysis of the Optimization of Deep Network Loss Surfaces
  8. Theory of Deep Learning II: Landscape of the Empirical Risk in Deep Learning
  9. Theory of Deep Learning III: Generalization Properties of SGD
  10. Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes
  11. On Dropout and Nuclear Norm Regularization
  12. Theoretical Guarantees for Sampling and Inference in Generative Models with Latent Diffusions
  13. Neural Stochastic Differential Equations: Deep Latent Gaussian Models in the Diffusion Limit
  14. The Landscape of Empirical Risk for Non-convex Losses

17. Time Series

  1. Deep State Space Models for Time Series Forecasting
  2. Deep Factors for Forecasting
  3. DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks
  4. Structured Inference Networks for Nonlinear State Space Models
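
A stripped-down forecaster in the style of entry 3: an RNN emits per-step Gaussian parameters and trains by negative log-likelihood (covariate handling and ancestral-sampling forecasts are omitted; sizes are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianRNN(nn.Module):
    """DeepAR-style: at each step the hidden state parameterizes N(mu_t, sigma_t)."""
    def __init__(self, d_in, d_hidden=40):
        super().__init__()
        self.rnn = nn.GRU(d_in, d_hidden, batch_first=True)
        self.mu = nn.Linear(d_hidden, 1)
        self.sigma = nn.Linear(d_hidden, 1)

    def forward(self, x):                       # x: (batch, time, d_in)
        h, _ = self.rnn(x)
        return self.mu(h).squeeze(-1), F.softplus(self.sigma(h)).squeeze(-1)

def nll(mu, sigma, y):
    """Maximize likelihood of the observed series under the emitted Gaussians."""
    return -torch.distributions.Normal(mu, sigma).log_prob(y).mean()
```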

18. Training Technique

  1. Improving neural networks by preventing co-adaptation of feature detectors
  2. Dropout: A simple way to prevent neural networks from overfitting
  3. An empirical analysis of dropout in piecewise linear networks
  4. Understanding Dropout
  5. Asynchronous Stochastic Gradient Descent with Delay Compensation
  6. Understanding Synthetic Gradients and Decoupled Neural Interfaces
  7. Decoupled Neural Interfaces using Synthetic Gradients
  8. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
  9. How Does Batch Normalization Help Optimization
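
Reference sketches of the two workhorses in this list, inverted dropout (entries 1–4) and training-mode batch normalization (entries 8–9), for 2-D activations of shape (batch, features):

```python
import torch

def inverted_dropout(x, p=0.5, training=True):
    """Dividing the kept units by (1 - p) preserves the activation's expected
    value, so inference needs no rescaling."""
    if not training or p == 0.0:
        return x
    mask = (torch.rand_like(x) > p).float()
    return x * mask / (1.0 - p)

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then restore expressiveness with
    a learned scale and shift (running statistics for inference omitted)."""
    mean = x.mean(dim=0)
    var = x.var(dim=0, unbiased=False)
    return gamma * (x - mean) / torch.sqrt(var + eps) + beta
```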

19. Transfer Learning

  1. How transferable are features in deep neural networks?
  2. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
  3. Neural Style Transfer: A Review
  4. Image Style Transfer Using Convolutional Neural Networks
  5. Perceptual Losses for Real-Time Style Transfer and Super-Resolution
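
The style representation shared by entries 3–5, assuming `features` is a (C, H, W) activation map taken from a fixed pretrained network such as VGG:

```python
import torch
import torch.nn.functional as F

def gram_matrix(features):
    """Gatys et al.: style as channel-wise feature correlations at one layer."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return (f @ f.t()) / (c * h * w)

def style_loss(gen_features, style_features):
    """Match Gram matrices of the generated and style images."""
    return F.mse_loss(gram_matrix(gen_features), gram_matrix(style_features))
```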

20. Unsupervised Learning

  1. Representation Learning with Contrastive Predictive Coding
  2. A Simple Framework for Contrastive Learning of Visual Representations
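
The NT-Xent loss of entry 2 (SimCLR) in a few lines; `z1` and `z2` are projected embeddings of two augmented views of the same batch of images:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """Each view's positive is the other view of the same image; all other
    2N - 2 views in the batch act as negatives."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # 2N unit-norm embeddings
    sim = z @ z.t() / temperature                 # scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))             # exclude self-pairs
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```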
