Numerical Methods and Optimization
1. Dual Methods
- Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization
- Primal-dual Subgradient Methods for Convex Problems
- Stochastic Optimization and Sparse Statistical Recovery: Optimal Algorithms for High Dimensions
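For a concrete sense of the dual averaging update studied in the papers above, here is a minimal sketch of l1-regularized dual averaging (RDA) with logistic loss. The function name `rda_l1`, the step-size constant `gamma`, and the synthetic data are illustrative assumptions, not taken from the papers.

```python
import numpy as np

def rda_l1(X, y, lam=0.01, gamma=5.0, epochs=5, seed=0):
    """Minimal l1-regularized dual averaging (RDA) sketch: average all past
    subgradients and solve the regularized proximal step in closed form
    (soft-thresholding), which keeps iterates exactly sparse."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, g_bar, t = np.zeros(d), np.zeros(d), 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            # logistic-loss subgradient at the current iterate (margin clipped for stability)
            m = np.clip(y[i] * (X[i] @ w), -35, 35)
            g = -y[i] * X[i] / (1.0 + np.exp(m))
            g_bar += (g - g_bar) / t
            # closed-form minimizer of <g_bar, w> + lam*||w||_1 + gamma/(2*sqrt(t))*||w||^2
            w = -(np.sqrt(t) / gamma) * np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)
    return w

# toy usage: sparse logistic regression on synthetic data
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 50))
w_true = np.zeros(50); w_true[:5] = 1.0
y = np.sign(X @ w_true + 0.1 * rng.standard_normal(200))
print(np.count_nonzero(rda_l1(X, y)))
```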
2. Generalization
- Local Optimality and Generalization Guarantees for the Langevin Algorithm via Empirical Metastability
3. Matrix Completion
- Inference and Uncertainty Quantification for Noisy Matrix Completion
- Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent
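As a rough illustration of the non-convex SGD approach in the second paper, here is a sketch that factors the matrix as U V^T and updates one row of each factor per observed entry; the rank, step size, regularization, and toy data are placeholder assumptions, not the paper's prescribed settings.

```python
import numpy as np

def online_mc_sgd(observations, n_rows, n_cols, rank=5, lr=0.05, reg=0.01, seed=0):
    """Minimal non-convex SGD sketch for online matrix completion: model
    M ~ U V^T and, for each observed entry (i, j, m_ij), take one stochastic
    gradient step on 0.5*(U_i . V_j - m_ij)^2 plus ridge regularization."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_rows, rank))
    V = 0.1 * rng.standard_normal((n_cols, rank))
    for i, j, m_ij in observations:
        err = U[i] @ V[j] - m_ij        # residual on the observed entry
        U[i], V[j] = (U[i] - lr * (err * V[j] + reg * U[i]),
                      V[j] - lr * (err * U[i] + reg * V[j]))
    return U, V

# toy usage: random entries of a rank-3 matrix streamed in
rng = np.random.default_rng(1)
A, B = rng.standard_normal((100, 3)), rng.standard_normal((80, 3))
obs = [(i, int(j), A[i] @ B[j]) for i in range(100) for j in rng.integers(80, size=10)]
U, V = online_mc_sgd(obs, 100, 80, rank=3)
```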
4. Numerical Computation
- The Evaluation of Integrals of the Form ∫ f(t) exp(−t²) dt: Application to Logistic-Normal Models
- Faster Eigenvector Computation via Shift-and-Invert Preconditioning
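The first paper concerns integrals of exactly the form ∫ f(t) exp(−t²) dt that arise in logistic-normal models. As a point of reference, here is the standard Gauss-Hermite quadrature baseline for such an integral (a generic baseline, not the paper's own method), applied to a logistic-normal mean:

```python
import numpy as np

def logistic_normal_mean(mu, sigma, n_nodes=64):
    """Approximate E[sigmoid(mu + sigma*Z)] for Z ~ N(0,1) by rewriting it as
    (1/sqrt(pi)) * integral of f(t)*exp(-t^2) dt (with t = z/sqrt(2)) and
    applying Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    vals = 1.0 / (1.0 + np.exp(-(mu + sigma * np.sqrt(2.0) * nodes)))
    return (weights @ vals) / np.sqrt(np.pi)

print(logistic_normal_mean(mu=0.5, sigma=1.2))
```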
5. Optimization
- A Differential Equation for Modeling Nesterov’s Accelerated Gradient Method
- SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
- Accelerating Stochastic Gradient Descent using Predictive Variance Reduction
- Semi-Stochastic Gradient Descent Methods
- Faster Rates for the Frank-Wolfe Method over Strongly-Convex Sets
- SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator
- How to Escape Saddle Points Efficiently
- Adaptive Sampling Strategies for Stochastic Optimization
- Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval, Matrix Completion, and Blind Deconvolution
- Convergence of a Stochastic Gradient Method with Momentum for Non-Smooth Non-Convex Optimization
- Stochastic Subgradient Method Converges at the Rate O(k^{-1/4}) on Weakly Convex Functions
- Online Non-Convex Learning: Following the Perturbed Leader is Optimal
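Several of the papers in this group (SVRG, SARAH, SPIDER, semi-stochastic gradient descent) are built around variance-reduced gradient estimators. As a minimal illustration of the idea, here is a sketch of the SVRG recursion; the callback names `grad_i` and `full_grad`, the step size, and the epoch length are assumptions made for the example, not the papers' exact parameter choices.

```python
import numpy as np

def svrg(grad_i, full_grad, w0, n, lr=0.01, epochs=10, m=None, seed=0):
    """Minimal SVRG sketch (stochastic variance-reduced gradient): each outer
    epoch stores a snapshot and its full gradient, and the inner loop uses the
    control variate grad_i(w) - grad_i(w_snap) + full_grad(w_snap)."""
    rng = np.random.default_rng(seed)
    m = m or 2 * n
    w = w0.copy()
    for _ in range(epochs):
        w_snap = w.copy()
        mu = full_grad(w_snap)                          # full gradient at the snapshot
        for _ in range(m):
            i = rng.integers(n)
            g = grad_i(w, i) - grad_i(w_snap, i) + mu   # variance-reduced estimate
            w -= lr * g
    return w

# toy usage: least squares, f_i(w) = 0.5*(x_i^T w - y_i)^2
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 20))
y = X @ rng.standard_normal(20)
grad_i = lambda w, i: (X[i] @ w - y[i]) * X[i]
full_grad = lambda w: X.T @ (X @ w - y) / len(y)
w_hat = svrg(grad_i, full_grad, np.zeros(20), n=len(y))
```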
6. Phase Retrieval/Single Index Model
- Online Stochastic Gradient Descent with Arbitrary Initialization Solves Non-smooth, Non-convex Phase Retrieval
- Phase Retrieval via Randomized Kaczmarz: Theoretical Guarantees
- Phase Retrieval via Wirtinger Flow: Theory and Algorithms
- Efficient Learning of Generalized Linear and Single Index Models
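For intuition about the Wirtinger flow approach listed above, here is a real-valued sketch: spectral initialization followed by gradient descent on the quartic least-squares loss. The step size and scaling below are simplified placeholders rather than the paper's prescribed schedule.

```python
import numpy as np

def wirtinger_flow(A, y, steps=500, lr=None, seed=0):
    """Minimal real-valued Wirtinger-flow-style sketch for phase retrieval from
    y_i = (a_i^T x)^2: spectral initialization, then gradient descent on the
    loss (1/2m) * sum(((A @ z)**2 - y)**2)."""
    m, d = A.shape
    # spectral initialization: top eigenvector of (1/m) * sum_i y_i a_i a_i^T
    Y = (A * y[:, None]).T @ A / m
    _, eigvecs = np.linalg.eigh(Y)
    z = eigvecs[:, -1] * np.sqrt(y.mean())
    lr = lr or 0.1 / y.mean()                 # crude stand-in for the paper's step-size rule
    for _ in range(steps):
        r = (A @ z) ** 2 - y                  # residuals on the quadratic measurements
        grad = A.T @ (r * (A @ z)) / m        # gradient of the quartic loss
        z -= lr * grad
    return z

# toy usage: recover x (up to sign) from quadratic measurements
rng = np.random.default_rng(0)
d, m = 20, 400
x = rng.standard_normal(d)
A = rng.standard_normal((m, d))
z_hat = wirtinger_flow(A, (A @ x) ** 2)
```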
7. Principal Component Analysis
- Fast and Simple PCA via Convex Optimization
- Gradient Descent Meets Shift-and-Invert Preconditioning for Eigenvector Computation
- Faster Eigenvector Computation via Shift-and-Invert Preconditioning
- LazySVD: Even Faster SVD Decomposition Yet Without Agonizing Pain
- Fast Stochastic Algorithms for SVD and PCA: Convergence Properties and Convexity
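The shift-and-invert papers above accelerate eigenvector computation by running the power method on (shift·I − M)^{-1} while solving each linear system only approximately with fast stochastic solvers. The sketch below shows that outer loop with an exact dense solve standing in for the approximate solver; the shift selection and data are illustrative assumptions.

```python
import numpy as np

def shift_invert_top_eigvec(M, shift, iters=50, seed=0):
    """Minimal shift-and-invert power-method sketch: power iteration on
    (shift*I - M)^{-1} converges to the top eigenvector of M once shift is a
    slight over-estimate of the largest eigenvalue.  Each solve is exact here;
    the papers replace it with a fast approximate linear-system solver."""
    rng = np.random.default_rng(seed)
    d = M.shape[0]
    B = shift * np.eye(d) - M
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = np.linalg.solve(B, v)    # one (approximate, in practice) solve per step
        v /= np.linalg.norm(v)
    return v

# toy usage: top eigenvector of a sample covariance matrix
rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 30))
M = X.T @ X / 1000
shift = 1.01 * np.linalg.eigvalsh(M).max()   # in practice the shift is estimated cheaply
v_top = shift_invert_top_eigvec(M, shift)
```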
8. Second-Order Methods
- First-Order Methods for Nonconvex Quadratic Minimization
- Newton-type methods for non-convex optimization under inexact Hessian information
- Second-Order Stochastic Optimization for Machine Learning in Linear Time
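These papers study Newton-type steps built from inexact (for example, subsampled) Hessian information. A minimal sketch for logistic regression, with a full gradient and a Hessian estimated from a random subsample, might look like the following; the batch size, regularization, and exact linear solve are simplifying assumptions made for brevity.

```python
import numpy as np

def subsampled_newton_logreg(X, y, batch=256, steps=20, reg=1e-3, seed=0):
    """Minimal inexact-Newton sketch for logistic regression (labels in {0,1}):
    full gradient, Hessian built from a random subsample, exact solve of the
    regularized Newton system."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))              # predicted probabilities
        grad = X.T @ (p - y) / n + reg * w              # full gradient
        idx = rng.choice(n, size=min(batch, n), replace=False)
        s = p[idx] * (1.0 - p[idx])                     # per-sample curvature weights
        H = X[idx].T @ (X[idx] * s[:, None]) / len(idx) + reg * np.eye(d)
        w -= np.linalg.solve(H, grad)                   # subsampled Newton step
    return w

# toy usage on synthetic labels
rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 30))
probs = 1.0 / (1.0 + np.exp(-(X @ rng.standard_normal(30))))
y = (rng.random(2000) < probs).astype(float)
w_hat = subsampled_newton_logreg(X, y)
```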
9. SGD Analysis
- Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning
- Optimal Rates for Zero-order Convex Optimization: the Power of Two Function Evaluations
- Train Faster, Generalize Better: Stability of Stochastic Gradient Descent
- Stability Analysis of SGD Through the Normalized Loss Function
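The zero-order paper in this group analyzes gradient estimators built from just two function evaluations along a random direction. A minimal sketch of that estimator, with an illustrative smoothing radius and a toy quadratic objective, is below.

```python
import numpy as np

def two_point_grad(f, x, delta=1e-4, rng=None):
    """Two-function-evaluation gradient estimate along a random unit direction u:
    d * (f(x + delta*u) - f(x - delta*u)) / (2*delta) * u, which is unbiased for
    the gradient of a smoothed version of f."""
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.size)
    u /= np.linalg.norm(u)
    return x.size * (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u

# toy usage: zeroth-order gradient descent on a quadratic
f = lambda x: 0.5 * np.sum(x ** 2)
x = np.ones(10)
rng = np.random.default_rng(0)
for _ in range(2000):
    x -= 0.05 * two_point_grad(f, x, rng=rng)
```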
10. Stochastic Gradient Langevin Dynamics
- Non-convex Learning via Stochastic Gradient Langevin Dynamics: A Nonasymptotic Analysis
- Local Optimality and Generalization Guarantees for the Langevin Algorithm via Empirical Metastability
- A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics
- On Stationary-Point Hitting Time and Ergodicity of Stochastic Gradient Langevin Dynamics
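All of these papers analyze variants of the same recursion: a stochastic gradient step plus injected Gaussian noise. Here is a minimal SGLD sketch; the callback `grad_i`, the inverse temperature `beta`, and the toy objective are assumptions for illustration, not parameters taken from the papers.

```python
import numpy as np

def sgld(grad_i, w0, n, step=1e-3, beta=10.0, iters=5000, batch=32, seed=0):
    """Minimal SGLD sketch: minibatch gradient step plus Gaussian noise with
    variance 2*step/beta, where beta is the inverse temperature."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(iters):
        idx = rng.integers(n, size=batch)
        g = np.mean([grad_i(w, i) for i in idx], axis=0)   # minibatch gradient
        w = w - step * g + np.sqrt(2.0 * step / beta) * rng.standard_normal(w.shape)
    return w

# toy usage: sampling around the minimizer of f(w) = mean_i 0.5*||w - x_i||^2
rng = np.random.default_rng(1)
data = rng.standard_normal((500, 5)) + 2.0
w_sample = sgld(lambda w, i: w - data[i], np.zeros(5), n=500)
```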
11. Other SGD Variants
- SignSGD: Compressed Optimisation for Non-Convex Problems
- SignSGD via Zeroth-Order Oracle
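For reference, the signSGD update replaces the stochastic gradient with its elementwise sign, which doubles as 1-bit-per-coordinate compression; the zeroth-order variant estimates that sign from function evaluations only. A minimal sketch of the first-order version, with a toy noisy-quadratic gradient as a stand-in objective, is below.

```python
import numpy as np

def signsgd(grad_fn, w0, lr=1e-2, iters=1000):
    """Minimal signSGD sketch: descend along the elementwise sign of a
    stochastic gradient (a 1-bit-per-coordinate compressed step)."""
    w = w0.copy()
    for _ in range(iters):
        w -= lr * np.sign(grad_fn(w))
    return w

# toy usage: noisy quadratic gradient
rng = np.random.default_rng(0)
w_hat = signsgd(lambda w: w + 0.1 * rng.standard_normal(w.shape), np.full(10, 5.0))
```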