Traditional Statistics
1. Clustering
- Hidden Integrality of SDP Relaxation for Sub-Gaussian Mixture Models
 
- Exponential Error Rates of SDP for Block Models: Beyond Grothendieck's Inequality
 
2. Concentration Inequalities
- Sum-of-Squares Lower Bounds for Sparse PCA
 
- The Masked Sample Covariance Estimator: An Analysis Using Matrix Concentration Inequalities
 
3. Causal Inference
- Causal Models (Hitchcock, The Stanford Encyclopedia of Philosophy)
 
- Identification of Causal Effects using Instrumental Variables
 
- Implications of Confounding for Making Intervention Decisions Using Data Mining
 
- The Blessings of Multiple Causes
 
- Causal Inference in Statistics: An Overview
 
- A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data
 
- Recursive Partitioning for Heterogeneous Causal Effects
 
- Distinguishing Cause from Effect Using Observational Data: Methods and Benchmarks
 
- Double Machine Learning for Treatment and Causal Parameters
 
- Double/Debiased Machine Learning for Treatment and Structural Parameters
 
- Double/Debiased/Neyman Machine Learning of Treatment Effects
 
- Approximate Residual Balancing: Debiased Inference of Average Treatment Effects in High Dimensions
 
- Conformal Inference of Counterfactuals and Individual Treatment Effects
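
The double/debiased machine learning entries above all build on cross-fitting with a Neyman-orthogonal score. Below is a minimal sketch of the cross-fitted partialling-out estimator for a partially linear model, assuming NumPy and scikit-learn are available; the data-generating process, random-forest nuisance learners, and two-fold split are illustrative choices, not taken from any of the listed papers.

```python
# Cross-fitted partialling-out sketch for Y = theta*D + g(X) + eps, D = m(X) + v.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, p, theta = 2000, 10, 1.0
X = rng.normal(size=(n, p))
D = np.sin(X[:, 0]) + rng.normal(size=n)             # confounded treatment
Y = theta * D + X[:, 1] ** 2 + rng.normal(size=n)    # outcome

res_y, res_d = np.empty(n), np.empty(n)
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    # Cross-fitting: nuisance regressions are fit on one fold, residuals formed on the other.
    g_hat = RandomForestRegressor(random_state=0).fit(X[train], Y[train])
    m_hat = RandomForestRegressor(random_state=0).fit(X[train], D[train])
    res_y[test] = Y[test] - g_hat.predict(X[test])
    res_d[test] = D[test] - m_hat.predict(X[test])

theta_hat = res_d @ res_y / (res_d @ res_d)          # orthogonal-score estimate of theta
se = np.sqrt(np.mean(res_d**2 * (res_y - theta_hat * res_d)**2)
             / np.mean(res_d**2)**2 / n)
print(f"theta_hat = {theta_hat:.3f} +/- {1.96 * se:.3f}")
```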
 
4. Mediation Analysis
- Identification, Inference and Sensitivity Analysis for Causal Mediation Effects
 
- A General Approach to Causal Mediation Analysis
 
- Mediation Analysis with Multiple Mediators
 
- Unpacking the Black Box of Causality
 
- Challenges Raised by Mediation Analysis in a High-Dimension Setting
 
- Testing Mediation Effects in High-Dimensional Epigenetic Studies
 
- Estimating and Testing High-dimensional Mediation Effects in Epigenetic Studies
 
- A Comparison of Methods to Test Mediation and Other Intervening Variable Effects
 
- Joint Significance Tests for Mediation Effects of Socioeconomic Adversity on Adiposity via Epigenetics
 
- Genome-wide Analyses of Sparse Mediation Effects Under Composite Null Hypotheses
 
- Sparse Principal Component-based High-dimensional Mediation Analysis
 
- Hypothesis Test of Mediation Effect in Causal Mediation Model with High-Dimensional Continuous Mediators
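
Most of the mediation papers above test the composite null "alpha * beta = 0" through product-of-coefficients or joint-significance logic. Here is a toy single-mediator version on simulated data, assuming NumPy, SciPy, and statsmodels; the linear structural equations and coefficient values are illustrative, and this is not any specific paper's high-dimensional procedure.

```python
# Sobel and joint-significance (MaxP) tests for a single mediator on simulated data.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=n)                        # exposure
M = 0.4 * X + rng.normal(size=n)              # mediator model: alpha = 0.4
Y = 0.5 * M + 0.2 * X + rng.normal(size=n)    # outcome model: beta = 0.5

fit_m = sm.OLS(M, sm.add_constant(X)).fit()
fit_y = sm.OLS(Y, sm.add_constant(np.column_stack([M, X]))).fit()
alpha, se_a = fit_m.params[1], fit_m.bse[1]
beta, se_b = fit_y.params[1], fit_y.bse[1]

sobel_z = alpha * beta / np.sqrt(alpha**2 * se_b**2 + beta**2 * se_a**2)
p_sobel = 2 * stats.norm.sf(abs(sobel_z))
# Joint significance: reject the composite null only if both path p-values are small.
p_joint = max(fit_m.pvalues[1], fit_y.pvalues[1])
print(f"indirect effect {alpha * beta:.3f}, Sobel p = {p_sobel:.3g}, MaxP p = {p_joint:.3g}")
```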
 
5. Differential Privacy
- Deep Learning with Gaussian Differential Privacy
 
- Deep Learning with Differential Privacy
 
- Oracle Efficient Private Non-Convex Optimization
 
- Smooth Sensitivity and Sampling in Private Data Analysis
 
- On Differentially Private Stochastic Convex Optimization with Heavy-tailed Data
 
- Characterizing Private Clipped Gradient Descent on Convex Generalized Linear Problems
 
- The Cost of Privacy in Generalized Linear Models: Algorithms and Minimax Lower Bounds
 
- Private Stochastic Convex Optimization with Optimal Rates
 
- Preserving Statistical Validity in Adaptive Data Analysis
 
- Private Stochastic Non-Convex Optimization: Adaptive Algorithms and Tighter Generalization Bounds
 
- Differentially Private Empirical Risk Minimization with Non-convex Loss Functions
 
- Understanding Gradient Clipping in Private SGD: A Geometric Perspective
 
- Private PAC Learning Implies Finite Littlestone Dimension
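
Several of the private optimization papers above analyze the same primitive: per-example gradient clipping followed by Gaussian noise. The sketch below shows that update for logistic regression on toy data in plain NumPy; the clipping norm, noise multiplier, and step size are illustrative and are not calibrated to any particular (epsilon, delta) privacy accounting.

```python
# Per-example gradient clipping + Gaussian noise for a logistic-regression objective.
import numpy as np

rng = np.random.default_rng(2)
n, d = 1000, 5
X = rng.normal(size=(n, d))
y = (X @ np.array([1.0, -1.0, 0.5, 0.0, 0.0]) + rng.normal(size=n) > 0).astype(float)

w = np.zeros(d)
clip_norm, noise_mult, lr, batch = 1.0, 1.1, 0.1, 100
for step in range(200):
    idx = rng.choice(n, size=batch, replace=False)
    p = 1 / (1 + np.exp(-X[idx] @ w))
    grads = (p - y[idx])[:, None] * X[idx]                 # per-example gradients
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads / np.maximum(1.0, norms / clip_norm)     # clip each gradient to clip_norm
    noisy_sum = grads.sum(0) + noise_mult * clip_norm * rng.normal(size=d)
    w -= lr * noisy_sum / batch                            # noisy average gradient step
print("weights:", np.round(w, 2))
```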
 
6. Distributed Inference
- Distributed Inference for PCA
 
- Communication-Efficient Distributed Statistical Inference
 
- Communication-Efficient Algorithms for Statistical Optimization
 
- Distributed Estimation of Principal Eigenspaces
 
- Communication-Efficient Algorithms for Distributed Stochastic Principal Component Analysis
 
- Communication Efficient Distributed Optimization using an Approximate Newton-type Method
 
- Communication-Efficient Accurate Statistical Estimation
 
- A Massive Data Framework for M-Estimators with Cubic-Rate
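
The simplest scheme in the distributed-inference literature above is one-shot (divide-and-conquer) averaging: each machine solves its local problem and only the local estimates are communicated. A toy least-squares sketch, assuming NumPy; the number of machines and local sample size are illustrative.

```python
# One-shot averaging: local OLS on each machine, then a single round of averaging.
import numpy as np

rng = np.random.default_rng(3)
m, n_local, d = 20, 500, 5                    # 20 machines, 500 samples each
beta = rng.normal(size=d)

local_estimates = []
for _ in range(m):
    X = rng.normal(size=(n_local, d))
    y = X @ beta + rng.normal(size=n_local)
    local_estimates.append(np.linalg.lstsq(X, y, rcond=None)[0])  # local least squares

beta_avg = np.mean(local_estimates, axis=0)   # only the d-vectors are communicated
print("max coordinate error:", np.abs(beta_avg - beta).max())
```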
 
7. False Discovery Rate
- Black Box FDR
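
As a baseline for this topic, here is a NumPy implementation of the classical Benjamini-Hochberg step-up procedure; this is the textbook method, not the Black Box FDR approach of the paper above, and the simulated p-values are illustrative.

```python
# Benjamini-Hochberg step-up procedure for FDR control at level q.
import numpy as np

def benjamini_hochberg(pvals, q=0.1):
    """Return a boolean mask of rejections controlling FDR at level q."""
    p = np.asarray(pvals)
    order = np.argsort(p)
    ranked = p[order]
    thresh = q * np.arange(1, len(p) + 1) / len(p)
    below = np.nonzero(ranked <= thresh)[0]
    k = below.max() + 1 if below.size else 0      # largest k with p_(k) <= q*k/m
    reject = np.zeros(len(p), dtype=bool)
    reject[order[:k]] = True                      # reject the k smallest p-values
    return reject

pvals = np.concatenate([np.random.default_rng(4).uniform(size=90),  # null p-values
                        np.full(10, 1e-4)])                         # strong signals
print("rejections:", benjamini_hochberg(pvals).sum())
```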
 
8. General Inference
- A Scalable Bootstrap for Massive Data
 
- A Galtonian Perspective on Shrinkage Estimators
 
- Predictive Inference with the Jackknife+
 
- Conformalized Quantile Regression
 
- Conformal Prediction Under Covariate Shift
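
The jackknife+ and conformal papers above refine the basic split-conformal construction. Below is a minimal split-conformal sketch with an absolute-residual score, assuming NumPy and scikit-learn; the base learner, data-generating process, and miscoverage level are illustrative.

```python
# Split conformal prediction: calibrate residual quantile on a held-out set.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
n = 2000
X = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.normal(size=n)

# Proper training set and calibration set.
X_tr, y_tr, X_cal, y_cal = X[:1000], y[:1000], X[1000:], y[1000:]
model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)

alpha = 0.1
scores = np.abs(y_cal - model.predict(X_cal))              # conformity scores
k = int(np.ceil((1 - alpha) * (len(scores) + 1)))          # finite-sample quantile index
qhat = np.sort(scores)[k - 1]

x_new = np.array([[0.5]])
pred = model.predict(x_new)[0]
print(f"90% prediction interval: [{pred - qhat:.2f}, {pred + qhat:.2f}]")
```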
 
9. High Dimensional Linear Regression
- Regression Shrinkage and Selection via the Lasso
 
- Nonparametric Instrumental Regression
 
- On Asymptotically Optimal Confidence Regions and Tests for High-dimensional Models
 
- Confidence Intervals for Low Dimensional Parameters in High Dimensional Linear Models
 
- Sure Independence Screening for Ultrahigh Dimensional Feature Space
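
A toy illustration of the themes in this section: a marginal, sure-independence-screening-style filter followed by a cross-validated lasso fit, assuming NumPy and scikit-learn. The design, sparsity level, and screening size are illustrative, and this is not the debiased-lasso inference machinery of the confidence-interval papers above.

```python
# Marginal screening followed by a cross-validated lasso on a sparse design.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(6)
n, p = 200, 1000
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 2.0                                  # 5 active coefficients
y = X @ beta + rng.normal(size=n)

# SIS-style marginal screen: keep the d coordinates most correlated with y.
d = 50
keep = np.argsort(-np.abs(X.T @ y))[:d]

fit = LassoCV(cv=5).fit(X[:, keep], y)
selected = keep[np.flatnonzero(fit.coef_)]
print("selected columns:", np.sort(selected)[:10])
```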
 
10. Hypothesis Testing
- Testing for Independence of Large Dimensional Vectors
 
11. Non-smooth Estimation and Inference
- A General Bahadur Representation of M-estimators and Its Application to Linear Regression with Nonstochastic Designs (He and Shao, AoS 1996)
 
12. Post-Selection Inference
- Exact Post-Selection Inference, with Application to the Lasso
 
- Principled Statistical Inference in Data Science
 
13. Probability
- Optimal Estimation of a Large-dimensional Covariance Matrix under Stein’s Loss
 
14. Reinforcement Learning/Adaptive Inference
- Doubly-Robust Lasso Bandit
 
- Statistical Inference for Online Decision Making: In a Contextual Bandit Setting
 
- Statistical Inference for Online Decision Making via Stochastic Gradient Descent
 
- Statistical Inference with M-Estimators on Adaptively Collected Data (Zhang, Janson, and Murphy, NeurIPS 2021)
 
- Confidence Intervals for Policy Evaluation in Adaptive Experiments (Zhan, Wager and Athey, PNAS 2021)
 
- Inference for Batched Bandits (Zhang, Janson, and Murphy, NeurIPS 2020)
 
- Near-optimal Inference in Adaptive Linear Regression (Khamaru and Wainwright, 2021) 
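
A common starting point for the adaptive-inference papers above is that naive sample means are biased on adaptively collected data, while inverse-propensity weighting with the known logging probabilities restores unbiasedness. The toy epsilon-greedy sketch below (NumPy only) illustrates the contrast; the normal-approximation interval and all constants are illustrative assumptions, not the adaptively weighted or M-estimation procedures of the listed papers.

```python
# Epsilon-greedy data collection, then an IPW estimate of arm 1's mean reward.
import numpy as np

rng = np.random.default_rng(7)
true_means, eps, T = np.array([0.3, 0.5]), 0.1, 5000
counts, sums = np.zeros(2), np.zeros(2)
ipw_terms = []

for t in range(T):
    greedy = int(np.argmax(sums / np.maximum(counts, 1)))   # current greedy arm
    probs = np.full(2, eps / 2)
    probs[greedy] += 1 - eps                                 # behavior-policy probabilities
    arm = rng.choice(2, p=probs)
    reward = rng.normal(true_means[arm], 1.0)
    counts[arm] += 1
    sums[arm] += reward
    ipw_terms.append(reward * (arm == 1) / probs[1])         # unbiased term for arm 1's mean

ipw = np.array(ipw_terms)
est, se = ipw.mean(), ipw.std(ddof=1) / np.sqrt(T)
print(f"naive mean arm 1: {sums[1] / counts[1]:.3f}   IPW: {est:.3f} +/- {1.96 * se:.3f}")
```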
 
15. Semi-supervised Inference
- Semi-supervised Inference: General Theory and Estimation of Means
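
A minimal sketch of the semi-supervised mean-estimation idea: correct the labeled-sample mean of Y with a regression fit evaluated on the pooled covariates. NumPy only; the linear working model, coefficients, and sample sizes are illustrative assumptions.

```python
# Regression-adjusted mean estimation using a large unlabeled covariate sample.
import numpy as np

rng = np.random.default_rng(9)
n_lab, n_unlab = 200, 20000
X_lab = rng.normal(size=(n_lab, 3))
y_lab = X_lab @ np.array([1.0, 0.5, -1.0]) + 2.0 + rng.normal(size=n_lab)
X_unlab = rng.normal(size=(n_unlab, 3))                 # unlabeled covariates

# Fit Y ~ X on labeled data, then shift from the labeled to the pooled covariate mean.
Z = np.column_stack([np.ones(n_lab), X_lab])
coef = np.linalg.lstsq(Z, y_lab, rcond=None)[0]
X_all_mean = np.vstack([X_lab, X_unlab]).mean(0)
theta_ss = y_lab.mean() + coef[1:] @ (X_all_mean - X_lab.mean(0))
print(f"labeled-only mean: {y_lab.mean():.3f}   semi-supervised: {theta_ss:.3f}")
```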
 
16. SGD Inference
- Statistical Inference for Model Parameters in Stochastic Gradient Descent
 
- Online Bootstrap Confidence Interval for the Stochastic Gradient Descent Estimator
 
- Bridging the Gap between Constant Step Size Stochastic Gradient Descent and Markov Chains
 
- On Constructing Confidence Region for Model Parameters in Stochastic Gradient Descent via Batch Means
 
- A Fully Online Approach for Covariance Matrices Estimation of Stochastic Gradient Descent Solutions
 
- Uncertainty Quantification for Online Learning and Stochastic Approximation via Hierarchical Incremental Gradient Descent
 
- Acceleration of Stochastic Approximation by Averaging
 
- Statistical Inference for the Population Landscape via Moment Adjusted Stochastic Gradients
 
- Asymptotic and Finite-sample Properties of Estimators Based on Stochastic Gradients
 
- On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration
 
- ROOT-SGD: Sharp Nonasymptotics and Asymptotic Efficiency in a Single Algorithm
 
- Asymptotic Optimality in Stochastic Optimization
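
A minimal sketch tying several of the entries above together: Polyak-Ruppert averaged SGD for linear regression, followed by a plug-in sandwich estimate of the limiting covariance A^{-1} S A^{-1} for a coordinate-wise confidence interval. NumPy only; the step-size schedule and the offline (rather than fully online) covariance estimate are illustrative simplifications, not any single paper's procedure.

```python
# Averaged SGD for linear regression with a plug-in sandwich covariance estimate.
import numpy as np

rng = np.random.default_rng(8)
n, d = 20000, 5
beta = np.array([1.0, -0.5, 0.0, 2.0, 0.3])
X = rng.normal(size=(n, d))
y = X @ beta + rng.normal(size=n)

theta, theta_bar = np.zeros(d), np.zeros(d)
for t in range(n):
    g = (X[t] @ theta - y[t]) * X[t]              # stochastic gradient of the squared loss / 2
    theta -= 0.2 * (t + 1) ** -0.6 * g            # Robbins-Monro step, gamma_t = c * t^{-0.6}
    theta_bar += (theta - theta_bar) / (t + 1)    # running Polyak-Ruppert average

# Plug-in sandwich: A = E[xx'], S = E[eps^2 xx'], estimated from the full sample.
A = X.T @ X / n
resid = y - X @ theta_bar
S = (X * resid[:, None] ** 2).T @ X / n
cov = np.linalg.inv(A) @ S @ np.linalg.inv(A) / n
se = np.sqrt(np.diag(cov))
print("95% CI for beta_1:", theta_bar[0] - 1.96 * se[0], theta_bar[0] + 1.96 * se[0])
```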