Traditional Statistics
1. Clustering
- Hidden Integrality of SDP Relaxation for Sub-Gaussian Mixture Models
- Exponential Error Rates of SDP for Block Models: Beyond Grothendieck's Inequality
2. Concentration Inequalities
- Sum-of-Squares Lower Bounds for Sparse PCA
- The Masked Sample Covariance Estimator: An Analysis Using Matrix Concentration Inequalities
3. Causal Inference
- Hitchcock, Christopher, "Causal Models", The Stanford Encyclopedia of Philosophy
- Identification of Causal Effects using Instrumental Variables
- Implications of Confounding for Making Intervention Decisions Using Data Mining
- The Blessings of Multiple Causes
- Causal Inference in Statistics: An Overview
- A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data
- Recursive Partitioning for Heterogeneous Causal Effects
- Distinguishing Cause from Effect Using Observational Data: Methods and Benchmarks
- Double Machine Learning for Treatment and Causal Parameters
- Double/Debiased Machine Learning for Treatment and Structural Parameters
- Double/Debiased/Neyman Machine Learning of Treatment Effects
- Approximate Residual Balancing: Debiased Inference of Average Treatment Effects in High Dimensions
- Conformal Inference of Counterfactuals and Individual Treatment Effects
- Identification, Inference and Sensitivity Analysis for Causal Mediation Effects
- A General Approach to Causal Mediation Analysis
- Mediation Analysis with Multiple Mediators
- Unpacking the Black Box of Causality
- Challenges Raised by Mediation Analysis in a High-Dimension Setting
- Testing Mediation Effects in High-Dimensional Epigenetic Studies
- Estimating and Testing High-dimensional Mediation Effects in Epigenetic Studies
- A Comparison of Methods to Test Mediation and Other Intervening Variable Effects
- Joint Significance Tests for Mediation Effects of Socioeconomic Adversity on Adiposity via Epigenetics
- Genome-wide Analyses of Sparse Mediation Effects Under Composite Null Hypotheses
- Sparse Principal Component-based High-dimensional Mediation Analysis
- Hypothesis Test of Mediation Effect in Causal Mediation Model with High-Dimensional Continuous Mediators
5. Differential Privacy
- Deep Learning with Gaussian Differential Privacy
- Deep Learning with Differential Privacy
- Oracle Efficient Private Non-Convex Optimization
- Smooth Sensitivity and Sampling in Private Data Analysis
- On Differentially Private Stochastic Convex Optimization with Heavy-tailed Data
- Characterizing Private Clipped Gradient Descent on Convex Generalized Linear Problems
- The Cost of Privacy in Generalized Linear Models: Algorithms and Minimax Lower Bounds
- Private Stochastic Convex Optimization with Optimal Rates
- Preserving Statistical Validity in Adaptive Data Analysis
- Private Stochastic Non-Convex Optimization: Adaptive Algorithms and Tighter Generalization Bounds
- Differentially Private Empirical Risk Minimization with Non-convex Loss Functions
- Understanding Gradient Clipping in Private SGD: A Geometric Perspective
- Private PAC Learning Implies Finite Littlestone Dimension
6. Distributed Inference
- Distributed Inference for PCA
- Communication-Efficient Distributed Statistical Inference
- Communication-Efficient Algorithms for Statistical Optimization
- Distributed Estimation of Principal Eigenspaces
- Communication-Efficient Algorithms for Distributed Stochastic Principal Component Analysis
- Communication Efficient Distributed Optimization using an Approximate Newton-type Method
- Communication-Efficient Accurate Statistical Estimation
- A Massive Data Framework for M-Estimators with Cubic-Rate
7. False Discovery Rate
- Black Box FDR
8. General Inference
- A Scalable Bootstrap for Massive Data
- A Galtonian Perspective on Shrinkage Estimators
- Predictive Inference with the Jackknife+
- Conformalized Quantile Regression
- Conformal Prediction Under Covariate Shift
9. High Dimensional Linear Regression
- Regression Shrinkage and Selection via the Lasso
- Nonparametric Instrumental Regression
- On Asymptotically Optimal Confidence Regions and Tests for High-dimensional Models
- Confidence Intervals for Low Dimensional Parameters in High Dimensional Linear Models
- Sure Independence Screening for Ultrahigh Dimensional Feature Space
10. Hypothesis Testing
- Testing for Independence of Large Dimensional Vectors
11. Non-smooth Estimation and Inference
- A General Bahadur Representation of M-estimators and Its Application to Linear Regression with Nonstochastic Designs (He and Shao, AoS 1996)
12. Post-Selection Inference
- Exact Post-Selection Inference, with Application to the Lasso
- Principled Statistical Inference in Data Science
13. Probability
- Optimal Estimation of a Large-dimensional Covariance Matrix under Stein’s Loss
14. Reinforcement Learning/Adaptive Inference
- Doubly-Robust Lasso Bandit
- Statistical Inference for Online Decision Making: In a Contextual Bandit Setting
- Statistical Inference for Online Decision Making via Stochastic Gradient Descent
- Statistical Inference with M-Estimators on Adaptively Collected Data (Zhang, Janson, and Murphy, NeurIPS 2021)
- Confidence Intervals for Policy Evaluation in Adaptive Experiments (Hadad, Hirshberg, Zhan, Wager, and Athey, PNAS 2021)
- Inference for Batched Bandits (Zhang, Janson, and Murphy, NeurIPS 2020)
- Near-optimal Inference in Adaptive Linear Regression (Khamaru and Wainwright, 2021)
15. Semi-supervised Inference
- Semi-supervised Inference: General Theory and Estimation of Means
16. SGD Inference
- Statistical Inference for Model Parameters in Stochastic Gradient Descent
- Online Bootstrap Confidence Interval for the Stochastic Gradient Descent Estimator
- Bridging the Gap between Constant Step Size Stochastic Gradient Descent and Markov Chains
- On Constructing Confidence Region for Model Parameters in Stochastic Gradient Descent via Batch Means
- A Fully Online Approach for Covariance Matrices Estimation of Stochastic Gradient Descent Solutions
- Uncertainty Quantification for Online Learning and Stochastic Approximation via Hierarchical Incremental Gradient Descent
- Acceleration of Stochastic Approximation by Averaging
- Statistical Inference for the Population Landscape via Moment-Adjusted Stochastic Gradients
- Asymptotic and Finite-sample Properties of Estimators Based on Stochastic Gradients
- On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration
- ROOT-SGD: Sharp Nonasymptotics and Asymptotic Efficiency in a Single Algorithm
- Asymptotic Optimality in Stochastic Optimization