Statistical Approaches to Understanding Modern ML Methods

Aug 2-4, 2021
University of Wisconsin–Madison

When we use modern machine learning (ML) systems, the output often consists of a trained model with good performance on a test dataset. This satisfies some of our goals in performing data analysis, but leaves many unaddressed — for instance, we may want to build an understanding of the underlying phenomena, to provide uncertainty quantification about our conclusions, or to enforce constraints of safety, fairness, robustness, or privacy. As an example, classical statistical methods for quantifying a model’s variance rely on strong assumptions about the model — assumptions that can be difficult or impossible to verify for complex modern ML systems such as neural networks. 

This workshop will focus on using statistical methods to understand, characterize, and design ML models — for instance, methods that probe “black-box” ML models (with few to no assumptions) to assess their statistical properties, or tools for developing likelihood-free and simulation-based inference. Central themes of the workshop may include:

  • Using the output of a ML system to perform statistical inference, compute prediction intervals, or quantify measures of uncertainty

  • Using ML systems to test for conditional independence

  • Extracting interpretable information such as feature importance or causal relationships

  • Integrating likelihood-free inference with ML

  • Developing mechanisms for enforcing privacy, robustness, or stability constraints on the output of ML systems

  • Exploring connections to transfer learning and domain adaptation

  • Automated tuning of hyperparameters in black-box models and derivative-free optimization

Participants

Osbert Bastani
Osbert Bastani
PAC prediction sets under distribution shift
Avrim Blum
Avrim Blum
Recovering from biased data: Can fairness constraints improve accuracy
Kamalika Chaudhuri
Kamalika Chaudhuri
Statistical challenges in adversarially robust machine learning
Kyle Cranmer
Kyle Cranmer
Simulation-based inference: recent progress and open questions
Peng Ding
Peng Ding
Model-assisted analyses of cluster-randomized experiments
Dylan Foster
Dylan Foster
From predictions to decisions: A black-box approach to the contextual bandit problem
Zaid Harchaoui
Zaid Harchaoui
The statistical trade-offs of generative modeling with deep neural networks
Lucas Janson
Lucas Janson
Floodgate: Inference for variable importance with machine learning
Edward Kennedy
Edward Kennedy
Optimal doubly robust estimation of heterogeneous causal effects
Lihua Lei
Lihua Lei
Conformalized survival analysis
Sharon Li
Sharon Li
Uncovering the unknowns of deep neural networks: Challenges and opportunities
Po-Ling Loh
Po-Ling Loh
Robust W-GAN-based estimation under Wasserstein contamination
Aaditya Ramdas
Aaditya Ramdas
A quick tour of distribution-free post-hoc calibration
Hanie Sedghi
Hanie Sedghi
The deep bootstrap framework: Good online learners are good offline generalizers
Ryan Tibshirani
Ryan Tibshirani
Discrete splines: Another look at trend filtering and related problems
Vladimir Vovk
Vladimir Vovk
Conformal prediction, testing, and robustness
Yao Xie
Yao Xie
Conformal prediction intervals for dynamic time series

Schedule

Slides

All slides for the workshop are included in the gallery below. Click on a poster to see it in full screen.