IFDS co-Principal Investigator and Allen School Associate Professor Kevin Jamieson has been honored with the 2026 Test of Time Award from the International Conference on Artificial Intelligence and Statistics (AISTATS) for his paper “Non-stochastic Best Arm Identification and Hyperparameter Optimization,” co-authored with Ameet Talwalkar (Carnegie Mellon University). Jamieson received the award and delivered the accompanying invited talk at AISTATS 2026 on April 29.
The AISTATS Test of Time Award recognizes a paper, published at the conference roughly a decade earlier, whose ideas have had unusually durable impact on the field. The 2026 award honors the 2016 paper that introduced a rigorous theoretical foundation for early-stopping methods in hyperparameter tuning—an approach that has since become standard practice across the machine learning community.
In 2016, the dominant view of hyperparameter tuning treated each candidate configuration as a black box: pick a setting, train the model to convergence, observe the final loss, and use a Bayesian optimization surrogate to decide what to try next. Jamieson and Talwalkar argued that this view discards information that is freely available during training. Bad configurations tend to look bad early; good ones tend to pull ahead within a fraction of the total compute budget. If the algorithm is willing to abandon stragglers, the same wall-clock budget can explore many more configurations.
The technical contribution was to take a known algorithm from the stochastic multi-armed bandit literature—Successive Halving (Karnin, Koren, and Somekh, 2013)—and re-analyze it under a fundamentally different assumption: deterministic, monotonically improving learning curves with unknown convergence envelopes, rather than i.i.d. noise. The same algorithm, a very different proof. The resulting sample complexity guarantees gave practitioners theoretical justification for what some had been doing heuristically. “I’m honored by AISTATS for this recognition, and grateful to my collaborators who shaped this line of work,” Jamieson said.
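To make the idea concrete, here is a minimal Successive Halving sketch in Python. The `step` interface and the even per-round budget split are illustrative assumptions, not the paper's exact formulation; the core mechanic is the one described above: train all survivors a little, rank them, and discard the worst-performing fraction.

```python
import math

def successive_halving(configs, budget, eta=2):
    """Sketch of Successive Halving over partially trained models.

    `configs` is a list of objects exposing a hypothetical `.step(n)`
    method that trains for n more iterations and returns the current
    validation loss (lower is better). `budget` is the total number of
    training iterations spent across all configurations.
    """
    survivors = list(configs)
    rounds = max(1, math.ceil(math.log(len(configs), eta)))
    per_round = budget // rounds
    while len(survivors) > 1:
        # Split this round's budget evenly among the survivors...
        steps = max(1, per_round // len(survivors))
        scored = sorted((c.step(steps), i, c) for i, c in enumerate(survivors))
        # ...then keep only the best 1/eta fraction for the next round.
        survivors = [c for _, _, c in scored[: max(1, len(survivors) // eta)]]
    return survivors[0]
```

Note that nothing here assumes i.i.d. noise: the analysis in the paper instead assumes each configuration's learning curve is a deterministic, improving sequence whose convergence envelope is unknown, which is why the proof had to differ from the stochastic bandit one.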
The 2016 paper was the seed of a broader research program. The follow-up, “Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization” (Li, Jamieson, DeSalvo, Rostamizadeh, and Talwalkar, JMLR 2018), removed the most awkward tuning knob in Successive Halving—when to make the first cut—by running several brackets in parallel at different aggressiveness levels. A few years later, “A System for Massively Parallel Hyperparameter Tuning” (Li, Jamieson, Rostamizadeh, Gonina, Hardt, Recht, and Talwalkar, MLSys 2020) introduced ASHA, an asynchronous variant designed for distributed clusters that scales linearly to hundreds of workers.
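The bracket structure Hyperband layers on top of Successive Halving can be sketched as a resource schedule. The hypothetical helper below enumerates, for each bracket, the rungs of (configurations alive, resource per configuration) that Successive Halving would run inside that bracket, with more aggressive brackets starting many configurations on little resource and conservative ones training a few configurations to the full budget; it sketches only the schedule, not a full tuner.

```python
def hyperband_brackets(max_resource, eta=3):
    """Sketch of Hyperband's bracket schedule (hypothetical helper).

    For each bracket s (most to least aggressive), list the rungs
    (configs_alive, resource_per_config) that Successive Halving runs
    inside that bracket. `max_resource` is the most resource a single
    configuration may receive (e.g. epochs); eta is the halving rate.
    """
    # s_max = floor(log_eta(max_resource)), computed with integers only
    s_max, r = 0, max_resource
    while r >= eta:
        r //= eta
        s_max += 1

    schedule = {}
    for s in range(s_max, -1, -1):
        # Bracket s starts n configs at a small initial resource r0,
        # chosen so every bracket spends roughly the same total budget.
        n = -(-(s_max + 1) * eta**s // (s + 1))  # ceiling division
        r0 = max_resource // eta**s
        schedule[s] = [(n // eta**i, r0 * eta**i) for i in range(s + 1)]
    return schedule
```

With `max_resource=81` and `eta=3`, for example, this yields five brackets, ranging from 81 configurations trained on one unit of resource each down to 5 configurations trained on the full budget; running several such brackets side by side is what removes the "when to make the first cut" knob.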

