Title: Instance Dependent Sample Complexity Bounds for Interactive Learning
Abstract: The sample complexity of an interactive learning problem, such as multi-armed bandits or reinforcement learning, is the number of interactions with nature required to output an answer (e.g., a recommended arm or policy) that is approximately close to optimal with high probability. While minimax guarantees can be useful rules of thumb to gauge the difficulty of a problem class, algorithms optimized for this worst-case metric often fail to adapt to “easy” instances where fewer samples suffice. In this talk, I will highlight some of my group’s work on algorithms that obtain optimal, finite time, instance dependent sample complexities that scale with the true difficulty of the particular instance, versus just the worst-case. In particular, I will describe a unifying experimental design based approach used to obtain such algorithms for best-arm identification for linear bandits, contextual bandits with arbitrary policy classes, and smooth losses for linear dynamical systems.
Kevin’s website: https://homes.cs.washington.edu/~jamieson/about.html
Zoom link: contact organizer
(The talk will not be recorded, we hope you can join us live!)