Speaker Bio: Adit is currently the George F. Carrier Postdoctoral Fellow in the School of Engineering and Applied Sciences at Harvard. He completed his Ph.D. in electrical engineering and computer science (EECS) at MIT, where he was advised by Caroline Uhler, and was a Ph.D. fellow at the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard. His research focuses on advancing the theoretical foundations of machine learning and developing new methods for tackling biomedical problems.
Abstract: Understanding how neural networks learn features, i.e., the patterns in data relevant for prediction, is necessary for their reliable use in technological and scientific applications. We propose a unifying mechanism that characterizes feature learning in neural network architectures. Namely, we show that features learned by neural networks are captured by a statistical operator known as the average gradient outer product (AGOP). Empirically, we show that the AGOP captures features across a broad class of architectures, including convolutional networks and large language models. Moreover, we use the AGOP to enable feature learning in general machine learning models through an algorithm we call the Recursive Feature Machine (RFM). We show that RFM automatically identifies sparse subsets of features relevant for prediction and explicitly connects feature learning in neural networks with classical algorithms for sparse recovery and low-rank matrix factorization. Overall, this line of work advances our fundamental understanding of how neural networks extract features from data, leading to novel, interpretable, and effective models for scientific applications.
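
For readers who want a concrete picture: given a predictor f and inputs x_1, ..., x_n, the AGOP is the matrix (1/n) * sum_i grad f(x_i) grad f(x_i)^T, and RFM alternates between fitting a predictor and re-weighting the input space by that predictor's AGOP. The sketch below is an illustrative reconstruction under assumed choices, not the speaker's implementation: it pairs kernel ridge regression with a Mahalanobis Laplace kernel and uses a finite-difference gradient estimate, with made-up defaults for the bandwidth, regularization, and iteration count.

import numpy as np

def laplace_kernel(X, Z, M, bandwidth=10.0):
    # Laplace kernel with a learned Mahalanobis metric M:
    # k(x, z) = exp(-||x - z||_M / bandwidth), where ||v||_M = sqrt(v^T M v).
    XM, ZM = X @ M, Z @ M
    sq = (np.sum(XM * X, axis=1)[:, None]
          - 2.0 * XM @ Z.T
          + np.sum(ZM * Z, axis=1)[None, :])
    return np.exp(-np.sqrt(np.clip(sq, 0.0, None)) / bandwidth)

def agop(f, X, eps=1e-4):
    # Average gradient outer product: (1/n) * sum_i grad f(x_i) grad f(x_i)^T,
    # estimated here with central finite differences for simplicity.
    n, d = X.shape
    grads = np.zeros((n, d))
    for j in range(d):
        e = np.zeros(d)
        e[j] = eps
        grads[:, j] = (f(X + e) - f(X - e)) / (2.0 * eps)
    return grads.T @ grads / n

def rfm(X, y, n_iters=5, reg=1e-3, bandwidth=10.0):
    # RFM-style iteration: alternate between fitting a kernel ridge predictor
    # under the current metric M and replacing M with that predictor's AGOP.
    # X: (n, d) inputs; y: (n,) targets.
    n, d = X.shape
    M = np.eye(d)
    alpha = None
    for _ in range(n_iters):
        K = laplace_kernel(X, X, M, bandwidth)
        alpha = np.linalg.solve(K + reg * np.eye(n), y)
        predict = lambda Z, M=M, a=alpha: laplace_kernel(Z, X, M, bandwidth) @ a
        M = agop(predict, X)
    return M, alpha

In this sketch, the learned matrix M plays the role of the feature matrix: eigenvectors of M with large eigenvalues indicate the (often sparse) combinations of input coordinates the model treats as relevant for prediction. Practical implementations typically also normalize M or use a matrix power of it between iterations, a detail omitted here.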