BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//IFDS - ECPv6.0.1.1//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://ifds.info
X-WR-CALDESC:Events for IFDS
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20220313T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20221106T090000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20230312T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20231105T090000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20240310T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20241103T090000
END:STANDARD
END:VTIMEZONE
BEGIN:VTIMEZONE
TZID:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:20240310T080000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:20241103T070000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20221209T133000
DTEND;TZID=America/Los_Angeles:20221209T133000
DTSTAMP:20260407T052814Z
CREATED:20221018T165647Z
LAST-MODIFIED:20221018T170200Z
UID:2322-1670592600-1670592600@ifds.info
SUMMARY:MLOpt:
DESCRIPTION:
URL:https://ifds.info/event/mlopt-6/
LOCATION:University of Washington\, Seattle\, 185 E Stevens Way NE\, Seattle\, WA\, 98195-2350\, United States
CATEGORIES:MLOpt@UWash
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20221216T133000
DTEND;TZID=America/Los_Angeles:20221216T133000
DTSTAMP:20260407T052814Z
CREATED:20221018T165647Z
LAST-MODIFIED:20221018T170221Z
UID:2323-1671197400-1671197400@ifds.info
SUMMARY:MLOpt:
DESCRIPTION:
URL:https://ifds.info/event/mlopt-7/
LOCATION:University of Washington\, Seattle\, 185 E Stevens Way NE\, Seattle\, WA\, 98195-2350\, United States
CATEGORIES:MLOpt@UWash
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240126T133000
DTEND;TZID=America/Los_Angeles:20240126T143000
DTSTAMP:20260407T052814Z
CREATED:20240318T212134Z
LAST-MODIFIED:20240318T212230Z
UID:2879-1706275800-1706279400@ifds.info
SUMMARY:How do neural networks learn features from data?
DESCRIPTION:Speaker Bio: Adit is currently the George F. Carrier Postdoctoral Fellow in the School of Engineering and Applied Sciences at Harvard. He completed his Ph.D. in electrical engineering and computer science (EECS) at MIT advised by Caroline Uhler and was a Ph.D. fellow at the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard. His research focuses on advancing theoretical foundations of machine learning and developing new methods for tackling biomedical problems. \n\n\n\nAbstract: Understanding how neural networks learn features\, or relevant patterns in data\, for prediction is necessary for their reliable use in technological and scientific applications. We propose a unifying mechanism that characterizes feature learning in neural network architectures. Namely\, we show that features learned by neural networks are captured by a statistical operator known as the average gradient outer product (AGOP). Empirically\, we show that the AGOP captures features across a broad class of network architectures including convolutional networks and large language models. Moreover\, we use AGOP to enable feature learning in general machine learning models through an algorithm we call Recursive Feature Machine (RFM). We show that RFM automatically identifies sparse subsets of features relevant for prediction and explicitly connects feature learning in neural networks with classical sparse recovery and low rank matrix factorization algorithms. Overall\, this line of work advances our fundamental understanding of how neural networks extract features from data\, leading to the development of novel\, interpretable\, and effective models for use in scientific applications.
URL:https://ifds.info/event/how-do-neural-networks-learn-features-from-data/
LOCATION:CSE (Allen) 403
CATEGORIES:MLOpt@UWash
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240209T133000
DTEND;TZID=America/Los_Angeles:20240209T143000
DTSTAMP:20260407T052814Z
CREATED:20240318T212503Z
LAST-MODIFIED:20240318T212503Z
UID:2882-1707485400-1707489000@ifds.info
SUMMARY:Policy Optimization with Compatible Mirror Approximation
DESCRIPTION:Speaker Bio: Zhihan is a fourth-year PhD student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington\, advised by Prof. Maryam Fazel. His research interests are broadly in statistics\, optimization and machine learning. \n\n\nAbstract: We propose Compatible Mirror Policy Optimization (CoMPO)\, a framework that incorporates general function approximation into policy mirror descent methods. In contrast to the popular approach of using the $L_2$ norm to measure function approximation errors (regardless of the mirror map)\, CoMPO uses the Bregman divergence induced by the specific mirror map for policy projection. Such compatibility bridges the gap between theory and practice: not only does it achieve fast linear convergence with general function approximation\, but it also includes several well-known practical methods as special cases\, immediately providing them with strong convergence guarantees.
URL:https://ifds.info/event/policy-optimization-with-compatible-mirror-approximation/
LOCATION:Zoom
CATEGORIES:MLOpt@UWash
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240216T133000
DTEND;TZID=America/Los_Angeles:20240216T143000
DTSTAMP:20260407T052814Z
CREATED:20240318T212625Z
LAST-MODIFIED:20240318T212625Z
UID:2884-1708090200-1708093800@ifds.info
SUMMARY:Offline Multi-task Transfer RL with Representational Penalization
DESCRIPTION:Speaker Bio: Avinandan is a second-year PhD student\, advised by Maryam Fazel and Lillian Ratliff. His interests are in sequential learning and game theory. \n\n\nAbstract: We study the problem of representational transfer in offline Reinforcement Learning (RL)\, where a learner has access to episodic data from a number of source tasks collected a priori\, and aims to learn a shared representation to be used in finding a good policy for a target task. Unlike in online RL where the agent interacts with the environment while learning a policy\, in the offline setting there cannot be such interactions in either the source tasks or the target task; thus multi-task offline RL can suffer from incomplete coverage. We propose an algorithm to compute pointwise uncertainty measures for the learnt representation\, and establish a data-dependent upper bound for the suboptimality of the learnt policy for the target task. Our algorithm leverages the collective exploration done by source tasks to mitigate poor coverage at some points by a few tasks\, thus overcoming the limitation of existing offline algorithms\, which need uniformly good coverage for meaningful transfer. We complement our theoretical results with empirical evaluation on a rich-observation MDP which requires many samples for complete coverage. Our findings illustrate the benefits of penalizing and quantifying the uncertainty in the learnt representation.
URL:https://ifds.info/event/offline-multi-task-transfer-rl-with-representational-penalization/
LOCATION:CSE (Allen) 403
CATEGORIES:MLOpt@UWash
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240223T133000
DTEND;TZID=America/Los_Angeles:20240223T143000
DTSTAMP:20260407T052814Z
CREATED:20240318T212752Z
LAST-MODIFIED:20240318T212752Z
UID:2886-1708695000-1708698600@ifds.info
SUMMARY:GumbelSpec Sampling for Accelerating LLM Inference
DESCRIPTION:Bio: Tianxiao Shen is a postdoctoral scholar at the University of Washington\, working with Yejin Choi and Zaid Harchaoui. Her research interests lie in natural language processing and machine learning\, in particular developing models and algorithms for efficient\, accurate\, diverse\, flexible and controllable text generation. She received her PhD from MIT\, advised by Regina Barzilay and Tommi Jaakkola. Before that\, she did her undergrad at Tsinghua University. \n\n\n\n\nAbstract: We propose GumbelSpec sampling\, a novel algorithm that leverages smaller language models to accelerate inference of large language models without changing their output distribution. Central to our approach is the application of the Gumbel-Softmax technique to convert the stochastic decoding process into a deterministic process by integrating independently sampled Gumbel noise. Employing the same set of Gumbel noise\, we perform beam search on the smaller model to generate multiple candidate short continuations\, and then utilize tree-based attention to efficiently verify them in parallel using the larger model. GumbelSpec sampling significantly improves upon previous rejection sampling based speculative decoding methods by increasing the token acceptance rate by 1.7x-2.2x and achieving an additional speedup of 1.2x-1.5x. This results in a total speedup of 1.5x-2.6x compared to traditional autoregressive decoding.
URL:https://ifds.info/event/gumbelspec-sampling-for-accelerating-llm-inference/
LOCATION:CSE (Allen) 403
CATEGORIES:MLOpt@UWash
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Chicago:20240308T133000
DTEND;TZID=America/Chicago:20240308T143000
DTSTAMP:20260407T052814Z
CREATED:20240318T213102Z
LAST-MODIFIED:20240318T213102Z
UID:2890-1709904600-1709908200@ifds.info
SUMMARY:Low-Rank Structures in Optimal Transport
DESCRIPTION:Bio: Meyer Scetbon is currently a Research Scientist at Microsoft Research. He completed his PhD at Institut Polytechnique de Paris\, advised by M. Cuturi. As a visiting student\, he did his MS theses at UW and the Technion\, on kernel-based viewpoints on deep neural networks advised by Z. Harchaoui\, and on end-to-end signal and image denoising advised by M. Elad\, respectively. \n\n\n\n\n\nAbstract: Optimal transport (OT) plays an increasingly important role in machine learning (ML) to compare probability distributions. Yet\, it poses\, in its original form\, several challenges when used for applied problems: (i) computing OT between discrete distributions amounts to solving a large and expensive network flow problem which requires supercubic complexity in the number of points; (ii) estimating OT using sampled measures is doomed by the curse of dimensionality. These issues can be mitigated using an entropic regularization\, solved with the Sinkhorn algorithm\, which improves on both statistical and computational aspects. While much faster\, entropic OT still requires quadratic complexity with respect to the number of points and therefore remains prohibitive for large-scale problems. In this talk\, I will present new regularization approaches for the OT problem\, as well as its quadratic extension\, the Gromov-Wasserstein (GW) problem\, which impose low-rank structures on the admissible couplings. This results in the development of new algorithms that enjoy linear complexity both in time and memory with respect to the number of points\, enabling their application in the large-scale setting where millions of points need to be compared. Additionally\, I will show that these new regularization schemes have better statistical performance compared to the entropic approach\, that they naturally interpolate between the Maximum Mean Discrepancy (MMD) and OT\, and that they offer general clustering methods for arbitrary geometry. Website: <https://meyerscetbon.github.io/_pages/publications/>
URL:https://ifds.info/event/low-rank-structures-in-optimal-transport/
LOCATION:CSE (Allen) 403
CATEGORIES:MLOpt@UWash
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240315T133000
DTEND;TZID=America/Los_Angeles:20240315T143000
DTSTAMP:20260407T052814Z
CREATED:20240318T212943Z
LAST-MODIFIED:20240318T212943Z
UID:2888-1710509400-1710513000@ifds.info
SUMMARY:Optimized Decision Making via Active Learning of Stochastic Hamiltonians
DESCRIPTION:Speaker: Prof. Chandrajit Bajaj\, UT Austin \nAbstract: A Hamiltonian represents the energy of a dynamical system in phase space with coordinates of position and momentum. Hamilton’s equations of motion are obtainable as coupled symplectic differential equations. In this talk I shall show how optimized decision making (action sequences) can be obtained via a reinforcement learning problem wherein the agent interacts with the unknown environment to simultaneously learn a Hamiltonian surrogate and the optimal action sequences using Hamiltonian dynamics\, by invoking the Pontryagin Maximum Principle. We use optimal control theory to define an optimal control gradient flow\, which guides the reinforcement learning process of the agent to progressively optimize the Hamiltonian while simultaneously converging to the optimal action sequence. Extensions to stochastic Hamiltonians leading to stochastic action sequences and the free-energy principle shall also be discussed. This is joint work with Taemin Heo and Minh Nguyen.
URL:https://ifds.info/event/optimized-decision-making-via-active-learning-of-stochastic-hamiltonians/
LOCATION:CSE (Allen) 403
CATEGORIES:MLOpt@UWash
END:VEVENT
END:VCALENDAR