BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//IFDS - ECPv6.0.1.1//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:IFDS
X-ORIGINAL-URL:https://ifds.info
X-WR-CALDESC:Events for IFDS
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20240310T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20241103T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240209T133000
DTEND;TZID=America/Los_Angeles:20240209T143000
DTSTAMP:20260514T210331
CREATED:20240318T212503Z
LAST-MODIFIED:20240318T212503Z
UID:2882-1707485400-1707489000@ifds.info
SUMMARY:Policy Optimization with Compatible Mirror Approximation
DESCRIPTION:Speaker Bio: Zhihan is a fourth-year PhD student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington\, advised by Prof. Maryam Fazel. His research interests are broadly in statistics\, optimization\, and machine learning.\n\nAbstract: We propose Compatible Mirror Policy Optimization (CoMPO)\, a framework that incorporates general function approximation into policy mirror descent methods. In contrast to the popular approach of using the $L_2$ norm to measure function approximation errors (regardless of the mirror map)\, CoMPO uses the Bregman divergence induced by the specific mirror map for policy projection. This compatibility bridges the gap between theory and practice: not only does it achieve fast linear convergence with general function approximation\, but it also includes several well-known practical methods as special cases\, immediately providing them with strong convergence guarantees.
URL:https://ifds.info/event/policy-optimization-with-compatible-mirror-approximation/
LOCATION:Zoom
CATEGORIES:MLOpt@UWash
END:VEVENT
END:VCALENDAR