#### IFDS 2021 Summer School

July 26-30, 2021

#### Wisconsin funds 7 RAs for the Spring 21 Semester

The UW-Madison site of IFDS is funding several Research Assistants during Spring semester 2021 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.

#### Shi Chen

Shi Chen (Mathematics), advised by Qin Li (Mathematics) and Stephen J. Wright (Computer Science), works on problems in the interdisciplinary area of applied math and machine learning. He is interested in connecting machine learning to various mathematical physics problems, including the homogenization of PDEs and inverse problems. His recent work focuses on applying deep learning to PDE inverse problems.

#### Jeffrey Covington

Jeffrey Covington (Mathematics), advised by Nan Chen (Mathematics) and Sebastien Roch (Mathematics), works on data assimilation for model state estimation and prediction. His primary focus is on models with nonlinear and non-Gaussian features, which present problems for traditional data assimilation techniques. Currently he is working on developing techniques for Lagrangian data assimilation problems, which typically involve high-dimensionality and strong nonlinear interactions.

#### Changhun Jo

Changhun Jo (Mathematics), advised by Kangwook Lee (Electrical and Computer Engineering) and Sebastien Roch (Mathematics), is working on the theoretical understanding of machine learning. His recent work focuses on finding an optimal data poisoning algorithm against a fairness-aware learner. He also works on finding the fundamental limit on sample complexity of matrix completion in the presence of graph side information.

#### Liu Yang

Liu Yang (Computer Sciences), advised by Robert Nowak, Dimitris Papailiopoulos and Kangwook Lee (Electrical and Computer Engineering), works at the intersection of machine learning and deep learning. Currently, she is working on the streaming model selection problem under limited memory resources.

#### Xuezhou Zhang

Xuezhou Zhang (Computer Sciences), advised by Jerry Zhu (Computer Sciences) and Kevin Jamieson (U of Washington), works on adaptivity and robustness in sequential decision making. His recent work focuses on designing reinforcement learning frameworks that learn from diverse sources of teaching signals and learn robustly in the presence of data corruption.

#### Shashank Rajput

Shashank Rajput (Computer Science), advised by Dimitris Papailiopoulos (Electrical and Computer Engineering), Kangwook Lee (Electrical and Computer Engineering) and Stephen Wright (Computer Science), works on problems in distributed machine learning and optimization. He develops techniques that are fast and scalable enough for practical use while retaining theoretical guarantees. Currently, he is working on finding shuffling mechanisms that beat random reshuffling, as well as on algorithms that train neural networks by pruning alone.
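For context, the random-reshuffling baseline mentioned above is ordinary SGD that visits every sample exactly once per epoch in a freshly shuffled order, rather than sampling with replacement. A minimal sketch on a toy least-squares problem (illustrative only, not Rajput's method):

```python
import numpy as np

def sgd_random_reshuffling(X, y, epochs=30, lr=0.05, seed=0):
    """SGD with random reshuffling: one pass over a fresh permutation per epoch."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):         # new shuffle each epoch
            grad = (X[i] @ w - y[i]) * X[i]  # per-sample least-squares gradient
            w = w - lr * grad
    return w

# Toy demo on a noiseless least-squares problem.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w_hat = sgd_random_reshuffling(X, y)  # converges close to w_true
```

The research question alluded to in the profile is whether some other data ordering can provably converge faster than the `rng.permutation` step used here.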

#### Shuqi Yu

Shuqi Yu (Mathematics), advised by Sebastien Roch (Mathematics), works with Karl Rohe (Statistics) on large-scale network models. She aims to establish theoretical guarantees for a new estimator of the number of communities in a stochastic blockmodel. She is also interested in phylogenetics; in particular, she works on the identifiability of the species phylogeny under a horizontal gene transfer model.

#### Washington adds 6 RAs to Winter 2021 term

The University of Washington site of IFDS is funding six Research Assistants in Winter quarter 2021 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, who are members of IFDS.

#### Kristof Glauninger

Kristof Glauninger (Statistics) works with Zaid Harchaoui (Statistics), Virginia E. Armbrust (Oceanography) and François Ribalet (Oceanography) on statistical modeling for marine ecology. He focuses on statistical inference questions arising from phytoplankton population modeling. He is also interested in optimal transport and machine learning.

#### Alec Greaves-Tunnell

Alec Greaves-Tunnell (Statistics) works with Zaid Harchaoui (Statistics), Ali Shojaie (Biostatistics), and Azadeh Yazdan (Bioengineering) on distributionally robust learning for brain science and engineering. He is also interested in sequence models and time series in general, with applications to language processing and music analysis.

#### Adhyyan Narang

Adhyyan Narang (ECE) works with Maryam Fazel (ECE) and Lillian Ratliff (ECE). So far, he has worked to provide theoretical answers to foundational questions in learning from data, such as the generalization of overparameterized models and robustness to adversarial examples. More recently, he has been interested in providing guarantees for optimization in uncertain online environments in the presence of other agents.

#### Swati Padmanabhan

Swati Padmanabhan (ECE) works with Yin Tat Lee (Computer Science and Engineering) and Maryam Fazel (ECE) on designing a faster algorithm for the optimal design problem. She is also interested in semidefinite programming in general, and has developed semidefinite programming algorithms for several settings, each the fastest known in its setting.

#### Omid Sadeghi

Omid Sadeghi (ECE) works with Maryam Fazel as well as Lillian Ratliff (ECE). He is interested in the design and analysis of online optimization algorithms with budget constraints, and with stochastic or adversarial inputs. His work includes online resource allocation with submodular utility functions.

#### Zhihan Xiong

Zhihan Xiong (Computer Science and Engineering) works with Maryam Fazel (Electrical and Computer Engineering) and Kevin Jamieson (Computer Science and Engineering), as well as Lalit Jain (Business School). His current work addresses optimal experimental design in a streaming setting where the measurement budget is limited and the quality of measurements varies unpredictably over time.

#### Steve Wright and collaborators win NeurIPS Test of Time Award 2020

Wright and his coauthors received the award for their 2011 paper “Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent.” The paper proposed an alternative way to implement Stochastic Gradient Descent (SGD) without any locking of memory access, that “outperforms alternative schemes that use locking by an order of magnitude.” SGD is the algorithm that drives many machine learning systems.

See this [12-minute talk](https://youtu.be/c5T7600RLPc) by Chris Ré about the paper.
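The lock-free idea can be sketched in a few lines. In this toy example (an illustration of the Hogwild! scheme, not the authors' code), several threads apply SGD updates to a shared weight vector with no locking at all, relying on the updates rarely interfering:

```python
import threading
import numpy as np

def hogwild_sgd(X, y, n_threads=4, epochs=20, lr=0.01):
    """Lock-free parallel SGD for least squares, in the spirit of Hogwild!."""
    n, d = X.shape
    w = np.zeros(d)  # shared parameters: every thread updates this, no lock

    def worker(rows):
        for _ in range(epochs):
            for i in rows:
                grad = (X[i] @ w - y[i]) * X[i]   # grad of 0.5*(x_i.w - y_i)^2
                np.subtract(w, lr * grad, out=w)  # racy in-place update

    chunks = np.array_split(np.arange(n), n_threads)
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w

# Toy demo: recover w_true from noiseless linear measurements.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
w_true = np.array([2.0, -3.0])
y = X @ w_true
w_hat = hogwild_sgd(X, y)  # close to w_true despite the races
```

The paper's analysis shows that when data access is sparse, such unsynchronized updates still converge at nearly the serial rate, which is why dropping the locks wins by an order of magnitude in practice.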

#### Postdoctoral fellows at IFDS@UW-Madison

The IFDS at UW-Madison is seeking applicants for one or two postdoctoral positions. Successful candidates will conduct research with members of the IFDS, both at UW-Madison and its partner sites, and will be based at the Wisconsin Institute for Discovery at UW-Madison (https://wid.wisc.edu/). A unique benefit of this position is the rich set of collaborative opportunities available, allowing truly interdisciplinary training. Initial appointment is for one year, with the possibility of renewal. Travel funds will be provided. The ideal candidate will have a PhD in computer science, statistics, mathematics, engineering or a related field, with expertise in data science. Additional desirable qualities include the ability to work effectively both independently and in a team, good communication skills, and a record of interdisciplinary collaborations.

To apply: https://drive.google.com/file/d/1LgeZ7fZ1cg0icuohCGAWxZlz8T9Jj66M/view?usp=sharing

#### IFDS Postdoctoral Fellow Positions Now Available at UW

The newly-founded NSF Institute for Foundations of Data Science (IFDS) at the University of Washington (UW), Seattle, is seeking applications for one or more Postdoctoral Fellow positions. This is an ideal position for candidates interested in interdisciplinary research under the supervision of at least two faculty members of IFDS, which brings together researchers at the interface of mathematics, statistics, theoretical computer science, and electrical engineering. IFDS at UW operates in partnership with groups at the University of Wisconsin-Madison, the University of California at Santa Cruz, and the University of Chicago, and is supported by the NSF TRIPODS program. A unique benefit is the rich set of collaborative opportunities available, allowing truly interdisciplinary training. Initial appointment is for one year, with the possibility of renewal. Appropriate travel funds will be provided.

#### Cluster faculty positions at UW-Madison

The University of Wisconsin-Madison seeks to hire several faculty members with research and teaching interests in the foundations of data science. These positions are part of a campus-wide cluster initiative to expand and broaden expertise in the fundamentals of data science at UW-Madison. Three faculty positions will be hired in the cluster, which will build a world-class core of interdisciplinary strength in the mathematical, statistical, and algorithmic foundations of data science. This hiring initiative involves the Departments of Mathematics, Statistics, Computer Sciences, Electrical and Computer Engineering, Industrial and Systems Engineering, and the Wisconsin Institute for Discovery. It is anticipated that successful candidates will develop strong collaborations with existing data science research programs at UW-Madison — including IFDS — and with external research centers. To apply, follow the links below:

Mathematics: https://www.mathjobs.org/jobs/list/16747

Statistics: https://jobs.hr.wisc.edu/en-us/job/506886/assistant-professor-associate-professor-professor-in-statistics-cluster-hire

Engineering: https://jobs.hr.wisc.edu/en-us/job/506918/professor-cluster-hire

#### Wisconsin Adds Eight RAs for IFDS

The UW-Madison site of IFDS is funding eight Research Assistants during the Fall 2020 semester to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.

Read about the Fall 2020 RAs at https://ifds.info/fall-20-wisconsin-ras/

#### Changhun Jo

Changhun Jo (Mathematics), advised by Kangwook Lee (Electrical and Computer Engineering) and Sebastien Roch (Mathematics), is working on the theoretical understanding of machine learning. His recent work focuses on finding an optimal data poisoning algorithm against a fairness-aware learner. He also works on finding the fundamental limit on sample complexity of matrix completion in the presence of graph side information.

#### Lorenzo Najt

Lorenzo Najt (Mathematics), advised by Jordan Ellenberg (Math), investigates algorithmic and inferential questions related to ensemble analysis of gerrymandering. Related to this, he is interested in sampling connected graph partitions, and is working with Michael Newton (Statistics) to investigate applications to hypothesis testing with spatial data. He is also working on some questions about the fixed parameter tractability of polytope algorithms.

#### Yuetian Luo

Yuetian Luo is working with Anru Zhang (Statistics) and Yingyu Liang (Computer Science) on non-convex optimization problems involving low-rank matrices and on implicit regularization in non-convex optimization methods. This work develops novel fast algorithms, built on a new sketching scheme for high-dimensional problems, with provably faster convergence guarantees than common first-order methods. Another line of work investigates how implicit regularization depends on the choice of non-convex optimization method and initialization scheme.

#### Yingda Li

Yingda Li (Mathematics) is working with Nan Chen (Mathematics) and Sebastien Roch (Mathematics) on uncertainty quantification and data assimilation. She aims to develop statistically accurate algorithms for solving high-dimensional nonlinear and non-Gaussian dynamical systems. She also works on predicting complex nonlinear turbulent dynamical systems with imperfect models and insufficient training data.

#### Mehmet Furkan Demirel

Mehmet Furkan Demirel (Computer Sciences), advised by Yingyu Liang (Computer Sciences) and Dimitris Papailiopoulos (Electrical and Computer Engineering), focuses on molecule property prediction and construction of proper representations of molecules for learning algorithms. He also works on representation learning for graph-structured data and graph neural networks.

#### Cora Allen-Savietta

Cora Allen-Savietta (Statistics) works with Cécile Ané (Statistics, Botany) and Sebastien Roch (Mathematics) to develop efficient methods to infer evolutionary history and identify ancient hybridizations. As the challenge of this work is to reconstruct ancestry using only modern data, she explores the identifiability of a network’s topology from sequence data only at the leaves. She is also expanding tree rearrangements and existing search strategies to explore the large and discrete space of semi-directed networks. Her work has applications to understanding the history of eukaryotes, viruses, languages, and more.

#### Shashank Rajput

Shashank Rajput (Computer Science), advised by Dimitris Papailiopoulos (Electrical and Computer Engineering), Kangwook Lee (Electrical and Computer Engineering) and Stephen Wright (Computer Science), works on problems in distributed machine learning and optimization. He develops techniques that are fast and scalable enough for practical use while retaining theoretical guarantees. Currently, he is working on finding shuffling mechanisms that beat random reshuffling, as well as on algorithms that train neural networks by pruning alone.

#### Shuqi Yu

Shuqi Yu (Mathematics), advised by Sebastien Roch (Mathematics), works with Karl Rohe (Statistics) on large-scale network models. She aims to establish theoretical guarantees for a new estimator of the number of communities in a stochastic blockmodel. She is also interested in phylogenetics; in particular, she works on the identifiability of the species phylogeny under a horizontal gene transfer model.

#### Yin Tat Lee earns Packard Fellowship

Each year, the David and Lucile Packard Foundation bestows this prestigious recognition upon a small number of early-career scientists and engineers who are at the leading edge of their respective disciplines. Lee is among just 20 researchers nationwide — and one of only two in the Computer & Information Sciences category — to be chosen as members of the 2020 class of fellows.

Read the full article at: http://ads-institute.uw.edu/IFDS/news/2020/10/15/packard/

#### UW launches Institute for Foundations of Data Science

The University of Washington will lead a team of institutions in establishing an interdisciplinary research institute that brings together mathematicians, statisticians, computer scientists and engineers to develop the theoretical foundations of a fast-growing field: data science.

https://www.washington.edu/news/2020/09/01/uw-launches-institute-for-foundations-of-data-science/

#### UW–Madison to Continue Fundamental Data Science Research with Phase II Award from NSF

The Wisconsin Institute for Discovery is home to the Institute for the Foundations of Data Science, which has received Phase II funding from the National Science Foundation. https://wid.wisc.edu/uw-madison-to-continue-fundamental-data-science-research-with-phase-ii-award-from-nsf/

#### UChicago Joins Three Universities in Institute for Foundational Data Science

The Transdisciplinary Research In Principles Of Data Science (TRIPODS) program of the National Science Foundation supports this work by bridging mathematics, statistics, and theoretical computer science for research and training. In its first round of Phase II funding, the TRIPODS program awarded $12.5 million to the Institute for Foundations of Data Science (IFDS), a four-university collaboration among the Universities of Washington, Wisconsin-Madison, California Santa Cruz, and Chicago. https://computerscience.uchicago.edu/news/article/tripods-ifds/

#### New data science institute includes a focus on ethics and algorithms

UC Santa Cruz will join three other institutions to establish a transdisciplinary research institute bringing together mathematicians, statisticians, and theoretical computer scientists to develop the theoretical foundations of the fast-growing field of data science.

Read the full article at: https://news.ucsc.edu/2020/09/data-science-institute.html

#### NSF Tripods Phase II Award

The University of Washington (UW) will lead a team of institutions in establishing an interdisciplinary research institute that brings together mathematicians, statisticians, computer scientists and engineers to develop the theoretical foundations of a fast-growing field: data science.

The Institute for Foundations of Data Science (IFDS) is a collaboration between the UW and the Universities of Wisconsin-Madison, California Santa Cruz, and Chicago, with a mission to develop a principled approach to the analysis of ever-larger, more complex and potentially biased data sets that play an increasingly important role in industry, government and academia.

Support for the IFDS comes from a $12.5 million grant from the National Science Foundation and its Transdisciplinary Research in Principles of Data Science, or TRIPODS, program. Today, the NSF named IFDS as one of two institutes nationwide receiving the first TRIPODS Phase II awards. TRIPODS is tied to the NSF’s Harnessing the Data Revolution (HDR) program, which aims to accelerate discovery and innovation in data science algorithms, data cyberinfrastructure and education and workforce development.

The five-year funding plan for the IFDS Phase II includes support for new research projects, workshops, a partnership across the four research sites and students and postdoctoral scholars co-advised by faculty from different fields. Plans for education and outreach will draw on previous experience of IFDS members and leverage institutional resources at all four sites.

#### Maryam Fazel (PI) becomes first recipient of the Moorthy Family Inspiration Career Development Professorship

In recognition of her outstanding, innovative work as a researcher and educator, Maryam Fazel (IFDS PI) was recently named the inaugural recipient of the Moorthy Family Inspiration Career Development Professorship. This generous endowment was established in 2019 by Ganesh and Hema Moorthy for the purposes of recruiting, rewarding and retaining UW ECE faculty members who have demonstrated a significant amount of promise early on in their careers.
