IFDS members involved in new AI institute
UChicago and UW-Madison are part of the new NSF-Simons AI Institute for the Sky (SkAI), which has foundational components stemming from IFDS research.
PI Willett Receives SIAM Activity Group on Data Science Career Prize
IFDS PI Rebecca Willett was the recipient of the SIAM Activity Group on Data Science Career Prize at the 2024 SIAM Conference on Mathematics of Data Science (MDS24). Link: https://www.siam.org/publications/siam-news/articles/2024-october-prize-spotlight/
Andrew Lowy
Andrew Lowy (alowy@wisc.edu, https://sites.google.com/view/andrewlowy) joined the University of Wisconsin-Madison (UW-Madison) in September 2023 as an IFDS postdoc with Stephen J. Wright. Before joining IFDS at UW-Madison, he obtained his PhD in Applied Math at the University of Southern California under the supervision of Meisam Razaviyayn, where he was awarded the 2023 Center for Applied Mathematical Sciences (CAMS) Graduate Student Prize for excellence in research with a substantial mathematical component.
Andrew’s research interests lie in trustworthy machine learning and optimization, with a focus on privacy, fairness, and robustness. His main area of expertise is differentially private optimization for machine learning. Andrew’s work characterizes the fundamental statistical and computational limits of differentially private optimization problems that arise in modern machine learning. He also develops scalable optimization algorithms that attain these limits.
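A standard building block in this area is differentially private SGD. The sketch below (a toy least-squares example of our own, not Andrew’s specific algorithms) shows the two key ingredients: per-example gradient clipping and calibrated Gaussian noise.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip=1.0, sigma=1.0, rng=None):
    """One DP-SGD step on squared loss: clip each per-example gradient to
    norm <= clip, sum, add Gaussian noise scaled by sigma * clip, average."""
    rng = rng if rng is not None else np.random.default_rng(0)
    grads = [(w @ x - yi) * x for x, yi in zip(X, y)]  # per-example gradients
    clipped = [g * min(1.0, clip / (np.linalg.norm(g) + 1e-12)) for g in grads]
    noisy_sum = np.sum(clipped, axis=0) + sigma * clip * rng.standard_normal(w.shape)
    return w - lr * noisy_sum / len(X)

rng = np.random.default_rng(1)
X = rng.normal(size=(128, 5))
w_true = np.arange(5.0)
y = X @ w_true + 0.1 * rng.normal(size=128)

w = np.zeros(5)
for _ in range(300):
    w = dp_sgd_step(w, X, y, rng=rng)
print(np.round(w, 2))  # lands near w_true, up to the privacy noise
```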
IFDS at University of Washington Welcomes New Postdoctoral Fellows for 2024-2025
At the University of Washington, several new IFDS postdoctoral fellows have recently joined us. We are excited to welcome them to IFDS!
Natalie Frank
Washington
Mo Zhou
Washington
Libin Zhu
Washington
Natalie Frank
Natalie Frank (natalief@uw.edu, https://natalie-frank.github.io/) joined the University of Washington as a Pearson and IFDS Fellow in September 2024, working with Bamdad Hosseini and Maryam Fazel. Prior to joining the University of Washington, she obtained her Ph.D. in Mathematics at NYU, advised by Jonathan Niles-Weed.
Natalie researches the theory of adversarial learning using tools from analysis, probability, optimization, and PDEs. During her PhD, she focused on studying adversarial learning without model assumptions. Some topics she would like to explore at UW include leveraging convex optimization to enhance adversarial training, using these insights to perform distributionally robust learning for PDEs, and understanding how these topics interact with flatness of minimizers.
Mo Zhou
Mo Zhou (mozhou717@gmail.com, https://mozhou7.github.io) is an IFDS postdoc (started October 2024) at the University of Washington, working with Simon Du and Maryam Fazel. He received his Ph.D. in Computer Science from Duke University, advised by Rong Ge.
Mo’s research focuses on the mathematical foundations of machine learning, particularly optimization-related deep learning theory. His work analyzes the training dynamics of overparametrized models in the feature-learning regime to uncover underlying mechanisms. He also investigates phenomena observed in deep learning practice by theoretically analyzing simplified models and making predictions based on these insights.
Libin Zhu
Libin Zhu (libinzhu@uw.edu, https://libinzhu.github.io/) joined the University of Washington in July 2024. He is an IFDS postdoc scholar working with Dmitriy Drusvyatskiy and Maryam Fazel. He received his PhD in Computer Science from UCSD, advised by Mikhail Belkin.
Libin’s research focuses on the optimization and mathematical foundations of deep learning. He has worked on the dynamics of neural networks in the kernel and feature-learning regimes. He plans to investigate the mechanisms by which neural networks learn features and to develop machine learning models that can explicitly learn features.
Washington Adds 5 New RAs
At IFDS at the University of Washington, we are excited to introduce our cohort of 5 new research assistants in the fall 2024 quarter. Our RAs collaborate across disciplines on IFDS research. Each is advised by a primary and a secondary adviser, who are core members or affiliates of IFDS.
Weihang Xu
Washington
Qiwen Cui
Washington
Facheng Yu
Washington
Begoña García Malaxechebarría
Washington
Garrett Mulcahy
Washington
Weihang Xu
Weihang Xu (Computer Science and Engineering) works with Simon S. Du (Computer Science and Engineering) and Maryam Fazel (Electrical and Computer Engineering). He is interested in optimization theory and the physics of deep learning models. His current research focuses on the theory of neural networks, clustering algorithms, and the attention mechanism.
Qiwen Cui
Qiwen Cui (Computer Science and Engineering) works with Simon S. Du (Computer Science and Engineering) and Maryam Fazel (Electrical and Computer Engineering). He is interested in reinforcement learning theory, in particular the multi-agent setting. He is also interested in the role of RLHF in the performance of LLMs.
Facheng Yu
Facheng Yu (Statistics) works with Zaid Harchaoui (Statistics) and Alex Luedtke (Statistics) on learning theory and semi-parametric models. His current research focuses on orthogonal statistical learning and stochastic optimization. He is also interested in high-dimensional statistics and non-asymptotic statistical inference.
Begoña García Malaxechebarría
Begoña García Malaxechebarría (Mathematics) works with Dmitriy Drusvyatskiy (Mathematics) and Maryam Fazel (Electrical and Computer Engineering). She is interested in the optimization and mathematical foundations of deep learning. Her current research focuses on analyzing the scaling limits of stochastic algorithms for large-scale problems in data science.
Garrett Mulcahy
Garrett Mulcahy (Mathematics) works with Soumik Pal (Mathematics) and Zaid Harchaoui (Statistics) on problems at the intersection of optimal transport and machine learning. He is currently working on small-time approximations of entropically regularized optimal transport plans (i.e., Schrödinger bridges). Additionally, he is interested in the statistical estimation of optimal transport-related quantities.
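For readers unfamiliar with entropic regularization, here is a minimal Sinkhorn sketch of an entropically regularized optimal transport plan (an illustrative toy of ours, not Garrett’s work; `eps` is the regularization strength):

```python
import numpy as np

def sinkhorn(mu, nu, C, eps=0.5, n_iter=500):
    """Sinkhorn iterations for the entropy-regularized OT plan between
    discrete measures mu and nu under cost matrix C."""
    K = np.exp(-C / eps)                  # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)                # alternating marginal scalings
        u = mu / (K @ v)
    return u[:, None] * K * v[None, :]    # entropic transport plan

rng = np.random.default_rng(0)
x, y = rng.normal(0, 1, 8), rng.normal(2, 1, 8)
C = (x[:, None] - y[None, :]) ** 2        # squared-distance cost
P = sinkhorn(np.full(8, 1 / 8), np.full(8, 1 / 8), C)
print(np.round(P.sum(axis=1), 3))         # row marginals match mu = 1/8
```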
Tianxiao Shen
Tianxiao Shen is an IFDS postdoc scholar at the University of Washington, working with Yejin Choi and Zaid Harchaoui. Previously, she received her PhD from MIT, advised by Regina Barzilay and Tommi Jaakkola. Before that, she did her undergrad at Tsinghua University, where she was a member of the Yao Class.
Tianxiao has broad interests in natural language processing, machine learning, and deep learning. More specifically, she studies language models and develops algorithms to facilitate efficient, accurate, diverse, flexible, and controllable text generation.
Hanbaek Lyu paper featured in Nature Communications
Hanbaek Lyu, Assistant Professor of Mathematics at the University of Wisconsin–Madison and IFDS faculty, published an article in the January 3, 2023 issue of Nature Communications. His article, “Learning low-rank latent mesoscale structures in networks,” is summarized below.
Introduction:
We present a new approach to describing low-rank mesoscale structures in networks. We find that many real-world networks possess a small set of “latent motifs” that effectively approximate most subgraphs at a fixed mesoscale. Our work has applications in network comparison and network denoising.
Content:
Researchers in many fields use networks to encode interactions between entities in complex systems. To study the large-scale behavior of complex systems, it is useful to examine mesoscale structures in networks as building blocks that influence such behavior. In many studies of mesoscale structures, subgraph patterns (i.e., the connection patterns of small sets of nodes) have been studied as building blocks of network structure at various mesoscales. In particular, researchers often identify motifs as k-node subgraph patterns (where k is typically between 3 and 5) that occur unexpectedly more often than in a baseline network (one constructed through a random process). In the last two decades, the study of motifs has yielded insights into networked systems in many areas, including biology, sociology, and economics. However, not much is known about how to use such motifs (or related mesoscale structures), after their discovery, as building blocks to reconstruct a network.
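To make the baseline comparison concrete, here is a toy sketch (our illustration, not from the paper) that compares the triangle count of an observed network against degree-preserving random rewirings, a common choice of baseline:

```python
import networkx as nx
import numpy as np

def triangle_count(G):
    # nx.triangles counts each triangle once at each of its three nodes
    return sum(nx.triangles(G).values()) // 3

G = nx.karate_club_graph()
observed = triangle_count(G)

# degree-preserving null model via repeated double-edge swaps
baseline = []
for seed in range(20):
    H = G.copy()
    nx.double_edge_swap(H, nswap=10 * H.number_of_edges(),
                        max_tries=10**6, seed=seed)
    baseline.append(triangle_count(H))

print(f"observed triangles: {observed}, baseline mean: {np.mean(baseline):.1f}")
```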
Networks have low-rank subgraph structures
The figure above shows examples of 20-node subgraphs that include a path (in red) spanning the entire subgraph. These subgraphs are subsets of various real-world and synthetic networks. The depicted subgraphs have diverse connection patterns, which depend on the structure of the original network. One of our key findings is that, roughly speaking, these subgraphs follow specific patterns. Specifically, we find that many real-world networks possess a small set of latent motifs that effectively approximate most subgraphs at a fixed mesoscale. The figure below shows “network dictionaries” of 25 latent motifs for Facebook “friendship” networks from UCLA and Caltech. By using subgraph sampling and nonnegative matrix factorization, we are able to discover these latent motifs.
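The following minimal sketch (our reading of the pipeline, not the authors’ released code) samples path-induced k-node subgraphs and factorizes their adjacency patches with off-the-shelf NMF; the reshaped dictionary atoms play the role of latent motifs:

```python
import numpy as np
import networkx as nx
from sklearn.decomposition import NMF

def sample_path_patch(G, k, rng):
    """Grow a self-avoiding random walk to k nodes and return the induced
    k x k adjacency patch, or None if the walk gets stuck early."""
    walk = [rng.choice(list(G.nodes))]
    while len(walk) < k:
        nbrs = [v for v in G.neighbors(walk[-1]) if v not in walk]
        if not nbrs:
            return None
        walk.append(rng.choice(nbrs))
    return nx.to_numpy_array(G.subgraph(walk), nodelist=walk)

def learn_latent_motifs(G, k=6, r=9, n_samples=500, seed=0):
    """Stack flattened patches into a data matrix and factorize with NMF;
    each dictionary atom, reshaped to k x k, acts as a latent motif."""
    rng = np.random.default_rng(seed)
    patches = []
    while len(patches) < n_samples:
        patch = sample_path_patch(G, k, rng)
        if patch is not None:
            patches.append(patch.ravel())
    X = np.array(patches)                          # shape (n_samples, k*k)
    model = NMF(n_components=r, init="nndsvda", max_iter=500)
    model.fit(X)
    return model.components_.reshape(r, k, k)      # r latent motifs

G = nx.karate_club_graph()
motifs = learn_latent_motifs(G)
print(motifs.shape)                                # (9, 6, 6)
```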
The ability to encode and reconstruct networks using a small set of latent motifs has many applications in network analysis, including network comparison, network denoising, and edge inference. Such low-rank mesoscale structures allow one to reconstruct networks by approximating subgraphs of a network using combinations of latent motifs. In the animation below, we demonstrate our network-reconstruction algorithm using latent motifs. First, we repeatedly sample a k-node subgraph by sampling a path of k nodes. We then use the latent motifs to approximate the sampled subgraph. (See panels a1–a3 in the figure below.) We then replace the original subgraph with the approximated subgraph. By doing this over and over, we gradually form a weighted reconstructed network. The weights indicate how confident we are that the associated edges of the reconstructed network are also edges of the original observed network.
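Continuing the sketch above (reusing `G`, `motifs`, and the imports from the previous block), the reconstruction loop might look like this: approximate each sampled patch with a nonnegative combination of motifs and average the approximations into edge weights.

```python
from scipy.optimize import nnls

def reconstruct(G, motifs, k=6, n_rounds=3000, seed=1):
    """Average motif-based approximations of sampled path patches back onto
    node pairs, yielding a weighted reconstructed network."""
    rng = np.random.default_rng(seed)
    nodes = list(G.nodes)
    idx = {v: i for i, v in enumerate(nodes)}
    D = motifs.reshape(len(motifs), -1).T          # (k*k, r) dictionary
    acc = np.zeros((len(nodes), len(nodes)))       # summed approximations
    cnt = np.zeros_like(acc)                       # visit counts
    for _ in range(n_rounds):
        walk = [rng.choice(nodes)]                 # self-avoiding walk of k nodes
        while len(walk) < k:
            nbrs = [v for v in G.neighbors(walk[-1]) if v not in walk]
            if not nbrs:
                break
            walk.append(rng.choice(nbrs))
        if len(walk) < k:
            continue
        patch = nx.to_numpy_array(G.subgraph(walk), nodelist=walk).ravel()
        coef, _ = nnls(D, patch)                   # nonnegative coefficients
        approx = (D @ coef).reshape(k, k)
        ii = [idx[v] for v in walk]
        acc[np.ix_(ii, ii)] += approx
        cnt[np.ix_(ii, ii)] += 1
    return np.divide(acc, cnt, out=np.zeros_like(acc), where=cnt > 0)

W = reconstruct(G, motifs)                         # weighted reconstruction
```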
Selecting appropriate latent motifs is essential for our approach. These latent motifs act as building blocks of a network; they help us recreate different parts of it. If we don’t pick an appropriate set of latent motifs, we cannot expect to accurately capture the structure of a network. This is analogous to using the correct puzzle pieces to assemble a picture. If we use the wrong pieces, we will assemble a picture that doesn’t match the original picture.
Motivating Application: Anomalous-subgraph detection
A common problem in network analysis is the detection of anomalous subgraphs of a network. The connection pattern of an anomalous subgraph distinguishes it from the rest of a network. This anomalous-subgraph-detection problem has numerous high-impact applications, including in security, finance, and healthcare.
A simple conceptual framework for anomalous-subgraph detection is the following: Learn “normal subgraph patterns” in an observed network and then detect subgraphs in the observed network that deviate significantly from them. We can turn this high-level idea into a concrete algorithm by using latent motifs and network reconstruction, as the figure above illustrates. From an observed network (panel a), which consists of the original network (panel b) and an anomalous subgraph (panel c), we compute latent motifs (panel d) that can successfully approximate the k-node subgraphs of the observed network. A key observation is that these latent motifs should also describe the normal subgraph patterns of the observed network. Reconstructing the observed network using its latent motifs yields a weighted network (panel e) in which edges with small positive weights deviate significantly from the normal subgraph patterns, which are captured by the latent motifs. Therefore, it is likely that these edges are anomalous. The suspicious edges (panel f) are the edges in the weighted reconstructed network whose positive weights fall below a threshold. One can determine the threshold using a small set of known true edges and known anomalous edges. The suspicious edges match well with the anomalous edges (panel c).
Our method can also be used for “link-prediction” problems, in which one seeks to figure out the most likely new edges given an observed network structure. The reasoning is similar to that for anomalous-subgraph detection. We learn latent motifs from an observed network and use them to reconstruct the network. The edges with the largest weights in the reconstructed network that were non-edges in the observed network are our predictions for the most likely new edges. As the figure below shows, our latent-motif approach is competitive with popular methods for anomalous-subgraph detection and link prediction.
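In code (continuing the same hypothetical sketch, with `W` the reconstructed weight matrix and `theta` a calibrated threshold), both tasks reduce to simple selections:

```python
theta = 0.2  # hypothetical threshold; the paper calibrates it on known edges
idx = {v: i for i, v in enumerate(G.nodes)}

# anomalous-edge detection: observed edges with small positive reconstructed weight
suspicious = [(u, v) for u, v in G.edges if 0 < W[idx[u], idx[v]] < theta]

# link prediction: non-edges of G ranked by reconstructed weight
top_candidates = sorted(nx.non_edges(G),
                        key=lambda e: W[idx[e[0]], idx[e[1]]],
                        reverse=True)[:10]
print(len(suspicious), top_candidates[:3])
```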
Conclusion
We introduced a mesoscale network structure, which we call latent motifs, that consists of k-node subgraphs that are building blocks of the connected k-node subgraphs of a network. By using combinations of latent motifs, we can approximate k-node subgraphs that are induced by uniformly random k-paths of a network. We also established algorithmically and theoretically that one can accurately approximate a network if one has a dictionary of latent motifs that can accurately approximate mesoscale structures of the network.
Our computational experiments demonstrate that latent motifs can have distinctive network structures and that various social, collaboration, and protein–protein interaction networks have low-rank mesoscale structures, in the sense that a few learned latent motifs are able to reconstruct, infer, and denoise the edges of a network. We hypothesize that such low-rank mesoscale structures are a common feature of networks beyond the examined networks.
IFDS Postdoctoral Fellow Positions Available at the University of Washington
Deadline Extended!
The NSF Institute for Foundations of Data Science (IFDS) at the University of Washington (UW), Seattle, is seeking applications for one or more Postdoctoral Fellow positions. This is an ideal position for candidates interested in interdisciplinary research under the supervision of at least two faculty members of IFDS, which brings together researchers at the interface of mathematics, statistics, theoretical computer science, and electrical engineering. A unique benefit is the rich set of collaborative opportunities available. IFDS at UW operates in partnership with groups at the University of Wisconsin-Madison, the University of California at Santa Cruz, and the University of Chicago, and is supported by the NSF TRIPODS program. Initial appointment is for one year, with the possibility of renewal. Appropriate travel funds will be provided.
The ideal candidate will have a PhD in computer science, statistics, mathematics, engineering or a related field, with expertise in machine learning and data science. Desirable qualities include the ability to work effectively both independently and in a team, good communication skills, and a record of interdisciplinary collaborations.
In their cover letter, applicants should make sure to indicate potential faculty mentors (primary and secondary) at UW IFDS (see here for UW core faculty that can serve as main mentors, and here for broader affiliate members).
Full consideration will be given to applications received by January 30, 2024. The expected start date is in summer 2024, but earlier start dates will be considered. Selected candidates will be invited to short virtual (Zoom) interviews. For questions regarding the position, please contact the IFDS director Prof. Maryam Fazel at mfazel@uw.edu.
The base salary for this full-time position ranges from $5,880 to $6,580 per month, commensurate with experience and qualifications, or as mandated by a U.S. Department of Labor prevailing wage determination.
Equal Employment Opportunity Statement
The University of Washington is an affirmative action and equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, creed, religion, national origin, sex, sexual orientation, marital status, pregnancy, genetic information, gender identity or expression, age, disability, or protected veteran status.
Commitment to Diversity
The University of Washington is committed to building diversity among its faculty, librarian, staff, and student communities, and articulates that commitment in the UW Diversity Blueprint (http://www.washington.edu/diversity/diversity-blueprint/). Additionally, the University’s Faculty Code recognizes faculty efforts in research, teaching and/or service that address diversity and equal opportunity as important contributions to a faculty member’s academic profile and responsibilities (https://www.washington.edu/admin/rules/policies/FCG/FCCH24.html#2432).
Karan Srivastava
Karan Srivastava (Mathematics), advised by Jordan Ellenberg (Mathematics), is interested in applying machine learning techniques to generate interpretable, generalizable data for gaining a deeper understanding of problems in pure mathematics. Currently, he is working on generating examples in additive combinatorics and convex geometry using deep reinforcement learning.
Joe Shenouda
Joe Shenouda (Electrical and Computer Engineering), advised by Rob Nowak (ECE), Kangwook Lee (ECE), and Stephen Wright (Computer Sciences), is interested in developing a theoretical understanding of deep learning. Currently, he is working toward precisely characterizing the benefits of depth in deep neural networks.
Alex Hayes
Alex Hayes (Statistics), advised by Keith Levin (Statistics) and Karl Rohe (Statistics), is interested in causal inference on networks. He is currently working on methods to estimate peer-to-peer contagion in noisily-observed networks.
Matthew Zurek
Matthew Zurek (Computer Sciences), advised by Yudong Chen (Computer Sciences) and Simon Du (Computer Science and Engineering), is interested in reinforcement learning theory. He is currently working on algorithms with improved instance-dependent sample complexities.
Jitian Zhao
Jitian Zhao (Statistics), advised by Karl Rohe (Statistics) and Fred Sala (Computer Sciences), is interested in graph analysis and non-Euclidean models. She is currently working on community detection and interpretation in the setting of large directed anonymous networks.
Thanasis Pittas
Thanasis Pittas (Computer Science), advised by Ilias Diakonikolas (Computer Science), is working on robust statistics. His aim is to design efficient algorithms that can tolerate a constant fraction of the data being corrupted. The significance of computational efficiency arises from the high-dimensional nature of datasets in modern applications. To complete our theoretical understanding, it is also imperative to study the inherent trade-offs between the computational efficiency and statistical performance of these algorithms.
William Powell
William Powell (Mathematics), advised by Hanbaek Lyu (Mathematics) and Qiaomin Xie (Industrial and Systems Engineering), is interested in stochastic optimization. His current work focuses on a variance-reduced optimization algorithm in the context of dependent data sampling.
Wisconsin Funds 8 RAs for Fall 2023
The UW-Madison site of IFDS is funding several Research Assistants during Fall 2023 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.
Karan Srivastava
Wisconsin
Joe Shenouda
Wisconsin
Alex Hayes
Wisconsin
Matthew Zurek
Wisconsin
Jitian Zhao
Wisconsin
Thanasis Pittas
Wisconsin
William Powell
Wisconsin
Ziqian Lin
Wisconsin
Ziqian Lin
Ziqian Lin (Computer Science), advised by Kangwook Lee (Electrical and Computer Engineering) and Hanbaek Lyu (Mathematics), works at the intersection of machine learning and deep learning. Currently, he is working on model compositions of frozen pre-trained models and on NLP topics, including understanding the phenomenon of in-context learning and watermarking LLMs.
U Washington IFDS faculty Dmitriy Drusvyatskiy receives the SIAM Activity Group on Optimization Best Paper Prize
We are pleased to announce that Dmitriy Drusvyatskiy, Professor of Mathematics, is a co-recipient of the SIAM Activity Group on Optimization Best Paper Prize, together with Damek Davis, for the paper “Stochastic model-based minimization of weakly convex functions.” This prize is awarded every three years to the author(s) of the most outstanding paper, as determined by the prize committee, on a topic in optimization published in the four calendar years preceding the award year. Congratulations Dima!
IFDS workshop brings together data science experts to explore ways of making algorithms that learn from data more robust and resilient
The workshop focused on “distributional robustness,” a promising framework and research area in data science aimed at addressing complex shifts and changes in the data fielded by automated devices and processes, such as the algorithms used in AI and machine learning. Read more…
IFDS Affiliates Publish New Works on Robustness and Optimization
Available online and free to download.
Optimization for Data Analysis
by Stephen Wright and Benjamin Recht
Published by Cambridge University Press
IFDS Welcomes Postdoctoral Affiliates for 2022-23
The ranks of IFDS Postdocs have grown remarkably in 2022-23 – an unprecedented 11 postdoctoral researchers are now affiliated with IFDS. Some have already been with us for a year, some have just joined. One (Lijun Ding) has moved from IFDS at Washington to IFDS at Wisconsin.
We take this opportunity to introduce the current cohort of postdocs and share a little information about each.
Andrew Lowy
Wisconsin
Natalie Frank
Washington
Mo Zhou
Washington
Libin Zhu
Washington
Tianxiao Shen
Washington
Jeongyeol Kwon
Wisconsin
Sushrut Karmalkar
Wisconsin
Ruhui Jin
Wisconsin
David Clancy
Wisconsin
Jake Soloff
Chicago
Jasper Lee
Wisconsin
Greg Canal
Wisconsin
Ahmet Alacaoglu
Wisconsin
Jeongyeol Kwon
Jeongyeol Kwon (jeongyeol.kwon@wisc.edu) joined UW-Madison in September 2022. He is doing a postdoc with Robert Nowak. He completed his PhD at The University of Texas at Austin in August 2022, advised by Constantine Caramanis.
Jeongyeol is broadly interested in theoretical aspects of machine learning and optimization. During his Ph.D., Jeongyeol focused on fundamental questions arising from statistical inference and sequential decision making in the presence of latent variables. In his earlier PhD years, he worked on the analysis of the Expectation-Maximization algorithm and showed its convergence and statistical optimality properties. More recently, he has been involved in reinforcement learning theory with partial observations, inspired by real-world examples. He plans to enlarge his scope to more diverse research topics, including stochastic optimization and more practical approaches for RL and other related problems.
Sushrut Karmalkar
Sushrut Karmalkar joined UW-Madison as a postdoc in September 2021, mentored by Ilias Diakonikolas in the Department of Computer Sciences and supported by the 2021 CI Fellowship. He completed his Ph.D. at The University of Texas at Austin, advised by Prof. Adam Klivans.
During his Ph.D., Sushrut worked on various aspects of the theory of machine learning, including algorithmic robust statistics, theoretical guarantees for neural networks, and solving inverse problems via generative models.
During the first year of his postdoc, he focused on understanding the problem of sparse mean estimation in the presence of various extremely aggressive noise models. In the coming year, he plans to work on getting stronger lower bounds for these problems, as well as on improved guarantees for slightly weaker noise models.
Sushrut is on the job market this year (2022-2023)!
Ruhui Jin
Ruhui Jin (rjin@math.wisc.edu) joined UW-Madison in August 2022. She is a Van Vleck postdoc in the Department of Mathematics, hosted by Qin Li. She is partly supported by IFDS. She completed her PhD in mathematics from UT-Austin in 2022, advised by Rachel Ward.
Her research is in the mathematics of data science. Specifically, her thesis focused on dimensionality reduction for tensor-structured data. Currently, she is broadly interested in data-driven methods for learning complex systems.
David Clancy
David Clancy (dclancy@math.wisc.edu) joined UW-Madison in August of 2022. He is doing a postdoc with Hanbaek Lyu and Sebastien Roch in Mathematics and is partly supported by IFDS. He completed his PhD in mathematics at the University of Washington in Seattle under the supervision of Soumik Pal. During graduate school, he worked on problems related to the metric-measure space structure of sparse random graphs with small surplus as their size grows large. He plans to investigate a wider range of topics related to the component structure of random graphs with an underlying community structure, as well as inference problems on trees and graphs.
Jake Soloff
Jake is a postdoctoral researcher working with Rina Foygel Barber and Rebecca Willett in the Department of Statistics at the University of Chicago. Previously, he obtained his PhD from the Department of Statistics at UC Berkeley, co-advised by Aditya Guntuboyina and Michael I. Jordan. Jake received his ScB in mathematics from Brown University in 2016.
Ross Boczar
Ross Boczar (www.rossboczar.com, rjboczar@uw.edu) is a Postdoctoral Scholar at the University of Washington, associated with the Department of Electrical and Computer Engineering, the Institute for Foundations of Data Science, and the eScience Institute. He is advised by Prof. Maryam Fazel and Prof. Lillian J. Ratliff. His research interests include control theory, statistical learning theory, optimization, and other areas of applied mathematics. Recently, he has been exploring adversarial learning scenarios where a group of colluding users can learn and then exploit a firm’s deployed ML classifier.
Stephen Mussmann
Stephen Mussmann (somussmann@gmail.com) is an IFDS postdoc (started September 2021) at the University of Washington working with Ludwig Schmidt and Kevin Jamieson. Steve received his Ph.D. in Computer Science from Stanford University, advised by Percy Liang.
Steve researches data selection for machine learning, including work in subareas such as active learning, adaptive data collection, and data subset selection. During his Ph.D., Steve primarily focused on active learning: how to effectively select maximally informative data to collect or annotate. During his postdoc, Steve has worked on data pruning: choosing a small subset of a dataset such that models trained on that subset enjoy the same performance as models trained on the full dataset.
Steve is applying for academic research jobs this year (2022-2023)!
IFDS Workshop on Distributional Robustness in Data Science
August 4-6, 2022
University of Washington, Seattle, WA
A number of domain applications of data science, machine learning, mathematical optimization, and control have underscored the importance of the assumptions made about data-generating mechanisms, their changes, and their biases. Distributional robustness has emerged as one promising framework to address some of these challenges. This topical workshop, under the auspices of the Institute for Foundations of Data Science, an NSF TRIPODS institute, will survey the mathematical, statistical, and algorithmic foundations as well as recent advances at the frontiers of this research area. The workshop features invited talks as well as shorter talks by junior researchers, and a social event to foster further discussions.
Central themes of the workshop include:
- Risk measures and distributional robustness for decision making
- Distributional shifts in real-world domain applications
- Optimization algorithms for distributionally robust machine learning
- Distributionally robust imitation learning and reinforcement learning
- Learning theoretic statistical guarantees for distributional robustness
Ben Teo
Ben’s research interests are in statistical phylogenetics. He is working with Prof. Cecile Ane.
Thanasis Pittas
Thanasis is a PhD student in the Computer Sciences Department at the University of Wisconsin–Madison, advised by Prof. Ilias Diakonikolas. He works on theoretical machine learning and robust statistics.
Shubham Kumar Bharti
Shubham Kumar Bharti (Computer Science), advised by Jerry Zhu (Computer Science), and Kangwook Lee (Electrical and Computer Engineering), is interested in Reinforcement Learning, Fairness and Machine Teaching. His recent work focuses on fairness problems in sequential decision making. Currently, he is working on defenses against trojan attacks in Reinforcement Learning.
Shuyao Li
Shuyao Li (Computer Science), advised by Stephen Wright (Computer Science), Jelena Diakonikolas (Computer Science), and Ilias Diakonikolas (Computer Science), works on optimization and machine learning. Currently, he focuses on second-order guarantees for stochastic optimization algorithms in robust learning settings.
Jiaxin Hu
Jiaxin Hu (Statistics), advised by Miaoyan Wang (Statistics) and Jerry Zhu (Computer Science), works on statistical machine learning. Currently, she focuses on tensor/matrix data modeling and analysis with applications in neuroscience and social networks.
Max Hill
Max Hill (Mathematics), advised by Sebastien Roch (Mathematics) and Cecile Ane (Statistics), works in probability and mathematical phylogenetics. His recent work focuses on the impact of recombination in phylogenetic tree estimation.
Sijia Fang
Sijia Fang (Statistics), advised by Karl Rohe (Statistics) and Sebastien Roch (Mathematics), works on social networks and spectral analysis. More specifically, she is interested in hierarchical structures in social networks. She is also interested in phylogenetic tree and network recovery problems.
Zhiyan Ding
Zhiyan Ding (Mathematics), advised by Qin Li (Mathematics), works on applied and computational mathematics. More specifically, he uses PDE analysis tools to analyze machine learning algorithms, such as Bayesian sampling and over-parameterized neural networks. These PDE tools, including gradient-flow equations and mean-field analysis, help formulate machine (deep) learning algorithms as mathematical descriptions that are easier to handle.
Jasper Lee
Jasper Lee joined UW-Madison as a postdoc in August 2021, mentored by Ilias Diakonikolas in the Department of Computer Sciences and partly supported by IFDS. He completed his PhD at Brown University, working with Paul Valiant. His thesis work revisited and settled a basic problem in statistics: given samples from an unknown 1-dimensional probability distribution, what is the best way to estimate the mean of the distribution, with optimal finite-sample and high-probability guarantees? Perhaps surprisingly, the conventional method of taking the average of samples is sub-optimal, and Jasper’s thesis work provided the first provably optimal “sub-Gaussian” 1-dimensional mean estimator under minimal assumptions. Jasper’s current research focuses on revisiting other foundational statistical problems and also solving them to optimality. He is additionally pursuing directions in the adjacent area of algorithmic robust statistics.
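For intuition, here is a classical sub-Gaussian baseline that this line of work improves upon (median-of-means; our illustrative sketch, not Jasper’s estimator):

```python
import numpy as np

def median_of_means(x, n_blocks=10, seed=0):
    """Split the (shuffled) sample into blocks, average each block, and
    return the median of the block means."""
    rng = np.random.default_rng(seed)
    x = rng.permutation(x)
    blocks = np.array_split(x, n_blocks)
    return float(np.median([b.mean() for b in blocks]))

# heavy-tailed data, where the plain average has poor high-probability behavior
samples = np.random.default_rng(1).standard_t(df=3, size=2000)
print(f"sample mean:     {samples.mean():.3f}")
print(f"median of means: {median_of_means(samples):.3f}")
```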
Julian Katz-Samuels
Julian’s research focuses on designing practical machine learning algorithms that adaptively collect data to accelerate learning. His recent research interests include active learning, multi-armed bandits, black-box optimization, and out-of-distribution detection using deep neural networks. He is also very interested in machine learning applications that promote the social good.
Greg Canal
Greg Canal (gcanal@wisc.edu) joined UW-Madison in September 2021. He is doing a postdoc with Rob Nowak in Electrical and Computer Engineering, partly supported by IFDS. He completed his PhD at Georgia Tech in Electrical and Computer Engineering in 2021. For his thesis, he developed and analyzed new active machine learning algorithms inspired by feedback coding theory. During his first year as a postdoc he worked on a new approach for multi-user recommender systems, and as he continues into his second year he plans on exploring new active learning algorithms for deep neural networks.
Ahmet Alacaoglu
Ahmet Alacaoglu (alacaoglu@wisc.edu) joined UW–Madison in September 2021. He is doing a postdoc with Stephen J. Wright, partly supported by IFDS. He completed his PhD at EPFL, Switzerland in Computer and Communication Sciences in 2021. For his thesis, he developed and analyzed randomized first-order primal-dual algorithms for convex minimization and min-max optimization. During his postdoc, he is working on nonconvex stochastic optimization and investigating the connections between optimization theory and reinforcement learning.
Ahmet is on the academic job market (2022-2023)!
Lijun Ding
Lijun Ding (lijunbrianding@gmail.com) joined the University of Wisconsin-Madison (UW-Madison) in September 2022 as an IFDS postdoc with Stephen J. Wright. Before joining IFDS at UW-Madison, he was an IFDS postdoc with Dmitriy Drusvyatskiy and Maryam Fazel at the University of Washington. He obtained his Ph.D. in Operations Research at Cornell University, advised by Yudong Chen and Madeleine Udell.
Lijun’s research lies at the intersection of optimization, statistics, and data science. By exploring ideas and techniques in statistical learning theory, convex analysis, and projection-free optimization, he analyzes and designs efficient and scalable algorithms for classical semidefinite programming and modern nonconvex statistical problems. He also studies the interplay between model overparametrization, algorithmic regularization, and model generalization through the lens of matrix factorization. He plans to explore a wider range of topics at this intersection during his appointment at UW-Madison.
He is on the 2022 – 23 job market!
Yiding Chen
Yiding Chen (Computer Sciences), advised by Jerry Zhu (Computer Sciences) and collaborating with Po-Ling Loh (Electrical and Computer Engineering), is doing research on adversarial machine learning and robust machine learning. In particular, he focuses on robust learning in high dimensions. He is also interested in the intersection of adversarial machine learning and control theory.
Brandon Legried
Brandon Legried (Mathematics), advised by Sebastien Roch and working with Cecile Ane (Botany and Statistics), is working on mathematical statistics with applications to computational biology. He is studying evolutionary history, usually depicted with a tree. As an inference problem, it is interesting to view extant species information as observed data and to reconstruct tree topologies or the states of evolving features (such as a DNA sequence). He looks to achieve results that extend across many model assumptions. The mathematical challenges involved lie at the intersection of probability and optimization.
Ankit Pensia
Ankit Pensia (Computer Science), advised by Po-Ling Loh (Statistics) and Varun Jog (Electrical and Computer Engineering), is working on projects in robust machine learning. His focus is on designing statistically and computationally efficient estimators that perform well even when the training data itself is corrupted. Traditional algorithms don’t perform well in the presence of noise: they are either slow or incur large errors. As data today are high-dimensional and often corrupted, a better understanding of fast and robust algorithms would lead to better performance in scientific and practical applications.
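As a toy illustration of this gap (our example, not Ankit’s estimators), a few corrupted points can drag the sample mean arbitrarily far, while even a simple trimmed mean stays close to the truth:

```python
import numpy as np
from scipy.stats import trim_mean

rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, 950)       # true mean is 0
outliers = np.full(50, 100.0)           # 5% adversarial corruption
data = np.concatenate([clean, outliers])

print(f"sample mean:  {data.mean():.2f}")           # dragged toward the outliers
print(f"trimmed mean: {trim_mean(data, 0.1):.2f}")  # stays near 0
```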
Blake Mason
Blake Mason (Electrical and Computer Engineering), advised by Rob Nowak (Electrical and Computer Engineering) and Jordan Ellenberg (Mathematics), is investigating problems of learning from comparative data, such as ordinal comparisons and similarity/dissimilarity judgements. In particular, he is studying metric learning and clustering problems in this setting with applications to personalized education, active learning, and representation learning. Additionally, he studies approximate optimization techniques for extreme classification applications.
Yuchen Zeng
Yuchen Zeng (Computer Science), advised by Kangwook Lee (Electrical and Computer Engineering) and Stephen J. Wright (Computer Science), is interested in Trustworthy AI with a particular focus on fairness. Her recent work investigates training fair classifiers from decentralized data. Currently, she is working on developing a new fairness notion that considers dynamics.
Stephen Wright honored with a 2022 Hilldale Award
As director of the Institute for the Foundations of Data Science, Steve Wright is dedicated to the development and teaching of the complex computer algorithms that underlie much of modern society. The institute, which has been supported by more than $6 million in federal funds under Wright, brings together dozens of researchers across computer science, math, statistics and other departments to advance this interdisciplinary work. Read more…
Rodriguez elected as a 2022 ASA Fellow
We are pleased to announce that Abel Rodriguez, Professor of Statistics, has been elected as a Fellow of the American Statistical Association (ASA).
ASA Fellows are named for their outstanding contributions to statistical science, leadership, and advancement of the field. This year, awardees will be recognized at the upcoming Joint Statistical Meetings (JSM) during the ASA President’s Address and Awards Ceremony. Congratulations on this prestigious recognition!
Sebastien Roch named Fellow of Institute of Mathematical Statistics
Sebastien Roch has been named a Fellow of the Institute of Mathematical Statistics (IMS). IMS is the main scientific society for probability and the mathematical end of statistics. Among its various activities, IMS publishes the Annals series of flagship journals in probability and statistics.
PIMS-IFDS-NSF Summer School in Optimal Transport
The PIMS-IFDS-NSF Summer School on Optimal Transport is happening at the University of Washington, Seattle, from June 19 to July 1, 2022 (see the PIMS event webpage). Please note that the deadline to apply for funded accommodation for junior participants is Feb 15. Our main speakers are world leaders in the mathematics of optimal transport and its various fields of application.
Registration is now open through the PIMS event webpage. Main speakers:
- Alfred Galichon (NYU, USA)
- Inwon Kim (UCLA, USA)
- Jan Maas (IST, Austria)
- Felix Otto (Max Planck Institute, Germany)
- Gabriel Peyré (ENS, France)
- Geoffrey Schiebinger (UBC, Canada)
Details and abstracts can be found on the summer school webpage.
As part of this summer school, IFDS will host a Machine Learning (ML) Day, with a focus on ML applications of Optimal Transport, that will include talks and a panel discussion. Details will be announced here later.
Washington Announces Fall ’21 and Spring ’22 RAs
The University of Washington site of IFDS is funding several Research Assistants during Fall 2021 and Spring 2022 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.
Jillian Fisher
Jillian Fisher (Statistics) works with Zaid Harchaoui (Statistics) and Yejin Choi (Allen School for CSE) on dataset biases and generative modeling. She is also interested in nonparametric and semiparametric statistical inference.
Yifang Chen
Yifang Chen (CSE) is co-advised by Kevin Jamieson (CSE) and Simon Du (CSE). Yifang is an expert on algorithm design for active learning and multi-armed bandits in the presence of adversarial corruptions. Recently she has been working with Kevin and Simon on sample-efficient methods for multi-task learning.
Guanghao Ye
Guanghao Ye (CSE) works with Yin Tat Lee (CSE) and Dmitriy Drusvyatskiy (Mathematics) on designing faster algorithms for non-smooth non-convex optimization. He is interested in convex optimization. He developed the first nearly linear time algorithm for linear programs with small treewidth.
Josh Cutler
Josh Cutler (Mathematics) works with Dmitriy Drusvyatskiy (Mathematics) and Zaid Harchaoui (Statistics) on stochastic optimization for data science. His most recent work has focused on designing and analyzing algorithms for stochastic optimization problems with data distributions that evolve in time. Such problems appear routinely in machine learning and signal processing.
Romain Camilleri
Romain Camilleri (CSE) works with Kevin Jamieson (CSE), Maryam Fazel (ECE), and Lalit Jain (Foster School of Business). He is interested in robust high dimensional experimental design with adversarial arrivals. He also collaborates with IFDS student Zhihan Xiong (CSE).
Yue Sun
Yue Sun (ECE) works with Maryam Fazel (ECE) and Mehran Mesbahi (Aeronautics), as well as collaborator Prof. Samet Oymak (UC Riverside). Yue is currently interested in two topics: (1) understanding the role of over-parametrization in meta-learning, and (2) gradient-based policy update methods for control, which connect the fields of control theory and reinforcement learning.
Lang Liu
Lang Liu (Statistics) works with Zaid Harchaoui (Statistics) and Soumik Pal (Mathematics – Kantorovich Initiative) on statistical hypothesis testing and regularized optimal transport. He is also interested in statistical change detection and natural language processing.
IFDS-MADLab Workshop
Statistical Approaches to Understanding Modern ML Methods
Aug 2-4, 2021
University of Wisconsin–Madison
When we use modern machine learning (ML) systems, the output often consists of a trained model with good performance on a test dataset. This satisfies some of our goals in performing data analysis, but leaves many unaddressed — for instance, we may want to build an understanding of the underlying phenomena, to provide uncertainty quantification about our conclusions, or to enforce constraints of safety, fairness, robustness, or privacy. As an example, classical statistical methods for quantifying a model’s variance rely on strong assumptions about the model — assumptions that can be difficult or impossible to verify for complex modern ML systems such as neural networks.
This workshop will focus on using statistical methods to understand, characterize, and design ML models — for instance, methods that probe “black-box” ML models (with few to no assumptions) to assess their statistical properties, or tools for developing likelihood-free and simulation-based inference. Central themes of the workshop include conformal prediction, trade-offs in deep learning, robust learning, interpretation in black-box learning, and modern statistical methodologies.
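One of those themes, conformal prediction, is easy to sketch: wrap any black-box regressor with a held-out calibration set to obtain distribution-free prediction intervals. Below is a minimal split-conformal toy of ours, assuming a generic scikit-learn model:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(600, 1))
y = np.sin(3 * X[:, 0]) + 0.3 * rng.standard_normal(600)

# split: fit the black box on one half, calibrate residuals on the other
model = RandomForestRegressor(random_state=0).fit(X[:300], y[:300])
residuals = np.abs(y[300:] - model.predict(X[300:]))

alpha = 0.1  # target 90% coverage
n = len(residuals)
q = np.quantile(residuals, np.ceil((1 - alpha) * (n + 1)) / n)

x_new = np.array([[0.5]])
pred = model.predict(x_new)[0]
print(f"90% prediction interval: [{pred - q:.2f}, {pred + q:.2f}]")
```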
Organizers
Participants
Schedule
Morning | Conformal Prediction Methods | Speaker |
---|---|---|
9:15-9:30 | Welcome and Introduction | |
9:30-10:15 | Conformal prediction, testing, and robustness | Vladimir Vovk |
10:15-11:00 | Conformalized Survival Analysis | Lihua Lei |
11:00-11:15 | Break | |
11:15-12:00 | Conformal prediction intervals for dynamic time series | Yao Xie |
Afternoon | Challenges and Trade-offs in Deep Learning | Speaker |
2:00-2:45 | The Statistical Trade-offs of Generative Modeling with Deep Neural Networks | Zaid Harchaoui |
2:45-3:30 | The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers | Hanie Sedghi |
3:30-3:45 | Break | |
3:45-4:30 | Uncovering the Unknowns of Deep Neural Networks: Challenges and Opportunities | Sharon Li |
5:00-8:00 | Reception at Tripp Commons, Memorial Union |
Morning | Robust Learning | Speaker |
---|---|---|
9:30-10:15 | Optimal doubly robust estimation of heterogeneous causal effects | Edward Kennedy |
10:15-11:00 | Robust W-GAN-Based Estimation Under Wasserstein Contamination | Po-Ling Loh |
11:00-11:15 | Break | |
11:15-12:00 | PAC Prediction Sets Under Distribution Shift | Osbert Bastani |
Afternoon | Interpretation in Black-Box Learning | Speaker |
2:00-2:45 | Floodgate: Inference for Variable Importance with Machine Learning | Lucas Janson |
2:45-3:30 | From Predictions to Decisions: A Black-Box Approach to the Contextual Bandit Problem | Dylan Foster |
3:30-3:45 | Break | |
3:45-5:00 | **Lightning talks** by in-person participants | |
Searching for Synergy in High-Dimensional Antibiotic Combinations | Jennifer Brennan (Seattle) | |
Supervised tensor decomposition with features on multiple modes | Jiaxin Hu (Madison) | |
Latent Preference Matrix Estimation with Graph Side Information | Changhun Jo (Madison) | |
Nonconvex Factorization and Manifold Formulations are Almost Equivalent in Low-rank Matrix Optimization | Yuetian Luo (Madison) | |
Excess Capacity and Backdoor Poisoning | Naren Manoj (TTIC) | |
Risk bounds for regression and classification with structured feature maps | Andrew McRae (GaTech) | |
Robust regression with covariate filtering: Heavy tails and adversarial contamination | Ankit Pensia (Madison) | |
Derandomizing knockoffs | Zhimei Ren (Chicago) |
Morning | Modern Statistical Methodologies pt I | Speaker |
---|---|---|
9:30-10:15 | Discrete Splines: Another Look at Trend Filtering and Related Problems | Ryan Tibshirani |
10:15-11:00 | A quick tour of distribution-free post-hoc calibration | Aaditya Ramdas |
11:00-11:15 | Break | |
11:15-12:00 | Recovering from Biased Data: Can Fairness Constraints Improve Accuracy? | Avrim Blum |
Afternoon | Modern Statistical Methodologies pt II | Speaker |
2:00-2:45 | Model-assisted analyses of cluster-randomized experiments | Peng Ding |
2:45-3:30 | Simulation-based inference: recent progress and open questions | Kyle Cranmer |
3:30-3:45 | Break | |
3:45-4:30 | Statistical challenges in Adversarially Robust Machine Learning | Kamalika Chaudhuri |
Slides
All slides for the workshop are included in the gallery below. Click on a poster to see it in full screen.
IFDS 2021 Summer School
Schedule
(all times CDT)
Click speaker name for video
Monday, July 26th
Time | Speaker |
---|---|
10:30-11:30 | Karl Rohe |
1:00-2:00 | Karl Rohe |
2:30-3:30 | Jose Rodriguez |
4:00-5:00 | Jose Rodriguez |
5:15-6:30 | Poster Session |
Tuesday, July 27th
Time | Speaker |
---|---|
10:30-11:30 | Cécile Ané |
1:00-2:00 | Cécile Ané |
2:30-3:30 | Miaoyan Wang |
4:00-5:00 | Miaoyan Wang |
5:15-6:30 | Poster Session |
Wednesday, July 28th
Time | Speaker |
---|---|
10:30-11:30 | Dimitris Papailiopoulos |
1:00-2:00 | Dimitris Papailiopoulos |
2:30-3:30 | Sebastien Bubeck |
4:00-5:00 | Sebastien Bubeck |
Thursday, July 29th
Time | Speaker |
---|---|
10:30-11:30 | Kevin Jamieson |
1:00-2:00 | Kevin Jamieson |
2:30-3:30 | Ilias Diakonikolas |
4:00-5:00 | Ilias Diakonikolas |
Friday, July 30th
Time | Speaker |
---|---|
10:30-11:30 | Rina Foygel-Barber |
1:00-2:00 | Rina Foygel-Barber |
AI4All @ Washington
The UW instance of AI4ALL is an AI4ALL-sanctioned program run under the Taskar Center for Accessible Technology, an initiative by the UW Paul G. Allen School for Computer Science & Engineering. This year we were joined by faculty from IFDS at UW in teaching the summer cohort. More information at https://ai4all.cs.washington.edu/about-us/.
IFDS Summer School 2021: Registration Open!
The IFDS Summer School 2021, co-sponsored by MADLab, will take place on July 26-July 30, 2021. The summer school will introduce participants to a broad range of cutting-edge areas in modern data science with an eye towards fostering cross-disciplinary research. We have a fantastic lineup of lecturers:
* Cécile Ané (UW-Madison)
* Rina Foygel Barber (Chicago)
* Sebastien Bubeck (Microsoft Research)
* Ilias Diakonikolas (UW-Madison)
* Kevin Jamieson (Washington)
* Dimitris Papailiopoulos (UW-Madison)
* Jose Rodriguez (UW-Madison)
* Karl Rohe (UW-Madison)
* Miaoyan Wang (UW-Madison)
Registration for both virtual and in-person participation is now open. More details at: https://ifds.info/ifds-2021-summer-school/
Six Cluster Hires in Data Science at UW–Madison
IFDS is delighted to welcome six new faculty members at UW-Madison working in areas related to fundamental data science. All were hired as a result of a cluster hiring process proposed by the IFDS Phase 1 leadership at Wisconsin in 2017 and approved by UW-Madison leadership in 2019. Although the cluster hire was originally intended to fill only three faculty positions in the TRIPODS areas of statistical, mathematical, and computational foundations of data science, three extra positions funded through other faculty lines were filled during the hiring process.
One of the new faculty members, Ramya Vinayak, joined the ECE Department in 2020. The other five will join in Fall 2021. They are:
These faculty will bring new strengths to IFDS and UW-Madison. We’re excited that they are joining us, and look forward to working with them in the years ahead!
Research Highlights: Robustness meets Algorithms
IFDS Affiliate Ilias Diakonikolas and collaborators recently published a survey article titled “Robustness Meets Algorithms” in the Research Highlights section of the Communications of the ACM. The article presents a high-level description of groundbreaking work by the authors, which developed the first robust learning algorithms for high-dimensional unsupervised learning problems, including for example robustly estimating the mean and covariance of a high-dimensional Gaussian (first published in FOCS’16/SICOMP’19). This work resolved a long-standing open problem in statistics and at the same time was the beginning of a new area, now known as “algorithmic robust statistics”. Ilias is currently finishing a book on the topic with Daniel Kane to be published by Cambridge University Press.
TRIPODS PI meeting
A gathering of leaders of the institutes funded under NSF’s TRIPODS program since 2017
Venue: Zoom and gather.town.
Thu 6/10/21: 11:30-2 EDT and 3-5:30 EDT (5 hours) • Fri 6/11/21: 11:30-2 EDT and 3-5:30 EDT (5 hours)
Format
Panel Nominations:
If you wish to be a panelist in the “major challenges” panel, please submit a 200-word abstract to the meeting co-chairs by 5/20/21. (If we receive more nominations than there are slots, a selection process involving all currently-funded TRIPODS institutes will be conducted.)
Panelists in the “industry” panel should be industry affiliates of a currently funded TRIPODS institute, or possibly members of an institute with a joint academic-industry appointment. Please mail your suggestions to the organizers, with a few words on what topics the nominee could cover.
Schedule
(all times EDT)
Thursday 6/10/2021
Time (EDT) | Topic | Presenter |
11:30-12:00 | Introduction & NSF Presentation | Margaret Martonosi (CISE), Dawn Tilbury (ENG), Sean Jones (MPS) |
12:00-12:15 | Talk: IFDS (I) | Jelena Diakonikolas |
12:15-12:30 | Talk: FODSI (1) | Stefanie Jegelka |
12:30-12:45 | Talk: IDEAL | Ali Vakilian |
12:45-1:00 | Break | |
1:00-1:15 | Talk: GDSC | Aaron Wagner |
1:15-1:30 | Talk: Tufts Tripods | Misha Kilmer |
1:30-1:45 | Talk: Rutgers – DATA-INSPIRE | Konstantin Mischaikow |
1:45-2:00 | Talk: UMass | Patrick Flaherty |
2:00-3:00 | Lunch | |
3:00-4:00 | PANEL: Major Challenges in Fundamental Data Science | |
4:00-4:15 | Break | |
4:15-5:30 | POSTER SESSION | gather.town |
List of Institutes: https://nsf-tripods.org
Friday 6/11/2021
Time (EDT) | Topic | Presenter |
11:30-11:45 | Talk: JHU | |
11:45-12:00 | Talk: FODSI (II) | Nika Haghtalab |
12:00-12:15 | Talk: IFDS (II) | Kevin Jamieson |
12:15-12:30 | Talk: PIFODS | Hamed Hassani |
12:30-12:45 | Break | |
12:45-1:00 | Talk: TETRAPODS UC-Davis | Rishi Chaudhuri |
1:00-1:15 | Talk: UIC | Anastasios Sidiropoulos |
1:15-1:30 | Talk: D4 Iowa State | Pavan Aduri |
1:30-1:45 | Talk: UIUC | Semih Cayci |
1:45-2:00 | Talk: Duke | Sayan Mukherjee |
2:00-3:00 | Lunch | |
3:00-3:15 | Talk: UT-Austin | Sujay Sanghavi |
3:15-3:30 | Talk: FIDS | Simon Foucart |
3:30-3:45 | Talk: FINPenn | Alejandro Ribeiro |
3:45-4:45 | PANEL: Industry Adoption of Fundamental Data Science |
Rebecca Willett honored by SIAM
IFDS team analyzes COVID-19 spread in Dane and Milwaukee counties
A UW-Madison IFDS team (Nan Chen, Jordan Ellenberg, Xiao Hou, and Qin Li), together with Song Gao, Yuhao Kang, and Jingmeng Rao (UW-Madison, Geography), Kaiping Chen (UW-Madison, Life Sciences Communication), and Jonathan Patz (Global Health Institute), recently studied the COVID-19 spreading pattern and its correlation with business foot traffic, race and ethnicity, and age structure of subregions within Dane and Milwaukee counties. The results are published in Proceedings of the National Academy of Sciences of the United States of America (https://www.pnas.org/content/118/24/e2020524118).
The team developed a human mobility flow-augmented stochastic SEIR model. Combining the model with data assimilation and machine learning techniques, the team reconstructed the historical growth trajectories of COVID-19 infection in both counties. The results reveal different types of spatial heterogeneity (e.g., varying peak infection timing in different subregions) even within a county, suggesting that a regionalization-based policy (e.g., testing and vaccination resource allocation) is necessary to mitigate the spread of COVID-19 and to prepare for future epidemics.
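For readers unfamiliar with the modeling backbone, here is a minimal stochastic SEIR step (a generic toy of ours; the paper’s model additionally incorporates human-mobility flows and data assimilation):

```python
import numpy as np

def seir_step(S, E, I, R, beta, sigma, gamma, dt, rng):
    """One step of a stochastic SEIR model with Poisson transitions:
    S -> E at rate beta*S*I/N, E -> I at rate sigma*E, I -> R at rate gamma*I."""
    N = S + E + I + R
    new_E = min(S, rng.poisson(beta * S * I / N * dt))  # new exposures
    new_I = min(E, rng.poisson(sigma * E * dt))         # exposed -> infectious
    new_R = min(I, rng.poisson(gamma * I * dt))         # recoveries
    return S - new_E, E + new_E - new_I, I + new_I - new_R, R + new_R

rng = np.random.default_rng(0)
state = (99_000, 500, 500, 0)   # initial S, E, I, R
for _ in range(60):             # simulate 60 days
    state = seir_step(*state, beta=0.3, sigma=1 / 5, gamma=1 / 7, dt=1.0, rng=rng)
print("S, E, I, R after 60 days:", state)
```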
Data science workshops co-organized by IFDS members
IFDS members have been active leaders in the data science community over the past year, organizing workshops with participants reflecting all core TRIPODS disciplines, covering a broad range of career stages, and advancing the participation of women. IFDS executive committee member Rebecca Willett co-organized a workshop on the Multifaceted Complexity of Machine Learning at the NSF Institute of Mathematical and Statistical Innovation, which featured IFDS faculty Po-Ling Loh and Jelena Diakonikolas as speakers. With the support of the Institute for Advanced Study Women and Mathematics Ambassador program, IFDS postdoc Xiaoxia (Shirley) Wu organized the Women in Theoretical Machine Learning Symposium, which again featured IFDS faculty Po-Ling Loh and Jelena Diakonikolas. Finally, IFDS faculty Rebecca Willett and Mary Silber co-organized the Graduate Research Opportunities for Women 2020 conference, which is aimed at female-identified undergraduate students who may be interested in pursuing a graduate degree in the mathematical sciences. The conference is open to undergraduates from U.S. colleges and universities, including international students. IFDS faculty members Rina Barber and Rebecca Willett were featured speakers.
Wisconsin funds 8 Summer RAs
For the first time, IFDS @ Wisconsin is sponsoring a Summer RA program in 2021. As with RAs sponsored under the usual Fall and Spring semester programs, Summer 2021 RAs will be working with members of IFDS at Wisconsin and elsewhere on fundamental data science research. They will also be participating and assisting with IFDS’s activities during the summer, including the Summer School and Workshop to be held in Madison in July and August 2021.
The Summer 2021 RAs are:
IFDS team of RAs and Affiliates wins Best Student Paper Award from the American Statistical Association
Data Science Day 2021
May 14, 2021
8:45 am – 3:00 pm
Virtual Event via Zoom with Opportunities for Engagement
Washington adds 6 RAs for the Spring 2021 term
The University of Washington site of IFDS is funding new Research Assistants in the Spring 2021 term to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, who are members of IFDS.
Jillian Fisher
Jillian Fisher (Statistics) works with Zaid Harchaoui (Statistics) and Yejin Choi (Allen School for CSE) on dataset biases and generative modeling. She is also interested in nonparametric and semiparametric statistical inference.
Yifang Chen
Yifang Chen (CSE) is co-advised by Kevin Jamieson (CSE) and Simon Du (CSE). Yifang is an expert on algorithm design for active learning and multi-armed bandits in the presence of adversarial corruptions. Recently she has been working with Kevin and Simon on sample-efficient methods for multi-task learning.
Guanghao Ye
Guanghao Ye (CSE) works with Yin Tat Lee (CSE) and Dmitriy Drusvyatskiy (Mathematics) on designing faster algorithms for non-smooth, non-convex optimization. He is also interested in convex optimization; he developed the first nearly-linear-time algorithm for linear programs with small treewidth.
Josh Cutler
Josh Cutler (Mathematics) works with Dmitriy Drusvyatskiy (Mathematics) and Zaid Harchaoui (Statistics) on stochastic optimization for data science. His most recent work has focused on designing and analyzing algorithms for stochastic optimization problems with data distributions that evolve in time. Such problems appear routinely in machine learning and signal processing.
Romain Camilleri
Romain Camilleri (CSE) works with Kevin Jamieson (CSE), Maryam Fazel (ECE), and Lalit Jain (Foster School of Business). He is interested in robust high dimensional experimental design with adversarial arrivals. He also collaborates with IFDS student Zhihan Xiong (CSE).
Yue Sun
Yue Sun (ECE) works with Maryam Fazel (ECE) and Mehran Mesbahi (Aeronautics & Astronautics), as well as collaborator Prof. Samet Oymak (UC Riverside). Yue is currently interested in two topics: (1) understanding the role of over-parametrization in meta-learning, and (2) gradient-based policy update methods for control, which connect the fields of control theory and reinforcement learning.
Wisconsin funds 7 RAs for the Spring 2021 semester
The UW-Madison site of IFDS is funding several Research Assistants during Spring semester 2021 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.
Shi Chen
Shi Chen (Mathematics), advised by Qin Li (Mathematics) and Stephen J. Wright (Computer Science), works on problems in the interdisciplinary area of applied math and machine learning. He is interested in connecting machine learning to various mathematical physics problems, including the homogenization of PDEs and inverse problems. His recent work focuses on applying deep learning to PDE inverse problems.
Jeffrey Covington
Jeffrey Covington (Mathematics), advised by Nan Chen (Mathematics) and Sebastien Roch (Mathematics), works on data assimilation for model state estimation and prediction. His primary focus is on models with nonlinear and non-Gaussian features, which present problems for traditional data assimilation techniques. Currently he is developing techniques for Lagrangian data assimilation problems, which typically involve high dimensionality and strong nonlinear interactions.
Changhun Jo
Changhun Jo (Mathematics), advised by Kangwook Lee (Electrical and Computer Engineering) and Sebastien Roch (Mathematics), is working on the theoretical understanding of machine learning. His recent work focuses on finding an optimal data poisoning algorithm against a fairness-aware learner. He also works on finding the fundamental limit on sample complexity of matrix completion in the presence of graph side information.
Liu Yang
Liu Yang (Computer Sciences), advised by Robert Nowak, Dimitris Papailiopoulos, and Kangwook Lee (Electrical and Computer Engineering), works at the intersection of machine learning and deep learning. Currently, she is working on the streaming model selection problem under limited memory resources.
Xuezhou Zhang
Xuezhou Zhang (Computer Sciences), advised by Jerry Zhu (Computer Sciences) and Kevin Jamieson (University of Washington), works on adaptivity and robustness in sequential decision making. His recent work focuses on designing reinforcement learning frameworks that learn from diverse sources of teaching signals and learn robustly in the presence of data corruption.
Shashank Rajput
Shashank Rajput (Computer Science), advised by Dimitris Papailiopoulos (Electrical and Computer Engineering), Kangwook Lee (Electrical and Computer Engineering) and Stephen Wright (Computer Science), works on problems in distributed machine learning and optimization. He develops techniques that are fast and scalable enough for practical use while retaining theoretical guarantees. Currently, he is working on finding shuffling mechanisms that beat random reshuffling (a sketch of that baseline follows the last profile in this list), as well as on algorithms that train neural networks by only pruning them.
Shuqi Yu
Shuqi Yu (Mathematics) is advised by Sebastien Roch (Mathematics) and works with Karl Rohe (Statistics) on large-scale network models. She aims to establish theoretical guarantees for a new estimator of the number of communities in a stochastic blockmodel. She is also interested in questions in phylogenetics; in particular, she works on the identifiability of the species phylogeny under a horizontal gene transfer model.
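As promised above, here is a sketch of the random reshuffling baseline mentioned in Shashank Rajput’s profile: each epoch visits every sample exactly once in a fresh random order, rather than sampling with replacement. The toy least-squares objective and all names here are hypothetical, not taken from his work.

```python
# Random reshuffling (RR) for SGD on a toy least-squares problem: a fresh
# permutation each epoch, one pass over all samples per epoch. Illustrative
# sketch only; the objective and constants are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5)              # synthetic targets from a planted model
w = np.zeros(5)
lr = 0.01

for epoch in range(20):
    for i in rng.permutation(len(X)):   # without-replacement order (RR)
        grad = (X[i] @ w - y[i]) * X[i] # per-sample least-squares gradient
        w -= lr * grad

print("residual norm:", np.linalg.norm(X @ w - y))
```

The research question in this line of work is whether carefully designed permutations can provably converge faster than the uniformly random ones used above.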
Washington adds 6 RAs to Winter 2021 term
The University of Washington site of IFDS is funding six Research Assistants in Winter quarter 2021 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, who are members of IFDS.
Kristof Glauninger
Kristof Glauninger (Statistics) works with Zaid Harchaoui (Statistics), E. Virginia Armbrust (Oceanography) and François Ribalet (Oceanography) on statistical modeling for marine ecology. He focuses on statistical inference questions arising from phytoplankton population modeling. He is also interested in optimal transport and machine learning.
Alec Greaves-Tunnell
Alec Greaves-Tunnell (Statistics) works with Zaid Harchaoui (Statistics), Ali Shojaie (Biostatistics), and Azadeh Yazdan (Bioengineering) on distributionally robust learning for brain science and engineering. He is also interested in sequence models and time series in general, with applications to language processing and music analysis.
Adhyyan Narang
Adhyyan Narang (ECE) works with Maryam Fazel (ECE) and Lillian Ratliff (ECE). So far, he has worked to provide theoretical answers to foundational questions in learning from data, such as the generalization of overparameterized models and robustness to adversarial examples. More recently, he has been interested in providing guarantees for optimization in uncertain online environments in the presence of other agents.
Swati Padmanabhan
Swati Padmanabhan (ECE) works with Yin Tat Lee (Computer Science and Engineering) and Maryam Fazel (ECE) on designing faster algorithms for the optimal design problem. She is also interested in semidefinite programming more broadly and has developed semidefinite programming algorithms that are the fastest known in several settings.
Omid Sadeghi
Omid Sadeghi (ECE) works with Maryam Fazel as well as Lillian Ratliff (ECE). He is interested in the design and analysis of online optimization algorithms with budget constraints, and with stochastic or adversarial inputs. His work includes online resource allocation with submodular utility functions.
Zhihan Xiong
Zhihan Xiong (Computer Science and Engineering) works with Maryam Fazel (Electrical and Computer Engineering) and Kevin Jamieson (Computer Science and Engineering), as well as Lalit Jain (Foster School of Business). His current work addresses optimal experimental design in a streaming setting where the measurement budget is limited and the quality of measurements varies unpredictably over time.
Steve Wright and collaborators win NeurIPS Test of Time Award 2020
Wright and his coauthors received the award for their 2011 paper “Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent.” The paper proposed a way to implement Stochastic Gradient Descent (SGD) in parallel without any locking of memory access, one that “outperforms alternative schemes that use locking by an order of magnitude.” SGD is the algorithm that drives many machine learning systems.
See this [12-minute talk](https://youtu.be/c5T7600RLPc) by Chris Ré about the paper.
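For a concrete (and deliberately simplified) picture of the Hogwild! idea, the sketch below runs several threads that take SGD steps on a shared parameter vector with no locks. The toy dense least-squares problem and all names are hypothetical; the paper’s analysis concerns sparse updates, where concurrent writes rarely collide, so this only illustrates the structure of the scheme.

```python
# Lock-free parallel SGD in the spirit of Hogwild! (illustrative sketch only).
# Threads read and write the shared vector w with no synchronization.
import threading
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = X @ rng.normal(size=10)
w = np.zeros(10)   # shared parameters; updated concurrently without locks
lr = 1e-3

def worker(rows):
    for i in rows:
        grad = (X[i] @ w - y[i]) * X[i]  # may read a partially updated w
        w[:] -= lr * grad                # unsynchronized in-place write

threads = [threading.Thread(target=worker, args=(range(t, len(X), 4),))
           for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("residual norm:", np.linalg.norm(X @ w - y))
```

The surprising message of the paper is that, under sparsity assumptions, this lock-free scheme converges at essentially the same rate as serial SGD while avoiding the overhead of memory locking.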
Postdoctoral fellows at IFDS@UW-Madison
The IFDS at UW-Madison is seeking applicants for one or two postdoctoral positions. Successful candidates will conduct research with members of the IFDS, both at UW-Madison and its partner sites, and will be based at the Wisconsin Institute for Discovery at UW-Madison (https://wid.wisc.edu/). A unique benefit of this position is the rich set of collaborative opportunities available, allowing truly interdisciplinary training. The initial appointment is for one year, with the possibility of renewal. Travel funds will be provided. The ideal candidate will have a PhD in computer science, statistics, mathematics, engineering, or a related field, with expertise in data science. Additional desirable qualities include the ability to work effectively both independently and in a team, good communication skills, and a record of interdisciplinary collaborations.
To apply: https://drive.google.com/file/d/1LgeZ7fZ1cg0icuohCGAWxZlz8T9Jj66M/view?usp=sharing
IFDS Postdoctoral Fellow Positions Now Available at the University of Washington
The newly founded NSF Institute for Foundations of Data Science (IFDS) at the University of Washington (UW), Seattle, is seeking applications for one or more Postdoctoral Fellow positions. This is an ideal position for candidates interested in interdisciplinary research under the supervision of at least two IFDS faculty members; the institute brings together researchers at the interface of mathematics, statistics, theoretical computer science, and electrical engineering. IFDS at UW operates in partnership with groups at the University of Wisconsin-Madison, the University of California at Santa Cruz, and the University of Chicago, and is supported by the NSF TRIPODS program. A unique benefit is the rich set of collaborative opportunities available, allowing truly interdisciplinary training. The initial appointment is for one year, with the possibility of renewal. Appropriate travel funds will be provided.
Cluster faculty positions at UW-Madison
The University of Wisconsin-Madison seeks to hire several faculty members with research and teaching interests in the foundations of data science. These positions are part of a campus-wide cluster initiative to expand and broaden expertise in the fundamentals of data science at UW-Madison. Three faculty positions will be hired in the cluster, which will build a world-class core of interdisciplinary strength in the mathematical, statistical, and algorithmic foundations of data science. This hiring initiative involves the Departments of Mathematics, Statistics, Computer Sciences, Electrical and Computer Engineering, Industrial and Systems Engineering, and the Wisconsin Institute for Discovery. It is anticipated that successful candidates will develop strong collaborations with existing data science research programs at UW-Madison — including IFDS — and with external research centers. To apply, follow the links below:
Mathematics: https://www.mathjobs.org/jobs/list/16747
Statistics: https://jobs.hr.wisc.edu/en-us/job/506886/assistant-professor-associate-professor-professor-in-statistics-cluster-hire
Engineering: https://jobs.hr.wisc.edu/en-us/job/506918/professor-cluster-hire
Wisconsin Adds Eight RA’s for IFDS
The UW-Madison site of IFDS is funding eight Research Assistants Fall semester to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.
Read about Fall 2020 RA’s at https://ifds.info/fall-20-wisconsin-ras/
Changhun Jo
Changhun Jo (Mathematics), advised by Kangwook Lee (Electrical and Computer Engineering) and Sebastien Roch (Mathematics), is working on the theoretical understanding of machine learning. His recent work focuses on finding an optimal data poisoning algorithm against a fairness-aware learner. He also works on finding the fundamental limit on sample complexity of matrix completion in the presence of graph side information.
Lorenzo Najt
Lorenzo Najt (Mathematics), advised by Jordan Ellenberg (Math), investigates algorithmic and inferential questions related to ensemble analysis of gerrymandering. Related to this, he is interested in sampling connected graph partitions, and is working with Michael Newton (Statistics) to investigate applications to hypothesis testing with spatial data. He is also working on some questions about the fixed parameter tractability of polytope algorithms.
Yuetian Luo
Yuetian Luo is working with Anru Zhang (Statistics) and Yingyu Liang (Computer Science) on non-convex optimization problems involving low-rank matrices and on implicit regularization for non-convex optimization methods. This work develops fast algorithms, based on a new sketching scheme for high-dimensional problems, with convergence guarantees provably faster than those of common first-order methods. It also investigates implicit regularization under different non-convex optimization methods and initialization schemes.
Yingda Li
Yingda Li (Mathematics) is working with Nan Chen (Mathematics) and Sebastien Roch (Mathematics) on uncertainty quantification and data assimilation. She aims to develop statistically accurate algorithms for solving high-dimensional nonlinear and non-Gaussian dynamical systems. She also works on predicting complex nonlinear turbulent dynamical systems with imperfect models and insufficient training data.
Mehmet Furkan Demirel
Mehmet Furkan Demirel (Computer Sciences), advised by Yingyu Liang (Computer Sciences) and Dimitris Papailiopoulos (Electrical and Computer Engineering), focuses on molecule property prediction and construction of proper representations of molecules for learning algorithms. He also works on representation learning for graph-structured data and graph neural networks.
Cora Allen-Savietta
Cora Allen-Savietta (Statistics) works with Cécile Ané (Statistics, Botany) and Sebastien Roch (Mathematics) to develop efficient methods to infer evolutionary history and identify ancient hybridizations. As the challenge of this work is to reconstruct ancestry using only modern data, she explores the identifiability of a network’s topology from sequence data only at the leaves. She is also expanding tree rearrangements and existing search strategies to explore the large and discrete space of semi-directed networks. Her work has applications to understanding the history of eukaryotes, viruses, languages, and more.
Shashank Rajput
Shashank Rajput (Computer Science), advised by Dimitris Papailiopoulos (Electrical and Computer Engineering), Kangwook Lee (Electrical and Computer Engineering) and Stephen Wright (Computer Science), works on problems in distributed machine learning and optimization. He develops techniques that are fast and scalable enough for practical use while retaining theoretical guarantees. Currently, he is working on finding shuffling mechanisms that beat random reshuffling, as well as on algorithms that train neural networks by only pruning them.
Shuqi Yu
Shuqi Yu (Mathematics) is advised by Sebastien Roch (Mathematics) and works with Karl Rohe (Statistics) on large-scale network models. She aims to establish theoretical guarantees for a new estimator of the number of communities in a stochastic blockmodel. She is also interested in questions in phylogenetics; in particular, she works on the identifiability of the species phylogeny under a horizontal gene transfer model.
Yin Tat Lee earns Packard Fellowship
Each year, the David and Lucile Packard Foundation bestows this prestigious recognition upon a small number of early-career scientists and engineers who are at the leading edge of their respective disciplines. Lee is among just 20 researchers nationwide — and one of only two in the Computer & Information Sciences category — to be chosen as members of the 2020 class of fellows.
Read the full article at: http://ads-institute.uw.edu/IFDS/news/2020/10/15/packard/
UW launches Institute for Foundations of Data Science
The University of Washington will lead a team of institutions in establishing an interdisciplinary research institute that brings together mathematicians, statisticians, computer scientists and engineers to develop the theoretical foundations of a fast-growing field: data science.
https://www.washington.edu/news/2020/09/01/uw-launches-institute-for-foundations-of-data-science/
UW–Madison to Continue Fundamental Data Science Research with Phase II Award from NSF
The Wisconsin Institute for Discovery is home to the Institute for the Foundations of Data Science, which has received Phase II funding from the National Science Foundation. https://wid.wisc.edu/uw-madison-to-continue-fundamental-data-science-research-with-phase-ii-award-from-nsf/
UChicago Joins Three Universities in Institute for Foundational Data Science
The Transdisciplinary Research In Principles Of Data Science (TRIPODS) program of the National Science Foundation supports this work by bridging these three disciplines for research and training. In its first round of Phase II funding, the TRIPODS program awarded $12.5 million to the Institute for Foundations of Data Science (IFDS), a four-university collaboration among the Universities of Washington, Wisconsin-Madison, California Santa Cruz, and Chicago. https://computerscience.uchicago.edu/news/article/tripods-ifds/
New data science institute includes a focus on ethics and algorithms
UC Santa Cruz will join three other institutions to establish a transdisciplinary research institute bringing together mathematicians, statisticians, and theoretical computer scientists to develop the theoretical foundations of the fast-growing field of data science.
Read the full article at: https://news.ucsc.edu/2020/09/data-science-institute.html
NSF Tripods Phase II Award
The University of Washington (UW) will lead a team of institutions in establishing an interdisciplinary research institute that brings together mathematicians, statisticians, computer scientists and engineers to develop the theoretical foundations of a fast-growing field: data science.
The Institute for Foundations of Data Science (IFDS) is a collaboration between the UW and the Universities of Wisconsin-Madison, California Santa Cruz, and Chicago, with a mission to develop a principled approach to the analysis of ever-larger, more complex and potentially biased data sets that play an increasingly important role in industry, government and academia.
Support for the IFDS comes from a $12.5 million grant from the National Science Foundation and its Transdisciplinary Research in Principles of Data Science, or TRIPODS, program. Today, the NSF named IFDS as one of two institutes nationwide receiving the first TRIPODS Phase II awards. TRIPODS is tied to the NSF’s Harnessing the Data Revolution (HDR) program, which aims to accelerate discovery and innovation in data science algorithms, data cyberinfrastructure and education and workforce development.
The five-year funding plan for the IFDS Phase II includes support for new research projects, workshops, a partnership across the four research sites and students and postdoctoral scholars co-advised by faculty from different fields. Plans for education and outreach will draw on previous experience of IFDS members and leverage institutional resources at all four sites.
Maryam Fazel (PI) becomes first recipient of the Moorthy Family Inspiration Career Development Professorship
In recognition of her outstanding, innovative work as a researcher and educator, Maryam Fazel (IFDS PI) was recently named the inaugural recipient of the Moorthy Family Inspiration Career Development Professorship. This generous endowment was established in 2019 by Ganesh and Hema Moorthy for the purposes of recruiting, rewarding and retaining UW ECE faculty members who have demonstrated a significant amount of promise early on in their careers.