Tianxiao Shen

Tianxiao Shen is an IFDS postdoc scholar at the University of Washington, working with Yejin Choi and Zaid Harchaoui. Previously, she received her PhD from MIT, advised by Regina Barzilay and Tommi Jaakkola. Before that, she did her undergrad at Tsinghua University, where she was a member of the Yao Class.

Tianxiao has broad interests in natural language processing, machine learning, and deep learning. More specifically, she studies language models and develops algorithms to facilitate efficient, accurate, diverse, flexible, and controllable text generation.

Hanbaek Lyu paper featured in Nature Communications

Hanbaek Lyu, Assistant Professor of Mathematics at the University of Wisconsin–Madison and IFDS faculty member, published an article in the January 3, 2023 issue of Nature Communications. His article, “Learning low-rank latent mesoscale structures in networks,” is summarized below.

Introduction: 

We present a new approach to describing low-rank mesoscale structures in networks. We find that many real-world networks possess a small set of “latent motifs” that effectively approximate most subgraphs at a fixed mesoscale. Our work has applications in network comparison and network denoising.

Content:

Researchers in many fields use networks to encode interactions between entities in complex systems. To study the large-scale behavior of complex systems, it is useful to examine mesoscale structures in networks as building blocks that influence such behavior. In many studies of mesoscale structures, subgraph patterns (i.e., the connection patterns of small sets of nodes) have been studied as building blocks of network structure at various mesoscales. In particular, researchers often identify motifs as k-node subgraph patterns (where k is typically between 3 and 5) of a network. These patterns are unexpectedly more common than in a baseline network (which is constructed through a random process). In the last two decades, the study of motifs has yielded insights into networked systems in many areas, including biology, sociology, and economics. However, not much is known about how to use such motifs (or related mesoscale structures), after their discovery, as building blocks to reconstruct a network.
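To make the notion of a motif concrete, the following minimal sketch (in Python, assuming the networkx and numpy libraries are available; the network and all parameter values are illustrative) counts one 3-node pattern, the triangle, in an observed network and compares the count against random baseline networks with the same numbers of nodes and edges. Patterns that are strongly over-represented relative to the baseline are motifs.

import networkx as nx
import numpy as np

G = nx.karate_club_graph()  # stand-in for an observed network
n, m = G.number_of_nodes(), G.number_of_edges()

# Observed triangle count (nx.triangles counts each triangle once per node).
observed = sum(nx.triangles(G).values()) // 3

# Baseline: random graphs with the same number of nodes and edges.
baseline = [sum(nx.triangles(nx.gnm_random_graph(n, m, seed=s)).values()) // 3
            for s in range(100)]

z_score = (observed - np.mean(baseline)) / np.std(baseline)
print(f"triangles: observed={observed}, "
      f"baseline mean={np.mean(baseline):.1f}, z={z_score:.1f}")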

Networks have low-rank subgraph structures 

The figure above shows examples of 20-node subgraphs that include a path (in red) spanning the entire subgraph. These subgraphs are subsets of various real-world and synthetic networks. The depicted subgraphs have diverse connection patterns, which depend on the structure of the original network. One of our key findings is that, roughly speaking, these subgraphs follow specific patterns. Specifically, we find that many real-world networks possess a small set of latent motifs that effectively approximate most subgraphs at a fixed mesoscale. The figure below shows “network dictionaries” of 25 latent motifs for the Facebook “friendship” networks of UCLA and Caltech. By using subgraph sampling and nonnegative matrix factorization, we are able to discover these latent motifs.
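The following sketch illustrates that pipeline in Python, assuming the networkx and scikit-learn libraries; the helper sample_k_path and all parameter values are illustrative and are not taken from the paper's code. It samples k-node subgraphs induced by random k-paths, vectorizes their adjacency matrices, and factorizes the resulting nonnegative matrix with NMF to obtain a small dictionary of latent motifs.

import networkx as nx
import numpy as np
from sklearn.decomposition import NMF

def sample_k_path(G, k, rng):
    # Try to sample a self-avoiding k-node path by a random walk;
    # return None if the walk hits a dead end before reaching length k.
    path = [rng.choice(list(G.nodes))]
    while len(path) < k:
        nbrs = [u for u in G.neighbors(path[-1]) if u not in path]
        if not nbrs:
            return None
        path.append(rng.choice(nbrs))
    return path

G = nx.karate_club_graph()   # stand-in for an observed network
k, n_samples, r = 6, 500, 9  # mesoscale, sample count, number of latent motifs
rng = np.random.default_rng(0)

rows = []
while len(rows) < n_samples:
    path = sample_k_path(G, k, rng)
    if path is not None:
        A = nx.to_numpy_array(G.subgraph(path), nodelist=path)  # k-by-k adjacency
        rows.append(A.ravel())

X = np.array(rows)  # (n_samples, k*k), nonnegative
model = NMF(n_components=r, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(X)  # per-sample motif activations
latent_motifs = model.components_.reshape(r, k, k)  # the "network dictionary"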

The ability to encode and reconstruct networks using a small set of latent motifs has many applications in network analysis, including network comparison, network denoising, and edge inference. Such low-rank mesoscale structures allow one to reconstruct networks by approximating subgraphs of a network using combinations of latent motifs. In the animation below, we demonstrate our network-reconstruction algorithm using latent motifs. First, we repeatedly sample a k-node subgraph by sampling a path of nodes. We then use the latent motifs to approximate the sampled subgraph (see panels a1–a3 in the figure below) and replace the original subgraph with the approximation. Repeating this process many times gradually forms a weighted reconstructed network. The weights indicate how confident we are that the associated edges of the reconstructed network are also edges of the original observed network.
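Continuing the sketch above (same assumptions, with integer node labels so that nodes can index an array), the reconstruction loop could look as follows: each sampled subgraph is approximated in the span of the learned latent motifs, and the approximations are averaged into edge weights.

# Accumulate motif-based approximations of sampled subgraphs into edge weights.
weight_sum = np.zeros((G.number_of_nodes(), G.number_of_nodes()))
count = np.zeros_like(weight_sum)

for _ in range(2000):
    path = sample_k_path(G, k, rng)
    if path is None:
        continue
    A = nx.to_numpy_array(G.subgraph(path), nodelist=path)
    coef = model.transform(A.ravel()[None, :])        # activations of latent motifs
    A_hat = (coef @ model.components_).reshape(k, k)  # motif-based approximation
    for i, u in enumerate(path):
        for j, v in enumerate(path):
            weight_sum[u, v] += A_hat[i, j]
            count[u, v] += 1

# Average over all samples that covered each node pair.
W_recon = np.divide(weight_sum, count,
                    out=np.zeros_like(weight_sum), where=count > 0)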

Selecting appropriate latent motifs is essential for our approach. These latent motifs act as building blocks of a network; they help us recreate different parts of it. If we don’t pick an appropriate set of latent motifs, we cannot expect to accurately capture the structure of a network. This is analogous to using the correct puzzle pieces to assemble a picture. If we use the wrong pieces, we will assemble a picture that doesn’t match the original.

Motivating Application: Anomalous-subgraph detection

A common problem in network analysis is the detection of anomalous subgraphs of a network. The connection pattern of an anomalous subgraph distinguishes it from the rest of a network. This anomalous-subgraph-detection problem has numerous high-impact applications, including in security, finance, and healthcare. 

A simple conceptual framework for anomalous-subgraph detection is the following: Learn “normal subgraph patterns” in an observed network and then detect subgraphs in the observed network that deviate significantly from them. We can turn this high-level idea into a concrete algorithm by using latent motifs and network reconstruction, as the figure above illustrates. From an observed network (panel a), which consists of the original network (panel b) and an anomalous subgraph (panel c), we compute latent motifs (panel d) that can successfully approximate the k-node subgraphs of the observed network. A key observation is that these latent motifs should also describe the normal subgraph patterns of the observed network. Reconstructing the observed network using its latent motifs yields a weighted network (panel e) in which edges with positive and small weights deviate significantly from the normal subgraph patterns, which are captured by the latent motifs. Therefore, it is likely that these edges are anomalous. The suspicious edges (panel f) are the edges in the weighted reconstructed network that have positive weights that are less than a threshold. One can determine the threshold using a small set of known true edges and known anomalous edges. The suspicious edges match well with the anomalous edges (panel c).
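In the notation of the reconstruction sketch above, this final thresholding step is a short filter; the threshold value here is a placeholder that would in practice be tuned on the known true and anomalous edges.

threshold = 0.3  # illustrative; tune on known true/anomalous edges
suspicious = [(u, v) for u, v in G.edges()
              if 0 < W_recon[u, v] < threshold]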

Our method can also be used for “link-prediction” problems, in which one seeks to figure out the most likely new edges given an observed network structure. The reasoning is similar to that for anomalous-subgraph detection. We learn latent motifs from an observed network and use them to reconstruct the network. The edges with the largest weights in the reconstructed network that were non-edges in the observed network are our predictions for the most likely new edges. As we show in the figure below, our latent-motif approach is competitive with popular methods for anomalous-subgraph detection and link prediction.
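Link prediction reuses the same reconstructed weights. Continuing the sketch, one ranks the non-edges of the observed network by their reconstructed weight:

# The highest-weighted non-edges are the predicted most likely new edges.
predicted = sorted(nx.non_edges(G),
                   key=lambda e: W_recon[e[0], e[1]],
                   reverse=True)[:10]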

Conclusion 

We introduced a mesoscale network structure, which we call latent motifs, that consists of k-node subgraphs that are building blocks of the connected k-node subgraphs of a network. By using combinations of latent motifs, we can approximate k-node subgraphs that are induced by uniformly random k-paths of a network. We also established algorithmically and theoretically that one can accurately approximate a network if one has a dictionary of latent motifs that can accurately approximate mesoscale structures of the network.

Our computational experiments demonstrate that latent motifs can have distinctive network structures and that various social, collaboration, and protein–protein interaction networks have low-rank mesoscale structures, in the sense that a few learned latent motifs are able to reconstruct, infer, and denoise the edges of a network. We hypothesize that such low-rank mesoscale structures are a common feature of networks beyond the examined networks.

IFDS Postdoctoral Fellow Positions Available at the University of Washington

Deadline Extended!

The NSF Institute for Foundations of Data Science (IFDS) at the University of Washington (UW), Seattle, is seeking applications for one or more Postdoctoral Fellow positions. This is an ideal position for candidates interested in interdisciplinary research under the supervision of at least two faculty members of IFDS, which brings together researchers at the interface of mathematics, statistics, theoretical computer science, and electrical engineering. A unique benefit is the rich set of collaborative opportunities available. IFDS at UW operates in partnership with groups at the University of Wisconsin-Madison, the University of California at Santa Cruz, and the University of Chicago, and is supported by the NSF TRIPODS program. Initial appointment is for one year, with the possibility of renewal. Appropriate travel funds will be provided.

The ideal candidate will have a PhD in computer science, statistics, mathematics, engineering or a related field, with expertise in machine learning and data science. Desirable qualities include the ability to work effectively both independently and in a team, good communication skills, and a record of interdisciplinary collaborations.

Interested applicants should send a statement of research interests, CV, and a cover letter to the address ifds.UW.positions@gmail.com, with the subject line “<applicant’s first and last name> Application”. Applicants should also arrange for two letters of reference to be sent to this same address (directly from letter writers), with the subject line “<applicant’s first and last name> Letter”.

In their cover letter, applicants should make sure to indicate potential faculty mentors (primary and secondary) at UW IFDS (see here for UW core faculty that can serve as main mentors, and here for broader affiliate members).

Full consideration will be given to applications received by January 30, 2024. The expected start date is in summer 2024, but earlier start dates will be considered. Selected candidates will be invited to short virtual (Zoom) interviews. For questions regarding the position, please contact the IFDS director Prof. Maryam Fazel at mfazel@uw.edu.

The base salary for this position is $5,880–$6,580 per month (full-time), commensurate with experience and qualifications, or as mandated by a U.S. Department of Labor prevailing wage determination.

Equal Employment Opportunity Statement
The University of Washington is an affirmative action and equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, creed, religion, national origin, sex, sexual orientation, marital status, pregnancy, genetic information, gender identity or expression, age, disability, or protected veteran status.

Commitment to Diversity
The University of Washington is committed to building diversity among its faculty, librarian, staff, and student communities, and articulates that commitment in the UW Diversity Blueprint (http://www.washington.edu/diversity/diversity-blueprint/). Additionally, the University’s Faculty Code recognizes faculty efforts in research, teaching and/or service that address diversity and equal opportunity as important contributions to a faculty member’s academic profile and responsibilities (https://www.washington.edu/admin/rules/policies/FCG/FCCH24.html#2432).

Karan Srivastava

Karan Srivastava (Mathematics), advised by Jordan Ellenberg (Mathematics), is interested in applying machine learning techniques to generate interpretable, generalizable data for gaining a deeper understanding of problems in pure mathematics. Currently, he is working on generating examples in additive combinatorics and convex geometry using deep reinforcement learning.

Joe Shenouda

Joe Shenouda (Electrical and Computer Engineering), advised by Rob Nowak (ECE), Kangwook Lee (ECE), and Stephen Wright (Computer Sciences), is interested in developing a theoretical understanding of deep learning. Currently, he is working towards precisely characterizing the benefits of depth in deep neural networks.

Alex Hayes

Alex Hayes (Statistics), advised by Keith Levin (Statistics) and Karl Rohe (Statistics), is interested in causal inference on networks. He is currently working on methods to estimate peer-to-peer contagion in noisily-observed networks.

Matthew Zurek

Matthew Zurek (Computer Sciences), advised by Yudong Chen (Computer Sciences) and Simon Du (Computer Science and Engineering), is interested in reinforcement learning theory. He is currently working on algorithms with improved instance-dependent sample complexities.

Jitian Zhao

Jitian Zhao (Statistics), advised by Karl Rohe (Statistics) and Fred Sala (Computer Sciences), is interested in graph analysis and non-Euclidean models. She is currently working on community detection and interpretation in the setting of large directed anonymous networks.

Thanasis Pittas

Thanasis Pittas (Computer Science), advised by Ilias Diakonikolas (Computer Science), is working on robust statistics. His aim is to design efficient algorithms that can tolerate a constant fraction of the data being corrupted. The significance of computational efficiency arises from the high-dimensional nature of datasets in modern applications. To complete the theoretical understanding, it is also imperative to study the inherent trade-offs between the computational efficiency and statistical performance of these algorithms.

William Powell

William Powell (Mathematics), advised by Hanbaek Lyu (Mathematics) and Qiaomin Xie (Industrial and Systems Engineering), is interested in stochastic optimization. His current work focuses on a variance-reduced optimization algorithm in the context of dependent data sampling.

Wisconsin Funds 8 RAs for Fall 2023

The UW-Madison site of IFDS is funding several Research Assistants during Fall 2023 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.

Ziqian Lin

Ziqian Lin (Computer Science), advised by Kangwook Lee (Electrical and Computer Engineering) and Hanbaek Lyu (Mathematics), works at the intersection of machine learning and deep learning. Currently, he is working on model compositions of frozen pre-trained models and on NLP topics including understanding the phenomenon of in-context learning and watermarking LLMs.

U Washington IFDS faculty Dmitriy Drusvyatskiy receives the SIAM Activity Group on Optimization Best Paper Prize

We are pleased to announce that Dmitriy Drusvyatskiy, Professor of Mathematics, is a co-recipient of the SIAM Activity Group on Optimization Best Paper Prize, together with Damek Davis, for the paper “Stochastic model-based minimization of weakly convex functions.” This prize is awarded every three years to the author(s) of the most outstanding paper, as determined by the prize committee, on a topic in optimization published in the four calendar years preceding the award year. Congratulations Dima!

Contact

For more information, please contact the site director of your choice or email the webmaster who will try to direct your inquiry.

IFDS Welcomes Postdoctoral Affiliates for 2022-23

The ranks of IFDS Postdocs have grown remarkably in 2022-23 – an unprecedented 11 postdoctoral researchers are now affiliated with IFDS. Some have already been with us for a year, some have just joined. One (Lijun Ding) has moved from IFDS at Washington to IFDS at Wisconsin. 

We take this opportunity to introduce the current cohort of postdocs and share a little information about each.

Jeongyeol Kwon

Jeongyeol Kwon (jeongyeol.kwon@wisc.edu) joined UW-Madison in September 2022. He is doing a postdoc with Robert Nowak. He completed his PhD at The University of Texas at Austin in August 2022, advised by Constantine Caramanis. 

Jeongyeol is broadly interested in theoretical aspects of machine learning and optimization. During his PhD, Jeongyeol focused on fundamental questions arising from statistical inference and sequential decision-making in the presence of latent variables. In his earlier PhD years, he worked on the analysis of the Expectation-Maximization algorithm and showed its convergence and statistical optimality properties. More recently, he has been involved in reinforcement learning theory with partial observations, inspired by real-world examples. He plans to enlarge his scope to more diverse research topics, including stochastic optimization and more practical approaches for RL and other related problems.

Sushrut Karmalkar

Sushrut Karmalkar joined UW-Madison as a postdoc in September 2021, mentored by Ilias Diakonikolas in the Department of Computer Sciences and supported by the 2021 CI Fellowship. He completed his Ph.D. at The University of Texas at Austin, advised by Prof. Adam Klivans.

During his Ph.D., Sushrut worked on various aspects of the theory of machine learning, including algorithmic robust statistics, theoretical guarantees for neural networks, and solving inverse problems via generative models. 

During the first year of his postdoc, he focused on understanding the problem of sparse mean estimation in the presence of various extremely aggressive noise models. In the coming year, he plans to work on stronger lower bounds for these problems, as well as on improved guarantees for slightly weaker noise models.

Sushrut is on the job market this year (2022-2023)!

Ruhui Jin

Ruhui Jin (rjin@math.wisc.edu) joined UW-Madison in August 2022. She is a Van Vleck postdoc in the Department of Mathematics, hosted by Qin Li. She is partly supported by IFDS. She completed her PhD in mathematics from UT-Austin in 2022, advised by Rachel Ward. 

Her research is in mathematics of data science. Specifically, her thesis focused on dimensionality reduction for tensor-structured data. Currently, she is broadly interested in data-driven methods for learning complex systems. 

David Clancy

David Clancy (dclancy@math.wisc.edu) joined UW-Madison in August of 2022. He is doing a postdoc with Hanbaek Lyu and Sebastien Roch in Mathematics and is partly supported by IFDS. He completed his PhD in mathematics at the University of Washington in Seattle under the supervision of Soumik Pal. During graduate school he worked on problems related to the metric-measure space structure of sparse random graphs with small surplus as their size grows large. He plans to investigate a wider range of topics related to the component structure of random graphs with an underlying community structure, as well as inference problems on trees and graphs.

Ross Boczar

Ross Boczar (www.rossboczar.com, rjboczar@uw.edu) is a Postdoctoral Scholar at the University of Washington, associated with the Department of Electrical and Computer Engineering, the Institute for Foundations of Data Science, and the eScience Institute. He is advised by Prof. Maryam Fazel and Prof. Lillian J. Ratliff. His research interests include control theory, statistical learning theory, optimization, and other areas of applied mathematics. Recently, he has been exploring adversarial learning scenarios where a group of colluding users can learn and then exploit a firm’s deployed ML classifier.

Stephen Mussmann

Stephen Mussmann (somussmann@gmail.com) is an IFDS postdoc (started September 2021) at the University of Washington working with Ludwig Schmidt and Kevin Jamieson. Steve received his Ph.D. in Computer Science from Stanford University, advised by Percy Liang.

Steve researches data selection for machine learning, including work in subareas such as active learning, adaptive data collection, and data subset selection. During his Ph.D., Steve primarily focused on active learning: how to effectively select maximally informative data to collect or annotate. During his postdoc, Steve has worked on data pruning: choosing a small subset of a dataset such that models trained on that subset enjoy the same performance as models trained on the full dataset.

Steve is applying for academic research jobs this year (2022-2023)!

IFDS Workshop on Distributional Robustness in Data Science

A number of domain applications of data science, machine learning, mathematical optimization, and control have underscored the importance of the assumptions on the data generating mechanisms, changes, and biases. Distributional robustness has emerged as one promising framework to address some of these challenges. This topical workshop under the auspices of the Institute for Foundations of Data Science, an NSF TRIPODS institute, will survey the mathematical, statistical, and algorithmic foundations as well as recent advances at the frontiers of this research area. The workshop features invited talks as well as shorter talks by junior researchers and a social event to foster further discussions.

MORE INFO

REGISTER HERE

IFDS Workshop on Distributional Robustness in Data Science

August 4-6, 2022

University of Washington, Seattle, WA

A number of domain applications of data science, machine learning, mathematical optimization, and control have underscored the importance of the assumptions on the data generating mechanisms, changes, and biases. Distributional robustness has emerged as one promising framework to address some of these challenges. This topical workshop under the auspices of the Institute for Foundations of Data Science, an NSF TRIPODS institute, will survey the mathematical, statistical, and algorithmic foundations as well as recent advances at the frontiers of this research area. The workshop features invited talks as well as shorter talks by junior researchers and a social event to foster further discussions.

Please see the workshop website for all the details and to register.

Central themes of the workshop include: 

  • Risk measures and distributional robustness for decision making
  • Distributional shifts in real-world domain applications 
  • Optimization algorithms for distributionally robust machine learning
  • Distributionally robust imitation learning and reinforcement learning
  • Learning theoretic statistical guarantees for distributional robustness

Ben Teo

His research interests are in statistical phylogenetics. He is working with Prof. Cecile Ane.

Nayoung Lee

Shubham Kumar Bharti

Shubham Kumar Bharti (Computer Science), advised by Jerry Zhu (Computer Science), and Kangwook Lee (Electrical and Computer Engineering), is interested in Reinforcement Learning, Fairness and Machine Teaching. His recent work focuses on fairness problems in sequential decision making. Currently, he is working on defenses against trojan attacks in Reinforcement Learning.

Shuyao Li

Shuyao Li (Computer Science), advised by Stephen Wright (Computer Science), Jelena Diakonikolas (Computer Science), and Ilias Diakonikolas (Computer Science), works on optimization and machine learning. Currently, he focuses on second-order guarantees for stochastic optimization algorithms in robust learning settings.

Jiaxin Hu

Jiaxin Hu (Statistics), advised by Miaoyan Wang (Statistics) and Jerry Zhu (Computer Science), works on statistical machine learning. Currently, she focuses on tensor/matrix data modeling and analysis with applications in neuroscience and social networks.

Max Hill

Max Hill (Mathematics), advised by Sebastien Roch (Mathematics) and Cecile Ane (Statistics), works in probability and mathematical phylogenetics. His recent work focuses on the impact of recombination in phylogenetic tree estimation.

Sijia Fang

Sijia Fang (Statistics), advised by Karl Rohe (Statistics) and Sebastien Roch (Mathematics), works on social networks and spectral analysis. More specifically, she is interested in hierarchical structures in social networks. She is also interested in phylogenetic tree and network recovery problems.

Zhiyan Ding

Zhiyan Ding (Mathematics), advised by Qin Li (Mathematics), works on applied and computational mathematics. More specifically, he uses PDE analysis tools to analyze machine learning algorithms, such as Bayesian sampling and over-parameterized neural networks. These PDE tools, including gradient-flow equations and mean-field analysis, help formulate machine (deep) learning algorithms as mathematical descriptions that are easier to handle.

Jasper Lee

Jasper Lee joined UW-Madison as a postdoc in August 2021, mentored by Ilias Diakonikolas in the Department of Computer Sciences and partly supported by IFDS. He completed his PhD at Brown University, working with Paul Valiant. His thesis work revisited and settled a basic problem in statistics: given samples from an unknown 1-dimensional probability distribution, what is the best way to estimate the mean of the distribution, with optimal finite-sample and high probability guarantees? Perhaps surprisingly, the conventional method of taking the average of samples is sub-optimal, and Jasper’s thesis work provided the first provably optimal “sub-Gaussian” 1-dimensional mean estimator under minimal assumptions. Jasper’s current research focuses on revisiting other foundational statistical problems and solving them also to optimality. He is additionally pursuing directions in the adjacent area of algorithmic robust statistics.
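For context, here is a minimal sketch (in Python, assuming numpy) of the classical median-of-means estimator, a standard sub-Gaussian 1-dimensional mean estimator; it illustrates the flavor of the problem but is not the optimal estimator from the thesis.

import numpy as np

def median_of_means(x, n_blocks=10):
    # Shuffle, split into blocks, average each block, return the median.
    rng = np.random.default_rng(0)
    x = rng.permutation(x)
    blocks = np.array_split(x, n_blocks)
    return np.median([b.mean() for b in blocks])

# On heavy-tailed data the block median is far less sensitive to extreme
# samples than the plain average.
x = np.random.default_rng(1).standard_t(df=2, size=10_000)
print(median_of_means(x), x.mean())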

Julian Katz-Samuels

Julian’s research focuses on designing practical machine learning algorithms that adaptively collect data to accelerate learning. His recent research interests include active learning, multi-armed bandits, black-box optimization, and out-of-distribution detection using deep neural networks. He is also very interested in machine learning applications that promote the social good.

Greg Canal

Greg Canal (gcanal@wisc.edu) joined UW-Madison in September 2021. He is doing a postdoc with Rob Nowak in Electrical and Computer Engineering, partly supported by IFDS. He completed his PhD at Georgia Tech in Electrical and Computer Engineering in 2021. For his thesis, he developed and analyzed new active machine learning algorithms inspired by feedback coding theory. During his first year as a postdoc he worked on a new approach for multi-user recommender systems, and as he continues into his second year he plans on exploring new active learning algorithms for deep neural networks.

Ahmet Alacaoglu

Ahmet Alacaoglu (alacaoglu@wisc.edu) joined UW–Madison in September 2021. He is doing a postdoc with Stephen J. Wright, partly supported by IFDS. He completed his PhD at EPFL, Switzerland in Computer and Communication Sciences in 2021. For his thesis, he developed and analyzed randomized first-order primal-dual algorithms for convex minimization and min-max optimization. During his postdoc, he is working on nonconvex stochastic optimization and investigating the connections between optimization theory and reinforcement learning.

Ahmet is on the academic job market (2022-2023)!

Lijun Ding

Lijun Ding (lijunbrianding@gmail.com) joined the University of Wisconsin-Madison (UW-Madison) in September 2022 as an IFDS postdoc with Stephen J. Wright. Before joining IFDS at UW-Madison, he was an IFDS postdoc with Dmitriy Drusvyatskiy and Maryam Fazel at the University of Washington. He obtained his Ph.D. in Operations Research at Cornell University, advised by Yudong Chen and Madeleine Udell.

Lijun’s research lies at the intersection of optimization, statistics, and data science. By exploring ideas and techniques in statistical learning theory, convex analysis, and projection-free optimization, he analyzes and designs efficient and scalable algorithms for classical semidefinite programming and modern nonconvex statistical problems. He also studies the interplay between model overparametrization, algorithmic regularization, and model generalization via the lens of matrix factorization. He plans to explore a wider range of topics at this intersection during his appointment at UW-Madison.

He is on the 2022–23 job market!

Yiding Chen

Yiding Chen (Computer Sciences), advised by Jerry Zhu (Computer Sciences) and collaborating with Po-Ling Loh (Electrical and Computer Engineering), is doing research on adversarial machine learning and robust machine learning. In particular, he focuses on robust learning in high dimension. He is also interested in the intersection of adversarial machine learning and control theory.

Brandon Legried

Brandon Legried (Mathematics), advised by Sebastien Roch and working with Cecile Ane (Botany and Statistics), is working on mathematical statistics with applications to computational biology. He is studying evolutionary history, usually depicted with a tree. As an inference problem, it is interesting to view extant species information as observed data and reconstruct tree topologies or states of evolving features (such as a DNA sequence). He looks to achieve results that extend across many model assumptions. The mathematical challenges involved lie at the intersection of probability and optimization.

Ankit Pensia

Ankit Pensia (Computer Science), advised by Po-Ling Loh (Statistics) and Varun Jog (Electrical and Computer Engineering) is working on projects in robust machine learning. His focus is on designing statistically and computationally efficient estimators that perform well even when the training data itself is corrupted. Traditional algorithms don’t perform well in the presence of noise: they are either slow or incur large error. As the data today is high-dimensional and usually corrupted, a better understanding of fast and robust algorithms would lead to better performance in scientific and practical applications.

Blake Mason

Blake Mason (Electrical and Computer Engineering), advised by Rob Nowak (Electrical and Computer Engineering) and Jordan Ellenberg (Mathematics), is investigating problems of learning from comparative data, such as ordinal comparisons and similarity/dissimilarity judgements. In particular, he is studying metric learning and clustering problems in this setting with applications to personalized education, active learning, and representation learning. Additionally, he studies approximate optimization techniques for extreme classification applications.

Yuchen Zeng

Yuchen Zeng (Computer Science), advised by Kangwook Lee (Electrical and Computer Engineering) and Stephen J. Wright (Computer Science), is interested in Trustworthy AI with a particular focus on fairness. Her recent work investigates training fair classifiers from decentralized data. Currently, she is working on developing a new fairness notion that considers dynamics.

Stephen Wright honored with a 2022 Hilldale Award

As director of the Institute for the Foundations of Data Science, Steve Wright is dedicated to the development and teaching of the complex computer algorithms that underlie much of modern society. The institute, which has been supported by more than $6 million in federal funds under Wright, brings together dozens of researchers across computer science, math, statistics and other departments to advance this interdisciplinary work. Read more…

Rodriguez elected as a 2022 ASA Fellow

We are pleased to announce that Abel Rodriguez, Professor of Statistics, has been elected as a Fellow of the American Statistical Association (ASA).

ASA Fellows are named for their outstanding contributions to statistical science, leadership, and advancement of the field. This year, awardees will be recognized at the upcoming Joint Statistical Meetings (JSM) during the ASA President’s Address and Awards Ceremony. Congratulations on this prestigious recognition!

Sebastien Roch named Fellow of Institute of Mathematical Statistics

Sebastien Roch has been named a Fellow of the Institute of Mathematical Statistics (IMS).  IMS is the main scientific society for probability and the mathematical end of statistics.  Among its various activities, IMS publishes the Annals series of flagship journals of probability.

PIMS-IFDS-NSF Summer School in Optimal Transport

The PIMS-IFDS-NSF Summer School on Optimal Transport is happening at the University of Washington, Seattle, from June 19 to July 1, 2022. Registration is now open through the PIMS event webpage. Please note that the deadline to apply for funded accommodation for junior participants is Feb 15. Our main speakers are world leaders in the mathematics of optimal transport and its various fields of application:

  • Alfred Galichon (NYU, USA)
  • Inwon Kim (UCLA, USA)
  • Jan Maas (IST, Austria)
  • Felix Otto (Max Planck Institute, Germany)
  • Gabriel Peyré (ENS, France)
  • Geoffrey Schiebinger (UBC, Canada)

Details and abstracts can be found on the summer school webpage.

As part of this summer school, IFDS will host a Machine Learning (ML) Day, with a focus on ML applications of Optimal Transport, that will include talks and a panel discussion. Details will be announced here later.

PIMS-IFDS-NSF Summer School on Optimal Transport poster

Wisconsin Announces Fall ’21 and Spring ’22 RAs

The UW-Madison IFDS site is funding several Research Assistants during Fall 2021 and Spring 2022 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.

Jillian Fisher

Jillian Fisher (Statistics) works with Zaid Harchaoui (Statistics) and Yejin Choi (Allen School for CSE) on dataset biases and generative modeling. She is also interested in nonparametric and semiparametric statistical inference.

Yifang Chen

Yifang Chen (CSE) is co-advised by Kevin Jamieson (CSE) and Simon Du (CSE). Yifang is an expert on algorithm design for active learning and multi-armed bandits in the presence of adversarial corruptions. Recently she has been working with Kevin and Simon on sample-efficient methods for multi-task learning.

Guanghao Ye

Guanghao Ye (CSE) works with Yin Tat Lee (CSE) and Dmitriy Drusvyatskiy (Mathematics) on designing faster algorithms for non-smooth non-convex optimization. He is interested in convex optimization. He developed the first nearly linear time algorithm for linear programs with small treewidth.

Josh Cutler

Josh Cutler (Mathematics) works with Dmitriy Drusvyatskiy (Mathematics) and Zaid Harchaoui (Statistics) on stochastic optimization for data science. His most recent work has focused on designing and analyzing algorithms for stochastic optimization problems with data distributions that evolve in time. Such problems appear routinely in machine learning and signal processing. 

Romain Camilleri

Romain Camilleri (CSE) works with Kevin Jamieson (CSE), Maryam Fazel (ECE), and Lalit Jain (Foster School of Business). He is interested in robust high dimensional experimental design with adversarial arrivals. He also collaborates with IFDS student Zhihan Xiong (CSE).

Yue Sun

Yue Sun (ECE) works with Maryam Fazel (ECE) and Mehran Mesbahi (Aeronautics), as well as collaborator Prof. Samet Oymak (UC Riverside). Yue is currently interested in two topics: (1) understanding the role of over-parametrization in meta-learning, and (2) gradient-based policy update methods for control, which connect the two fields of control theory and reinforcement learning.

Lang Liu

Lang Liu (Statistics) works with Zaid Harchaoui (Statistics) and Soumik Pal (Mathematics – Kantorovich Initiative) on statistical hypothesis testing and regularized optimal transport. He is also interested in statistical change detection and natural language processing. 

IFDS-MADLab Workshop

Statistical Approaches to Understanding Modern ML Methods

Aug 2-4, 2021
University of Wisconsin–Madison

When we use modern machine learning (ML) systems, the output often consists of a trained model with good performance on a test dataset. This satisfies some of our goals in performing data analysis, but leaves many unaddressed — for instance, we may want to build an understanding of the underlying phenomena, to provide uncertainty quantification about our conclusions, or to enforce constraints of safety, fairness, robustness, or privacy. As an example, classical statistical methods for quantifying a model’s variance rely on strong assumptions about the model — assumptions that can be difficult or impossible to verify for complex modern ML systems such as neural networks. 

This workshop will focus on using statistical methods to understand, characterize, and design ML models — for instance, methods that probe “black-box” ML models (with few to no assumptions) to assess their statistical properties, or tools for developing likelihood-free and simulation-based inference. Central themes of the workshop may include:

  • Using the output of a ML system to perform statistical inference, compute prediction intervals, or quantify measures of uncertainty

  • Using ML systems to test for conditional independence

  • Extracting interpretable information such as feature importance or causal relationships

  • Integrating likelihood-free inference with ML

  • Developing mechanisms for enforcing privacy, robustness, or stability constraints on the output of ML systems

  • Exploring connections to transfer learning and domain adaptation

  • Automated tuning of hyperparameters in black-box models and derivative-free optimization

Participants

Osbert Bastani
PAC prediction sets under distribution shift
Avrim Blum
Recovering from biased data: Can fairness constraints improve accuracy
Kamalika Chaudhuri
Statistical challenges in adversarially robust machine learning
Kyle Cranmer
Simulation-based inference: recent progress and open questions
Peng Ding
Model-assisted analyses of cluster-randomized experiments
Dylan Foster
From predictions to decisions: A black-box approach to the contextual bandit problem
Zaid Harchaoui
The statistical trade-offs of generative modeling with deep neural networks
Lucas Janson
Floodgate: Inference for variable importance with machine learning
Edward Kennedy
Optimal doubly robust estimation of heterogeneous causal effects
Lihua Lei
Conformalized survival analysis
Sharon Li
Uncovering the unknowns of deep neural networks: Challenges and opportunities
Po-Ling Loh
Robust W-GAN-based estimation under Wasserstein contamination
Aaditya Ramdas
A quick tour of distribution-free post-hoc calibration
Hanie Sedghi
The deep bootstrap framework: Good online learners are good offline generalizers
Ryan Tibshirani
Discrete splines: Another look at trend filtering and related problems
Vladimir Vovk
Conformal prediction, testing, and robustness
Yao Xie
Conformal prediction intervals for dynamic time series

Schedule

Slides

All slides for the workshop are included in the gallery below.

IFDS 2021 Summer School

Monday, July 26 – Friday, July 30

Speakers and Topics

Cécile Ané (Statistics & Botany, UW–Madison)
Stochastic processes and algorithms for phylogenetics
Rina Foygel Barber (Statistics, U of Chicago)
Lecture 1: Distribution-free inference: aims and algorithms
Lecture 2: Distribution-free inference: limits and open questions
Sebastien Bubeck (Microsoft Research)
Lecture 1: Reminders on neural networks and high-dimensional probability
Lecture 2: A universal law of robustness via isoperimetry
Ilias Diakonikolas (Computer Sciences, UW-Madison)
Algorithmic Robust Statistics in High Dimensions
Kevin Jamieson (Computer Science & Engineering, U Washington)
Experimental Design for Linear Dynamical Systems
Dimitris Papailiopoulos (Electrical & Computer Engineering, UW–Madison)
A Summer Catalogue of Lottery Prizes: Finding Everything Within Random Nets
Jose Israel Rodriguez (Mathematics, UW–Madison)
Sparse polynomial system solving and the method of moments
Karl Rohe (Statistics, UW–Madison)
Interpretable embeddings with Varimax for graphs, text, and other things
+Lab session
Miaoyan Wang (Statistics, UW–Madison)
Statistical foundations and algorithms for tensor learning

Schedule

(all times CDT)

Click speaker name for video

Monday, July 26th
Time Speaker
10:30-11:30 Karl Rohe
1:00-2:00 Karl Rohe
2:30-3:30 Jose Rodriguez
4:00-5:00 Jose Rodriguez
5:15-6:30 Poster Session
Tuesday, July 27th
Time Speaker
10:30-11:30 Cécile Ané
1:00-2:00 Cécile Ané
2:30-3:30 Miaoyan Wang
4:00-5:00 Miaoyan Wang
5:15-6:30 Poster Session
Wednesday, July 28th
Time Speaker
10:30-11:30 Dimitris Papailiopoulos
1:00-2:00 Dimitris Papailiopoulos
2:30-3:30 Sebastien Bubeck
4:00-5:00 Sebastien Bubeck
Thursday, July 29th
Time Speaker
10:30-11:30 Kevin Jamieson
1:00-2:00 Kevin Jamieson
2:30-3:30 Ilias Diakonikolas
4:00-5:00 Ilias Diakonikolas
Friday, July 30th
Time Speaker
10:30-11:30 Rina Foygel Barber
1:00-2:00 Rina Foygel Barber

AI4All @ Washington

The UW instance of AI4ALL is an AI4ALL-sanctioned program run under the Taskar Center for Accessible Technology, an initiative of the UW Paul G. Allen School for Computer Science & Engineering. This year we were joined by faculty from IFDS at UW in teaching the summer cohort. More information at ai4all.cs.washington.edu (https://ai4all.cs.washington.edu/about-us/).

IFDS Summer School 2021: Registration Open!

The IFDS Summer School 2021, co-sponsored by MADLab, will take place on July 26-July 30, 2021. The summer school will introduce participants to a broad range of cutting-edge areas in modern data science with an eye towards fostering cross-disciplinary research. We have a fantastic lineup of lecturers:

* Cécile Ané (UW-Madison)
* Rina Foygel Barber (Chicago)
* Sebastien Bubeck (Microsoft Research)
* Ilias Diakonikolas (UW-Madison)
* Kevin Jamieson (Washington)
* Dimitris Papailiopoulos (UW-Madison)
* Jose Rodriguez (UW-Madison)
* Karl Rohe (UW-Madison)
* Miaoyan Wang (UW-Madison)

Registration for both virtual and in-person participation is now open. More details at: https://ifds.info/ifds-2021-summer-school/

Six Cluster Hires in Data Science at UW–Madison

IFDS is delighted to welcome six new faculty members at UW-Madison working in areas related to fundamental data science. All were hired as a result of a cluster hiring process proposed by the IFDS Phase 1 leadership at Wisconsin in 2017 and approved by UW-Madison leadership in 2019. Although the cluster hire was intended originally to hire only three faculty members in the TRIPODS areas of statistical, mathematical, and computational foundations of data science, three extra positions funded through other faculty lines were filled during the hiring process.

One of the new faculty members, Ramya Vinayak, joined the ECE Department in 2020. The other five will join in Fall 2021. They are:

Yudong Chen (Computer Science)
Sameer Deshpande (Statistics)
Yinqiu He (Statistics)
Hanbaek Lyu (Mathematics)
Qiaomin Xie (ISyE)
Ramya Vinayak (ECE)

These faculty will bring new strengths to IFDS and UW-Madison. We’re excited that they are joining us, and look forward to working with them in the years ahead!

Research Highlights: Robustness meets Algorithms

IFDS Affiliate Ilias Diakonikolas and collaborators recently published a survey article titled “Robustness Meets Algorithms” in the Research Highlights section of the Communications of the ACM. The article presents a high-level description of groundbreaking work by the authors, which developed the first robust learning algorithms for high-dimensional unsupervised learning problems, including for example robustly estimating the mean and covariance of a high-dimensional Gaussian (first published in FOCS’16/SICOMP’19). This work resolved a long-standing open problem in statistics and at the same time was the beginning of a new area, now known as “algorithmic robust statistics”. Ilias is currently finishing a book on the topic with Daniel Kane to be published by Cambridge University Press.

More Info…

TRIPODS PI meeting


A gathering of leaders of the institutes funded under NSF’s TRIPODS program since 2017

Venue: Zoom and gather.town.

Thu 6/10/21: 11:30-2 EDT and 3-5:30 EDT (5 hours)  •  Fri 6/11/21: 11:30-2 EDT and 3-5:30 EDT (5 hours)

Format
  • One 15-minute “research highlight” talk from each of the current 15 HDR Phase 1s and two such talks from the 2 Phase 2s. (13 mins + 2 min questions each). [6 hours approx]
    Each talk focuses on one particular research topic. It’s NOT a survey talk or overview of TRIPODS research at your institute – that belongs in the poster session.
  • Panel session on “major challenges in fundamental data science” with 5 panelists – 2 from the Phase 2s and 3 from Phase 1. [1 hour]

  • Panel of experts from industry on “adoption of fundamental data science methodology in industry – trends and needs” [1 hour]

  • Poster session: one per institute, sketching the range of activities undertaken by that institution, in research / education / outreach. [1.25 hours]  
    Posters will be available for browsing in gather.town throughout the meeting.

Panel Nominations:

If you wish to be a panelist in the “major challenges” panel, please submit a 200-word abstract to the meeting co-chairs by 5/20/21. (If we receive more nominations than there are slots, a selection process involving all currently-funded TRIPODS institutes will be conducted.)

Panelists in the “industry” panel should be industry affiliates of a currently funded TRIPODS institute, or possibly members of an institute with a joint academic-industry appointment. Please mail your suggestions to the organizers, with a few words on what topics the nominee could cover.

Poster Submission:

Please submit your poster (one per institute) by

June 7, 2021.

Schedule 

(all times EDT)

Thursday 6/10/2021

Time (EDT) Topic Presenter
11:30-12:00 Introduction & NSF Presentation Margaret Martonosi (CISE), Dawn Tilbury (ENG), Sean Jones (MPS)
12:00-12:15 Talk: IFDS (I) Jelena Diakonikolas
12:15-12:30 Talk: FODSI (1) Stefanie Jegelka
12:30-12:45 Talk: IDEAL Ali Vakilian
12:45-1:00 Break
1:00-1:15 Talk: GDSC Aaron Wagner
1:15-1:30 Talk: Tufts Tripods Misha Kilmer
1:30-1:45 Talk: Rutgers – DATA-INSPIRE Konstantin Mischaikow
1:45-2:00 Talk: UMass Patrick Flaherty
2:00-3:00 Lunch
3:00-4:00 PANEL: Major Challenges in Fundamental Data Science
4:00-4:15 Break
4:15-5:30 POSTER SESSION gather.town

List of Institutes: https://nsf-tripods.org

Friday 6/11/2021

Time (EDT) Topic Presenter
11:30-11:45 Talk: JHU
11:45-12:00 Talk: FODSI (II) Nika Haghtalab
12:00-12:15 Talk: IFDS (II) Kevin Jamieson
12:15-12:30 Talk: PIFODS Hamed Hassani
12:30-12:45 Break
12:45-1:00 Talk: TETRAPODS UC-Davis Rishi Chaudhuri
1:00-1:15 Talk: UIC Anastasios Sidiropoulos
1:15-1:30 Talk: D4 Iowa State Pavan Aduri
1:30-1:45 Talk: UIUC Semih Cayci
1:45-2:00 Talk: Duke Sayan Mukherjee
2:00-3:00 Lunch
3:00-3:15 Talk: UT-Austin Sujay Sanghavi
3:15-3:30 Talk: FIDS Simon Foucart
3:30-3:45 Talk: FINPenn Alejandro Ribeiro
3:45-4:45 PANEL: Industry Adoption of Fundamental Data Science

Rebecca Willett honored by SIAM

Rebecca Willett, IFDS PI at the University of Chicago, has been selected as a SIAM Fellow for 2021. She was recognized “For her contributions to mathematical foundations of machine learning, large-scale data science, and computational imaging.” https://www.siam.org/research-areas/detail/computational-science-and-numerical-analysis 
In addition, her achievements were spotlighted as part of SIAM’s Mathematics and Statistics Awareness Month: https://sinews.siam.org/Details-Page/honoring-dr-rebecca-willett

IFDS team analyzes COVID-19 spread in Dane and Milwaukee counties

A UW-Madison IFDS team (Nan Chen, Jordan Ellenberg, Xiao Hou and Qin Li), together with Song Gao, Yuhao Kang and Jingmeng Rao (UW-Madison, Geography), Kaiping Chen (UW-Madison, Life Sciences Communication) and Jonathan Patz (Global Health Institute), recently studied the COVID-19 spreading pattern and its correlation with business foot traffic, race and ethnicity, and age structure of subregions within Dane and Milwaukee counties. The results are published in the Proceedings of the National Academy of Sciences of the United States of America (https://www.pnas.org/content/118/24/e2020524118).

A human mobility flow-augmented stochastic SEIR model was developed. Combining the model with data assimilation and machine learning techniques, the team reconstructed the historical growth trajectories of COVID-19 infection in both counties. The results reveal different types of spatial heterogeneities (e.g., varying peak infection timing in different subregions) even within a county, suggesting that a regionalization-based policy (e.g., testing and vaccination resource allocation) is necessary to mitigate the spread of COVID-19 and to prepare for future epidemics.
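For readers unfamiliar with the model class, here is a minimal sketch (in Python, assuming numpy; all parameter values are illustrative) of a stochastic SEIR update for a single subregion. The published model additionally couples subregions through human mobility flows and is fit with data assimilation, both of which are omitted here.

import numpy as np

def seir_step(S, E, I, R, beta, sigma, gamma, dt, rng):
    # One stochastic update: infection S->E, incubation E->I, recovery I->R.
    N = S + E + I + R
    new_E = min(S, rng.poisson(beta * S * I / N * dt))
    new_I = min(E, rng.poisson(sigma * E * dt))
    new_R = min(I, rng.poisson(gamma * I * dt))
    return S - new_E, E + new_E - new_I, I + new_I - new_R, R + new_R

rng = np.random.default_rng(0)
S, E, I, R = 500_000, 100, 50, 0  # illustrative initial conditions
for day in range(120):
    S, E, I, R = seir_step(S, E, I, R, beta=0.3, sigma=0.2, gamma=0.1,
                           dt=1.0, rng=rng)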


Data science workshops co-organized by IFDS members


IFDS members have been active leaders in the data science community over the past year, organizing workshops with participants reflecting all core TRIPODS disciplines, covering a broad range of career stages, and advancing the participation of women. IFDS executive committee member Rebecca Willett co-organized a workshop on the Multifaceted Complexity of Machine Learning at the NSF Institute of Mathematical and Statistical Innovation, which featured IFDS faculty Po-Ling Loh and Jelena Diakonikolas as speakers. With the support of the Institute for Advanced Study Women and Mathematics Ambassador program, IFDS postdoc Xiaoxia (Shirley) Wu organized the Women in Theoretical Machine Learning Symposium, which again featured IFDS faculty Po-Ling Loh and Jelena Diakonikolas. Finally, IFDS faculty Rebecca Willett and Mary Silber co-organized the Graduate Research Opportunities for Women 2020 conference, which is aimed at female-identified undergraduate students who may be interested in pursuing a graduate degree in the mathematical sciences. The conference is open to undergraduates from U.S. colleges and universities, including international students. IFDS faculty members Rina Barber and Rebecca Willett were featured speakers.

Wisconsin funds 8 Summer RAs

For the first time, IFDS @ Wisconsin is sponsoring a Summer RA program in 2021. As with RAs sponsored under the usual Fall and Spring semester programs, Summer 2021 RAs will be working with members of IFDS at Wisconsin and elsewhere on fundamental data science research. They will also be participating and assisting with IFDS’s activities during the summer, including the Summer School and Workshop to be held in Madison in July and August 2021.

The Summer 2021 RAs are:

Xufeng Cai (Computer Sciences), advised by Jelena Diakonikolas
Changhun Jo (Mathematics), advised by Kangwook Lee
Chanwoo Lee (Statistics), advised by Miaoyan Wang
Thanasis Pittas (Computer Sciences), advised by Ilias Diakonikolas
Ben Teo (Statistics), advised by Cecile Ane
Liu Yang (Computer Sciences), advised by Rob Nowak
Shuqi Yu (Mathematics), advised by Sebastien Roch
Xuezhou Zhang (Computer Sciences), advised by Jerry Zhu

IFDS team of RAs and Affiliates wins Best Student Paper Award from the American Statistical Association

IFDS Wisc RA Rungang Han, in joint work with IFDS RA Yuetian Luo and Affiliates Anru Zhang and Miaoyan Wang, has been awarded a Best Student Paper Award from the Statistical Learning and Data Science Section of the American Statistical Association for the paper “Exact Clustering in Tensor Block Model: Statistical Optimality and Computational Limit.” Rungang will present the paper in August at the Joint Statistical Meetings (JSM) 2021: https://community.amstat.org/slds/awards/student-paper-award
This paper develops an efficient and statistically optimal two-stage algorithm for multiway high-order clustering, motivated by multi-tissue gene expression analysis and dynamic network analysis. High-order clustering aims to identify heterogeneous substructure in the multiway datasets that arise commonly in neuroimaging, genomics, and social network studies. The non-convex and discontinuous nature of the problem poses significant challenges in both statistics and computation. This work proposes a tensor block model and two computationally efficient methods, the high-order Lloyd algorithm and high-order spectral clustering, for high-order clustering in the tensor block model. The convergence of the proposed procedures is established, and the methods are proved to achieve exact clustering under mild assumptions. This work also gives a complete characterization of the statistical-computational trade-off in high-order clustering, based on three different signal-to-noise-ratio regimes. The proposed procedures are evaluated via extensive experiments on both synthetic datasets and real data examples in flight route networks and online click-through prediction.
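To give a feel for the approach, here is a minimal sketch (in Python, assuming numpy and scikit-learn; the synthetic data and parameters are illustrative) of the spectral-initialization idea on a 3-way tensor block model. The paper's method refines such an initialization with a high-order Lloyd step, which is omitted here.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
k, n = 3, 60
labels = rng.integers(k, size=n)   # ground-truth cluster assignments
core = rng.normal(size=(k, k, k))  # block means of the tensor
T = core[np.ix_(labels, labels, labels)] + 0.5 * rng.normal(size=(n, n, n))

# Mode-1 unfolding, leading singular vectors, then k-means on the rows.
U, _, _ = np.linalg.svd(T.reshape(n, -1), full_matrices=False)
est = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U[:, :k])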

Data Science Day 2021

Uses and Abuses of Data in Higher Education
May 14, 2021
8:45 am – 3:00 pm
Virtual Event via Zoom with Opportunities for Engagement
Details and registration: https://citl.ucsc.edu/data-science-day-2021/

Washington adds 6 RAs for the Spring 2021 term

The University of Washington site of IFDS is funding new Research Assistants in the Spring 2021 term to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, who are members of IFDS.

Jillian Fisher

Jillian Fisher (Statistics) works with Zaid Harchaoui (Statistics) and Yejin Choi (Allen School for CSE) on dataset biases and generative modeling. She is also interested in nonparametric and semiparametric statistical inference.

Yifang Chen

Yifang Chen (CSE) is co-advised by Kevin Jamieson (CSE) and Simon Du (CSE). Yifang is an expert on algorithm design for active learning and multi-armed bandits in the presence of adversarial corruptions. Recently she has been working with Kevin and Simon on sample-efficient methods for multi-task learning.

Guanghao Ye

Guanghao Ye (CSE) works with Yin Tat Lee (CSE) and Dmitriy Drusvyatskiy (Mathematics) on designing faster algorithms for non-smooth non-convex optimization. He is interested in convex optimization. He developed the first nearly linear time algorithm for linear programs with small treewidth.

Josh Cutler

Josh Cutler (Mathematics) works with Dmitriy Drusvyatskiy (Mathematics) and Zaid Harchaoui (Statistics) on stochastic optimization for data science. His most recent work has focused on designing and analyzing algorithms for stochastic optimization problems with data distributions that evolve in time. Such problems appear routinely in machine learning and signal processing. 

Romain Camilleri

Romain Camilleri (CSE) works with Kevin Jamieson (CSE), Maryam Fazel (ECE), and Lalit Jain (Foster School of Business). He is interested in robust high dimensional experimental design with adversarial arrivals. He also collaborates with IFDS student Zhihan Xiong (CSE).

Yue Sun

Yue Sun (ECE) works with Maryam Fazel (ECE) and Mehran Mesbahi (Aeronautics), as well as collaborator Prof. Samet Oymak (UC Riverside). Yue is currently interested in two topics: (1) understanding the role of over-parametrization in meta-learning, and (2) gradient-based policy update methods for control, which connect the two fields of control theory and reinforcement learning.

Wisconsin funds 7 RAs for the Spring 21 Semester

The UW-Madison site of IFDS is funding several Research Assistants during Spring semester 2021 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.

Shi Chen

Shi Chen (Mathematics), advised by Qin Li (Mathematics) and Stephen J. Wright (Computer Science), works on problems in the interdisciplinary area of applied math and machine learning. He is interested in connecting machine learning to various mathematical physics problems, including the homogenization of PDEs and inverse problems. His recent work focuses on applying deep learning to PDE inverse problems.

Jeffrey Covington

Jeffrey Covington (Mathematics), advised by Nan Chen (Mathematics) and Sebastien Roch (Mathematics), works on data assimilation for model state estimation and prediction. His primary focus is on models with nonlinear and non-Gaussian features, which present problems for traditional data assimilation techniques. Currently he is working on developing techniques for Lagrangian data assimilation problems, which typically involve high dimensionality and strong nonlinear interactions.

Changhun Jo

Changhun Jo (Mathematics), advised by Kangwook Lee (Electrical and Computer Engineering) and Sebastien Roch (Mathematics), is working on the theoretical understanding of machine learning. His recent work focuses on finding an optimal data poisoning algorithm against a fairness-aware learner. He also works on finding the fundamental limit on sample complexity of matrix completion in the presence of graph side information.

Liu Yang

Liu Yang (Computer Sciences), advised by Robert Nowak, Dimitris Papailiopoulos, and Kangwook Lee (Electrical and Computer Engineering), works at the intersection of machine learning and deep learning. Currently, she is working on the streaming model selection problem under limited memory resources.

Xuezhou Zhang

Xuezhou Zhang (Computer Sciences), advised by Jerry Zhu (Computer Sciences) and Kevin Jamieson (U of Washington), works on adaptivity and robustness in sequential decision making. His recent work focuses on designing a reinforcement learning framework that learns from diverse sources of teaching signals and learns robustly in the presence of data corruption.

Shashank Rajput

Shashank Rajput (Computer Science), advised by Dimitris Papailiopoulos (Electrical and Computer Engineering), Kangwook Lee (Electrical and Computer Engineering), and Stephen Wright (Computer Science), works on problems in distributed machine learning and optimization. He develops techniques that are fast and scalable enough for practical use and that come with theoretical guarantees. Currently, he is working on finding shuffling mechanisms that beat random reshuffling, as well as on algorithms for training neural networks by only pruning them.
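
For context on the terminology, here is a small hypothetical sketch (toy least-squares data and step sizes of our own choosing, not code from this research) of the two baseline sampling schemes that work on shuffling compares against: with-replacement sampling and random reshuffling.

```python
import numpy as np

# Toy least-squares comparison of two SGD sampling schemes:
#  - with replacement: draw each example index i.i.d. uniformly
#  - random reshuffling: shuffle the data once per epoch, then sweep
# "Beating random reshuffling" means designing example orders that
# converge faster still; this sketch only sets up the baselines.

rng = np.random.default_rng(1)
n, d = 200, 10
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
y = X @ w_star

def sgd(next_order, epochs=30, lr=0.01):
    w = np.zeros(d)
    for _ in range(epochs):
        for i in next_order():
            w -= lr * (X[i] @ w - y[i]) * X[i]  # single-example gradient step
    return np.linalg.norm(w - w_star)

print("with replacement  :", sgd(lambda: rng.integers(0, n, size=n)))
print("random reshuffling:", sgd(lambda: rng.permutation(n)))
```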

Shuqi Yu

Shuqi Yu (Mathematics) is advised by Sebastien Roch (Mathematics) and works with Karl Rohe (Statistics) on large-scale network models. She aims to establish theoretical guarantees for a new estimator of the number of communities in a stochastic blockmodel. She is also interested in phylogenetics; in particular, she works on the identifiability of the species phylogeny under a horizontal gene transfer model.

Washington adds 6 RAs to Winter 2021 term

The University of Washington site of IFDS is funding six Research Assistants in Winter quarter 2021 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, who are members of IFDS.

Kristof Glauninger

Kristof Glauninger (Statistics) works with Zaid Harchaoui (Statistics), Virginia E. Armbrust (Oceanography) and François Ribalet (Oceanography) on statistical modeling for marine ecology. He focuses on statistical inference questions arising from phytoplankton population modeling. He is also interested in optimal transport and machine learning.

Alec Greaves-Tunnell

Alec Greaves-Tunnell (Statistics) works with Zaid Harchaoui (Statistics), Ali Shojaie (Biostatistics), and Azadeh Yazdan (Bioengineering) on distributionally robust learning for brain science and engineering. He is also interested in sequence models and time series in general, with applications to language processing and music analysis.

Adhyyan Narang

Adhyyan Narang (ECE) works with Maryam Fazel (ECE) and Lillian Ratliff (ECE). So far, he has worked to provide theoretical answers to foundational questions in learning from data, such as the generalization of overparameterized models and robustness to adversarial examples. More recently, he has become interested in providing guarantees for optimization in uncertain online environments in the presence of other agents.

Swati Padmanabhan

Swati Padmanabhan (ECE) works with Yin Tat Lee (Computer Science and Engineering) and Maryam Fazel (ECE) on designing faster algorithms for the optimal design problem. She is also interested in semidefinite programming in general, and has developed several semidefinite programming algorithms for different settings, each the fastest known in its setting.

Omid Sadeghi

Omid Sadeghi (ECE) works with Maryam Fazel (ECE) and Lillian Ratliff (ECE). He is interested in the design and analysis of online optimization algorithms with budget constraints and with stochastic or adversarial inputs. His work includes online resource allocation with submodular utility functions.

Zhihan Xiong

Zhihan Xiong (Computer Science and Engineering) works with Maryam Fazel (Electrical and Computer Engineering) and Kevin Jamieson (Computer Science and Engineering), as well as Lalit Jain (Foster School of Business). His current work addresses optimal experimental design in a streaming setting where the measurement budget is limited and the quality of measurements varies unpredictably over time.

Steve Wright and collaborators win NeurIPS Test of Time Award 2020

Stephen Wright and three colleagues were announced winners of the Test of Time Award at the 2020 Conference on Neural Information Processing Systems (NeurIPS). The award is for the paper judged most influential from NeurIPS 2009, 2010, and 2011.

Wright and his coauthors received the award for their 2011 paper “Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent.” The paper proposed an alternative way to implement Stochastic Gradient Descent (SGD) without any locking of memory access, an approach that “outperforms alternative schemes that use locking by an order of magnitude.” SGD is the algorithm that drives many machine learning systems.
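
To give a flavor of the idea, here is a minimal illustrative sketch in Python (the toy data and parameter choices are our own assumptions, not the authors' code): multiple threads update a shared weight vector with no locks, relying on the sparsity of individual examples to keep overlapping writes rare.

```python
import numpy as np
from threading import Thread

# Minimal sketch of the Hogwild! idea, not the paper's code: several threads
# run SGD on a shared weight vector with no locks or atomic updates. When
# each example touches few coordinates, colliding writes are rare.
# (CPython's GIL limits true parallelism, so this only illustrates the
# update rule; the paper targets genuinely concurrent shared-memory hardware.)

rng = np.random.default_rng(0)
n, d = 2000, 50
X = rng.normal(size=(n, d)) * (rng.random((n, d)) < 0.1)  # sparse examples
w_true = rng.normal(size=d)
y = X @ w_true

w = np.zeros(d)  # shared parameters, written without synchronization

def worker(indices, lr=0.05):
    for i in indices:
        xi = X[i]
        err = xi @ w - y[i]          # residual for the squared loss
        nz = np.nonzero(xi)[0]       # only coordinates this example uses
        w[nz] -= lr * err * xi[nz]   # lock-free sparse update

threads = [Thread(target=worker, args=(rng.permutation(n),)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("||w - w_true|| =", np.linalg.norm(w - w_true))
```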

See this [12-minute talk](https://youtu.be/c5T7600RLPc) by Chris Ré about the paper.

Postdoctoral fellows at IFDS@UW-Madison

The IFDS at UW-Madison is seeking applicants for one or two postdoctoral positions. Successful candidates will conduct research with members of the IFDS, both at UW-Madison and at its partner sites, and will be based at the Wisconsin Institute for Discovery at UW-Madison (https://wid.wisc.edu/). A unique benefit of this position is the rich set of collaborative opportunities available, allowing truly interdisciplinary training. The initial appointment is for one year, with the possibility of renewal. Travel funds will be provided. The ideal candidate will have a PhD in computer science, statistics, mathematics, engineering, or a related field, with expertise in data science. Additional desirable qualities include the ability to work effectively both independently and in a team, good communication skills, and a record of interdisciplinary collaborations.
To apply: https://drive.google.com/file/d/1LgeZ7fZ1cg0icuohCGAWxZlz8T9Jj66M/view?usp=sharing

IFDS Postdoctoral Fellow Positions Now Available at the University of Washington

The newly-founded NSF Institute for Foundations of Data Science (IFDS) at the University of Washington (UW), Seattle, is seeking applications for one or more Postdoctoral Fellow positions. This is an ideal position for candidates interested in interdisciplinary research under the supervision of at least two faculty members of IFDS, which brings together researchers at the interface of mathematics, statistics, theoretical computer science, and electrical engineering. IFDS at UW operates in partnership with groups at the University of Wisconsin-Madison, the University of California at Santa Cruz, and the University of Chicago, and is supported by the NSF TRIPODS program. A unique benefit is the rich set of collaborative opportunities available, allowing truly interdisciplinary training. Initial appointment is for one year, with the possibility of renewal. Appropriate travel funds will be provided.

To read more and to apply, please see http://ads-institute.uw.edu/IFDS/postdoc.htm

Cluster faculty positions at UW-Madison

The University of Wisconsin-Madison seeks to hire several faculty members with research and teaching interests in the foundations of data science. These positions are part of a campus-wide cluster initiative to expand and broaden expertise in the fundamentals of data science at UW-Madison. Three faculty positions will be filled in the cluster, which will build a world-class core of interdisciplinary strength in the mathematical, statistical, and algorithmic foundations of data science. This hiring initiative involves the Departments of Mathematics, Statistics, Computer Sciences, Electrical and Computer Engineering, Industrial and Systems Engineering, and the Wisconsin Institute for Discovery. It is anticipated that successful candidates will develop strong collaborations with existing data science research programs at UW-Madison — including IFDS — and with external research centers. To apply, follow the links below:
Mathematics: https://www.mathjobs.org/jobs/list/16747
Statistics: https://jobs.hr.wisc.edu/en-us/job/506886/assistant-professor-associate-professor-professor-in-statistics-cluster-hire
Engineering: https://jobs.hr.wisc.edu/en-us/job/506918/professor-cluster-hire

Wisconsin Adds Eight RAs for IFDS

The UW-Madison site of IFDS is funding eight Research Assistants in the Fall 2020 semester to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.

Read about the Fall 2020 RAs at https://ifds.info/fall-20-wisconsin-ras/

Changhun Jo

Changhun Jo (Mathematics), advised by Kangwook Lee (Electrical and Computer Engineering) and Sebastien Roch (Mathematics), is working on the theoretical understanding of machine learning. His recent work focuses on finding an optimal data poisoning algorithm against a fairness-aware learner. He also works on finding the fundamental limit on sample complexity of matrix completion in the presence of graph side information.

Lorenzo Najt

Lorenzo Najt (Mathematics), advised by Jordan Ellenberg (Mathematics), investigates algorithmic and inferential questions related to ensemble analysis of gerrymandering. Related to this, he is interested in sampling connected graph partitions, and is working with Michael Newton (Statistics) to investigate applications to hypothesis testing with spatial data. He is also working on some questions about the fixed-parameter tractability of polytope algorithms.

Yuetian Luo

Yuetian Luo is working with Anru Zhang (Statistics) and Yingyu Liang (Computer Science) on non-convex optimization problems involving low-rank matrices and on implicit regularization for non-convex optimization methods. This work develops fast algorithms, based on a new sketching scheme for high-dimensional problems, with provably faster convergence guarantees than common first-order methods. He is also interested in how implicit regularization differs across non-convex optimization methods and initialization schemes.

Yingda Li

Yingda Li (Mathematics) is working with Nan Chen (Mathematics) and Sebastien Roch (Mathematics) on uncertainty quantification and data assimilation. She aims to develop statistically accurate algorithms for solving high-dimensional nonlinear and non-Gaussian dynamical systems. She also works on predicting complex nonlinear turbulent dynamical systems with imperfect models and insufficient training data.

Mehmet Furkan Demirel

Mehmet Furkan Demirel (Computer Sciences), advised by Yingyu Liang (Computer Sciences) and Dimitris Papailiopoulos (Electrical and Computer Engineering), focuses on molecule property prediction and construction of proper representations of molecules for learning algorithms. He also works on representation learning for graph-structured data and graph neural networks.

Cora Allen-Savietta

Cora Allen-Savietta (Statistics) works with Cécile Ané (Statistics, Botany) and Sebastien Roch (Mathematics) to develop efficient methods to infer evolutionary history and identify ancient hybridizations. As the challenge of this work is to reconstruct ancestry using only modern data, she explores the identifiability of a network’s topology from sequence data only at the leaves. She is also expanding tree rearrangements and existing search strategies to explore the large and discrete space of semi-directed networks. Her work has applications to understanding the history of eukaryotes, viruses, languages, and more.

Shashank Rajput

Shashank Rajput (Computer Science), advised by Dimitris Papailiopoulos (Electrical and Computer Engineering), Kangwook Lee (Electrical and Computer Engineering), and Stephen Wright (Computer Science), works on problems in distributed machine learning and optimization. He develops techniques that are fast and scalable enough for practical use and that come with theoretical guarantees. Currently, he is working on finding shuffling mechanisms that beat random reshuffling, as well as on algorithms for training neural networks by only pruning them.

Shuqi Yu

Shuqi Yu (Mathematics) is advised by Sebastien Roch (Mathematics) and works with Karl Rohe (Statistics) on large-scale network models. She aims to establish theoretical guarantees for a new estimator of the number of communities in a stochastic blockmodel. She is also interested in phylogenetics; in particular, she works on the identifiability of the species phylogeny under a horizontal gene transfer model.

Yin Tat Lee earns Packard Fellowship

Each year, the David and Lucile Packard Foundation bestows this prestigious recognition upon a small number of early-career scientists and engineers who are at the leading edge of their respective disciplines. Lee is among just 20 researchers nationwide — and one of only two in the Computer & Information Sciences category — to be chosen as members of the 2020 class of fellows.

Read the full article at: http://ads-institute.uw.edu/IFDS/news/2020/10/15/packard/

UW launches Institute for Foundations of Data Science

The University of Washington will lead a team of institutions in establishing an interdisciplinary research institute that brings together mathematicians, statisticians, computer scientists and engineers to develop the theoretical foundations of a fast-growing field: data science.

https://www.washington.edu/news/2020/09/01/uw-launches-institute-for-foundations-of-data-science/

UChicago Joins Three Universities in Institute for Foundations of Data Science

The Transdisciplinary Research In Principles Of Data Science (TRIPODS) program of the National Science Foundation supports research and training that bridges mathematics, statistics, and theoretical computer science. In its first round of Phase II funding, the TRIPODS program awarded $12.5 million to the Institute for Foundations of Data Science (IFDS), a four-university collaboration among the Universities of Washington, Wisconsin-Madison, California Santa Cruz, and Chicago.
https://computerscience.uchicago.edu/news/article/tripods-ifds/

NSF TRIPODS Phase II Award

The University of Washington (UW) will lead a team of institutions in establishing an interdisciplinary research institute that brings together mathematicians, statisticians, computer scientists and engineers to develop the theoretical foundations of a fast-growing field: data science.

The Institute for Foundations of Data Science (IFDS) is a collaboration between the UW and the Universities of Wisconsin-Madison, California Santa Cruz, and Chicago, with a mission to develop a principled approach to the analysis of ever-larger, more complex and potentially biased data sets that play an increasingly important role in industry, government and academia.

Support for the IFDS comes from a $12.5 million grant from the National Science Foundation and its Transdisciplinary Research in Principles of Data Science, or TRIPODS, program. Today, the NSF named IFDS as one of two institutes nationwide receiving the first TRIPODS Phase II awards. TRIPODS is tied to the NSF’s Harnessing the Data Revolution (HDR) program, which aims to accelerate discovery and innovation in data science algorithms, data cyberinfrastructure and education and workforce development.

The five-year funding plan for the IFDS Phase II includes support for new research projects, workshops, a partnership across the four research sites and students and postdoctoral scholars co-advised by faculty from different fields. Plans for education and outreach will draw on previous experience of IFDS members and leverage institutional resources at all four sites.

Maryam Fazel (PI) becomes first recipient of the Moorthy Family Inspiration Career Development Professorship

In recognition of her outstanding, innovative work as a researcher and educator, Maryam Fazel (IFDS PI) was recently named the inaugural recipient of the Moorthy Family Inspiration Career Development Professorship. This generous endowment was established in 2019 by Ganesh and Hema Moorthy for the purposes of recruiting, rewarding and retaining UW ECE faculty members who have demonstrated a significant amount of promise early on in their careers.

