IFDS Summer School 2021: Registration Open!

The IFDS Summer School 2021, co-sponsored by MADLab, will take place on July 26-July 30, 2021. The summer school will introduce participants to a broad range of cutting-edge areas in modern data science with an eye towards fostering cross-disciplinary research. We have a fantastic lineup of lecturers:

* Cécile Ané (UW-Madison)
* Rina Foygel Barber (Chicago)
* Sebastien Bubeck (Microsoft Research)
* Ilias Diakonikolas (UW-Madison)
* Kevin Jamieson (Washington)
* Dimitris Papailiopoulos (UW-Madison)
* Jose Rodriguez (UW-Madison)
* Karl Rohe (UW-Madison)
* Miaoyan Wang (UW-Madison)

Registration for both virtual and in-person participation is now open. More details at: https://ifds.info/ifds-2021-summer-school/

Six Cluster Hires in Data Science at UW–Madison

IFDS is delighted to welcome six new faculty members at UW-Madison working in areas related to fundamental data science. All were hired as a result of a cluster hiring process proposed by the IFDS Phase 1 leadership at Wisconsin in 2017 and approved by UW-Madison leadership in 2019. Although the cluster hire was intended originally to hire only three faculty members in the TRIPODS areas of statistical, mathematical, and computational foundations of data science, three extra positions funded through other faculty lines were filled during the hiring process.

One of the new faculty members, Ramya Vinayak, joined the ECE Department in 2020. The other five will join in Fall 2021. They are:

* Yudong Chen (Computer Science)
* Sameer Deshpande (Statistics)
* Yinqiu He (Statistics)
* Hanbaek Lyu (Mathematics)
* Qiaomin Xie (ISyE)
* Ramya Vinayak (ECE)

These faculty will bring new strengths to IFDS and UW-Madison. We’re excited that they are joining us, and look forward to working with them in the years ahead!

Research Highlights: Robustness meets Algorithms

IFDS Affiliate Ilias Diakonikolas and collaborators recently published a survey article titled “Robustness Meets Algorithms” in the Research Highlights section of the Communications of the ACM. The article presents a high-level description of groundbreaking work by the authors, which developed the first robust learning algorithms for high-dimensional unsupervised learning problems, including, for example, robustly estimating the mean and covariance of a high-dimensional Gaussian (first published in FOCS’16/SICOMP’19). This work resolved a long-standing open problem in statistics and, at the same time, marked the beginning of a new area, now known as “algorithmic robust statistics”. Ilias is currently finishing a book on the topic with Daniel Kane, to be published by Cambridge University Press.

Rebecca Willett honored by SIAM

Rebecca Willett, IFDS PI at the University of Chicago, has been selected as a SIAM Fellow for 2021. She was recognized “For her contributions to mathematical foundations of machine learning, large-scale data science, and computational imaging.” https://www.siam.org/research-areas/detail/computational-science-and-numerical-analysis 
In addition, her achievements were spotlighted as part of SIAM’s Mathematics and Statistics Awareness Month: https://sinews.siam.org/Details-Page/honoring-dr-rebecca-willett

IFDS team analyzes COVID-19 spread in Dane and Milwaukee counties

A UW-Madison IFDS team (Nan Chen, Jordan Ellenberg, Xiao Hou and Qin Li), together with Song Gao, Yuhao Kang and Jingmeng Rao (UW-Madison, Geography), Kaiping Chen (UW-Madison, Life Sciences Communication) and Jonathan Patz (Global Health Institute), recently studied the spreading pattern of COVID-19 and its correlation with business foot traffic, race and ethnicity, and age structure in subregions of Dane and Milwaukee counties. The results are published in the Proceedings of the National Academy of Sciences of the United States of America (https://www.pnas.org/content/118/24/e2020524118).

The team developed a stochastic SEIR model augmented with human mobility flows. By combining the model with data assimilation and machine learning techniques, the team reconstructed the historical growth trajectories of COVID-19 infection in both counties. The results reveal different types of spatial heterogeneity (e.g., varying peak infection timing in different subregions) even within a county, suggesting that a regionalization-based policy (e.g., for allocating testing and vaccination resources) is necessary to mitigate the spread of COVID-19 and to prepare for future epidemics.
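For readers unfamiliar with this model class, here is a minimal sketch of a stochastic SEIR simulation in which regions are coupled through mobility. It is not the model from the PNAS paper: the mixing matrix, the binomial transition scheme, the parameter values, and the function name `simulate_seir` are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_seir(S0, E0, I0, R0, mixing, beta, sigma, gamma, days, seed=0):
    """Simulate a discrete-time stochastic SEIR model over several regions.

    mixing[k, j] is a hypothetical weight for how strongly region k is exposed
    to infections in region j (a stand-in for mobility flow data).
    """
    rng = np.random.default_rng(seed)
    S, E, I, R = (np.array(x, dtype=np.int64) for x in (S0, E0, I0, R0))
    N = S + E + I + R                      # region populations stay fixed
    history = [np.stack([S, E, I, R])]
    for _ in range(days):
        # Force of infection in each region, mixed across regions by mobility.
        force = beta * mixing @ (I / N)
        # Binomial draws keep every compartment a non-negative integer.
        new_E = rng.binomial(S, 1.0 - np.exp(-force))
        new_I = rng.binomial(E, 1.0 - np.exp(-sigma))
        new_R = rng.binomial(I, 1.0 - np.exp(-gamma))
        S = S - new_E
        E = E + new_E - new_I
        I = I + new_I - new_R
        R = R + new_R
        history.append(np.stack([S, E, I, R]))
    return np.array(history)               # shape: (days + 1, 4, n_regions)

# Illustrative values only: three regions, infection seeded in the first.
mixing = np.array([[0.80, 0.15, 0.05],
                   [0.10, 0.85, 0.05],
                   [0.05, 0.05, 0.90]])
history = simulate_seir(
    S0=[9990, 5000, 2000], E0=[0, 0, 0], I0=[10, 0, 0], R0=[0, 0, 0],
    mixing=mixing, beta=0.5, sigma=1 / 5, gamma=1 / 7, days=120,
)
peak_day = history[:, 2, :].argmax(axis=0)  # day of peak infections per region
```

Comparing `peak_day` across regions is one simple way to see the kind of within-county heterogeneity in peak timing described above.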

Data science workshops co-organized by IFDS members

Mary Silber

Po-Ling Loh

Jelena Diakonikolas

Rebecca Willett

Xiaoxia Wu

IFDS members have been active leaders in the data science community over the past year, organizing workshops with participants reflecting all core TRIPODS disciplines, covering a broad range of career stages, and advancing the participation of women. IFDS executive committee member Rebecca Willett co-organized a workshop on the Multifaceted Complexity of Machine Learning at the NSF Institute of Mathematical and Statistical Innovation, which featured IFDS faculty Po-Ling Loh and Jelena Diakonikolas as speakers. With the support of the Institute for Advanced Study Women and Mathematics Ambassador program, IFDS postdoc Xiaoxia (Shirley) Wu organized the Women in Theoretical Machine Learning Symposium, which again featured IFDS faculty Po-Ling Loh and Jelena Diakonikolas. Finally, IFDS faculty Rebecca Willett and Mary Silber co-organized the Graduate Research Opportunities for Women 2020 conference, which is aimed at female-identified undergraduate students who may be interested in pursuing a graduate degree in the mathematical sciences. The conference is open to undergraduates from U.S. colleges and universities, including international students. IFDS faculty members Rina Barber and Rebecca Willett were featured speakers.

Wisconsin funds 8 Summer RAs

For the first time, IFDS @ Wisconsin is sponsoring a Summer RA program in 2021. As with RAs sponsored under the usual Fall and Spring semester programs, Summer 2021 RAs will be working with members of IFDS at Wisconsin and elsewhere on fundamental data science research. They will also be participating in and assisting with IFDS’s activities during the summer, including the Summer School and Workshop to be held in Madison in July and August 2021.

The Summer 2021 RAs are:

* Xufeng Cai (Computer Sciences), advised by Jelena Diakonikolas
* Changhun Jo (Mathematics), advised by Kangwook Lee
* Chanwoo Lee (Statistics), advised by Miaoyan Wang
* Thanasis Pittas (Computer Sciences), advised by Ilias Diakonikolas
* Ben Teo (Statistics), advised by Cécile Ané
* Liu Yang (Computer Sciences), advised by Rob Nowak
* Shuqi Yu (Mathematics), advised by Sebastien Roch
* Xuezhou Zhang (Computer Sciences), advised by Jerry Zhu

IFDS team of RAs and Affiliates wins Best Student Paper Award from the American Statistical Association

IFDS Wisc RA Rungang Han, in joint work with IFDS RA Yuetian Luo and Affiliates Anru Zhang and Miaoyan Wang, has been awarded a Best Student Paper Award from the Statistical Learning and Data Science Section of the American Statistical Association for the paper “Exact Clustering in Tensor Block Model: Statistical Optimality and Computational Limit.” Rungang will present the paper in August at the Joint Statistical Meetings (JSM) 2021: https://community.amstat.org/slds/awards/student-paper-award
This paper develops an efficient and statistically optimal two-stage algorithm for multi-way high-order clustering, motivated by multi-tissue gene expression analysis and dynamic network analysis. High-order clustering aims to identify heterogeneous substructure in the multiway datasets that arise commonly in neuroimaging, genomics, and social network studies. The non-convex and discontinuous nature of the problem poses significant challenges in both statistics and computation. This work proposes a tensor block model and two computationally efficient methods for high-order clustering in that model: a high-order Lloyd algorithm and high-order spectral clustering. The convergence of the proposed procedure is established, and the method is proved to achieve exact clustering under some mild assumptions. This work also gives a complete characterization of the statistical-computational trade-off in high-order clustering based on three different signal-to-noise ratio regimes. The proposed procedures are evaluated via extensive experiments on both synthetic datasets and real data examples in flight route networks and online click-through prediction.

IFDS-MADLab Workshop

Statistical Approaches to Understanding Modern ML Methods

Aug 2-4, 2021
University of Wisconsin–Madison

When we use modern machine learning (ML) systems, the output often consists of a trained model with good performance on a test dataset. This satisfies some of our goals in performing data analysis, but leaves many unaddressed — for instance, we may want to build an understanding of the underlying phenomena, to provide uncertainty quantification about our conclusions, or to enforce constraints of safety, fairness, robustness, or privacy. As an example, classical statistical methods for quantifying a model’s variance rely on strong assumptions about the model — assumptions that can be difficult or impossible to verify for complex modern ML systems such as neural networks. 

This workshop will focus on using statistical methods to understand, characterize, and design ML models — for instance, methods that probe “black-box” ML models (with few to no assumptions) to assess their statistical properties, or tools for developing likelihood-free and simulation-based inference. Central themes of the workshop may include:

  • Using the output of an ML system to perform statistical inference, compute prediction intervals, or quantify measures of uncertainty

  • Using ML systems to test for conditional independence

  • Extracting interpretable information such as feature importance or causal relationships

  • Integrating likelihood-free inference with ML

  • Developing mechanisms for enforcing privacy, robustness, or stability constraints on the output of ML systems

  • Exploring connections to transfer learning and domain adaptation

  • Automated tuning of hyperparameters in black-box models and derivative-free optimization
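As one concrete illustration of the first theme (prediction intervals from a black-box model with few assumptions), the following is a minimal sketch of split conformal prediction. It is a standard textbook construction rather than material from the workshop, and the function names, the least-squares "black box", and the synthetic data are all assumptions made for the example.

```python
import math
import numpy as np

def split_conformal_interval(fit, predict, X_train, y_train,
                             X_cal, y_cal, X_new, alpha=0.1):
    """Return (lower, upper) bounds for X_new using split conformal prediction.

    `fit` and `predict` are treated as a black box; only held-out residuals
    are used to calibrate the interval width.
    """
    model = fit(X_train, y_train)
    scores = np.abs(y_cal - predict(model, X_cal))   # calibration residuals
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))             # finite-sample corrected rank
    q = np.inf if k > n else np.sort(scores)[k - 1]
    center = predict(model, X_new)
    return center - q, center + q

# Hypothetical black box: ordinary least squares.
def fit_ols(X, y):
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return X @ coef

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.standard_normal((3000, 5))
y = X @ np.arange(1.0, 6.0) + rng.standard_normal(3000)
X_train, y_train = X[:500], y[:500]
X_cal, y_cal = X[500:1000], y[500:1000]
X_test, y_test = X[1000:], y[1000:]

lower, upper = split_conformal_interval(
    fit_ols, predict_ols, X_train, y_train, X_cal, y_cal, X_test, alpha=0.1
)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
```

Under exchangeability of the calibration and test points, intervals built this way cover the true response with probability at least 1 - alpha on average, whatever the fitted model is; `coverage` gives an empirical check on the synthetic test set.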

Participants

* Demba Ba
* Peng Ding
* Edward Kennedy
* Aaditya Ramdas
* Avrim Blum
* Dylan Foster
* Lihua Lei
* Hanie Sedghi
* Kamalika Chaudhuri
* Zaid Harchaoui
* Sharon Li
* Vladimir Vovk
* Kyle Cranmer
* Lucas Janson
* Po-Ling Loh
* Yao Xie

TRIPODS PI meeting

A gathering of leaders of the institutes funded under NSF’s TRIPODS program since 2017

Venue: Zoom and gather.town.

Thu 6/10/21: 11:30-2 EDT and 3-5:30 EDT (5 hours)  •  Fri 6/11/21: 11:30-2 EDT and 3-5:30 EDT (5 hours)

Format
  • One 15-minute “research highlight” talk from each of the current 15 HDR Phase 1s and two such talks from each of the 2 Phase 2s (13 mins + 2 mins for questions each). [6 hours approx]
    Each talk focuses on one particular research topic. It’s NOT a survey talk or overview of TRIPODS research at your institute – that belongs in the poster session.
  • Panel session on “major challenges in fundamental data science” with 5 panelists – 2 from the Phase 2s and 3 from Phase 1. [1 hour]

  • Panel of experts from industry on “adoption of fundamental data science methodology in industry – trends and needs” [1 hour]

  • Poster session: one per institute, sketching the range of activities undertaken by that institution, in research / education / outreach. [1.25 hours]  
    Posters will be available for browsing in gather.town throughout the meeting.

Panel Nominations:

If you wish to be a panelist in the “major challenges” panel, please submit a 200-word abstract to the meeting co-chairs by 5/20/21. (If we receive more nominations than there are slots, a selection process involving all currently-funded TRIPODS institutes will be conducted.)

Panelists in the “industry” panel should be industry affiliates of a currently funded TRIPODS institute, or possibly members of an institute with a joint academic-industry appointment. Please mail your suggestions to the organizers, with a few words on what topics the nominee could cover.

Poster Submission:

Please submit your poster (one per institute) by June 7, 2021.

Schedule 

(all times EDT)

Thursday 6/10/2021

| Time (EDT) | Topic | Presenter |
| --- | --- | --- |
| 11:30-12:00 | Introduction & NSF Presentation | Margaret Martonosi (CISE), Dawn Tilbury (ENG), Sean Jones (MPS) |
| 12:00-12:15 | Talk: IFDS (I) | Jelena Diakonikolas |
| 12:15-12:30 | Talk: FODSI (I) | Stefanie Jegelka |
| 12:30-12:45 | Talk: IDEAL | Ali Vakilian |
| 12:45-1:00 | Break | |
| 1:00-1:15 | Talk: GDSC | Aaron Wagner |
| 1:15-1:30 | Talk: Tufts Tripods | Misha Kilmer |
| 1:30-1:45 | Talk: Rutgers – DATA-INSPIRE | Konstantin Mischaikow |
| 1:45-2:00 | Talk: UMass | Patrick Flaherty |
| 2:00-3:00 | Lunch | |
| 3:00-4:00 | PANEL: Major Challenges in Fundamental Data Science | |
| 4:00-4:15 | Break | |
| 4:15-5:30 | POSTER SESSION (gather.town) | |

List of Institutes: https://nsf-tripods.org

Friday 6/11/2021

| Time (EDT) | Topic | Presenter |
| --- | --- | --- |
| 11:30-11:45 | Talk: JHU | |
| 11:45-12:00 | Talk: FODSI (II) | Nika Haghtalab |
| 12:00-12:15 | Talk: IFDS (II) | Kevin Jamieson |
| 12:15-12:30 | Talk: PIFODS | Hamed Hassani |
| 12:30-12:45 | Break | |
| 12:45-1:00 | Talk: TETRAPODS UC-Davis | Rishi Chaudhuri |
| 1:00-1:15 | Talk: UIC | Anastasios Sidiropoulos |
| 1:15-1:30 | Talk: D4 Iowa State | Pavan Aduri |
| 1:30-1:45 | Talk: UIUC | Semih Cayci |
| 1:45-2:00 | Talk: Duke | Sayan Mukherjee |
| 2:00-3:00 | Lunch | |
| 3:00-3:15 | Talk: UT-Austin | Sujay Sanghavi |
| 3:15-3:30 | Talk: FIDS | Simon Foucart |
| 3:30-3:45 | Talk: FINPenn | Alejandro Ribeiro |
| 3:45-4:45 | PANEL: Industry Adoption of Fundamental Data Science | |

IFDS 2021 Summer School

Monday, July 26 – Friday, July 30

Speakers and Topics

* Cécile Ané (Statistics & Botany, UW–Madison)
  Stochastic processes and algorithms for phylogenetics
* Rina Foygel Barber (Statistics, U of Chicago)
  Lecture 1: Distribution-free inference: aims and algorithms
  Lecture 2: Distribution-free inference: limits and open questions
* Sebastien Bubeck (Microsoft Research)
  Lecture 1: Reminders on neural networks and high-dimensional probability
  Lecture 2: A universal law of robustness via isoperimetry
* Ilias Diakonikolas (Computer Sciences, UW-Madison)
  Algorithmic Robust Statistics in High Dimensions
* Kevin Jamieson (Computer Science & Engineering, U Washington)
  TBA
* Dimitris Papailiopoulos (Electrical & Computer Engineering, UW–Madison)
  A Summer Catalogue of Lottery Prizes: Finding Everything Within Random Nets
* Jose Israel Rodriguez (Mathematics, UW–Madison)
  Sparse polynomial system solving and the method of moments
* Karl Rohe (Statistics, UW–Madison)
  Interpretable embeddings with Varimax for graphs, text, and other things
  + Lab session
* Miaoyan Wang (Statistics, UW–Madison)
  Statistical foundations and algorithms for tensor learning

Schedule

(all times CDT)

Monday, July 26th
Time Person
10:30-11:30 Karl Rohe
1:00-2:00 Karl Rohe
2:30-3:30 Jose Rodriguez
4:00-5:00 Jose Rodriguez
5:15-6:15 Poster Session
Tuesday, July 27th
Time Person
10:30-11:30 Cécile Ané
1:00-2:00 Cécile Ané
2:30-3:30 Miaoyan Wang
4:00-5:00 Miaoyan Wang
Wednesday, July 28th
Time Person
10:30-11:30 Dimitris Papailiopoulos   
1:00-2:00 Dimitris Papailiopoulos   
2:30-3:30 Sebastien Bubeck
4:00-5:00 Sebastien Bubeck
Thursday, July 29th
Time Person
10:30-11:30 Ilias Diakonikolas
1:00-2:00 Ilias Diakonikolas  
2:30-3:30 Kevin Jamieson
4:00-5:00 Kevin Jamieson
Friday, July 30th
Time Person
10:30-11:30 Rina Foygel-Barber
1:00-2:00 Rina Foygel-Barber
2:30-3:30 Closing Event

Wisconsin funds 7 RAs for the Spring 21 Semester

The UW-Madison site of IFDS is funding several Research Assistants during Spring semester 2021 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.

Shi Chen

Shi Chen (Mathematics), advised by Qin Li (Mathematics) and Stephen J. Wright (Computer Science), works on problems in the interdisciplinary area of applied math and machine learning. He is interested in connecting machine learning to various mathematical physics problems, including the homogenization of PDEs and inverse problems. His recent work focuses on applying deep learning to PDE inverse problems.

Jeffrey Covington

Jeffrey Covington (Mathematics), advised by Nan Chen (Mathematics) and Sebastien Roch (Mathematics), works on data assimilation for model state estimation and prediction. His primary focus is on models with nonlinear and non-Gaussian features, which present problems for traditional data assimilation techniques. Currently he is working on developing techniques for Lagrangian data assimilation problems, which typically involve high-dimensionality and strong nonlinear interactions.

 

Changhun Jo

Changhun Jo (Mathematics), advised by Kangwook Lee (Electrical and Computer Engineering) and Sebastien Roch (Mathematics), is working on the theoretical understanding of machine learning. His recent work focuses on finding an optimal data poisoning algorithm against a fairness-aware learner. He also works on finding the fundamental limit on sample complexity of matrix completion in the presence of graph side information.

Liu Yang

Liu Yang (Computer Sciences), advised by Robert Nowak, Dimitris Papailiopoulos and Kangwook Lee (Electrical and Computer Engineering), works on the intersection of machine learning and deep learning. Currently, she is working on the streaming model selection problem under limited memory resources.

Xuezhou Zhang

Xuezhou Zhang (Computer Sciences), advised by Jerry Zhu (Computer Sciences) and Kevin Jamieson (U of Washington), works on adaptivity and robustness in sequential decision making. His recent work focuses on designing a reinforcement learning framework that learns from diverse sources of teaching signals and learns robustly in the presence of data corruption.

Shashank Rajput

Shashank Rajput (Computer Science), advised by Dimitris Papailiopoulos (Electrical and Computer Engineering), Kangwook Lee (Electrical and Computer Engineering) and Stephen Wright (Computer Science), works on problems in distributed machine learning and optimization. He works on developing techniques which are fast and scalable for practical use, and have theoretical guarantees. Currently, he is working on finding better shuffling mechanisms that beat random reshuffling as well as developing algorithms for training neural networks by only pruning them.

Shuqi Yu

Shuqi Yu (Mathematics) is advised by Sebastien Roch (Mathematics) and works with Karl Rohe (Statistics) on large-scale network models. She aims to establish theoretical guarantees for a new estimator of the number of communities in a stochastic blockmodel. She is also interested in phylogenetics questions; in particular, she works on the identifiability of the species phylogeny under a horizontal gene transfer model.

Washington adds 6 RAs to Winter 2021 term

The University of Washington site of IFDS is funding six Research Assistants in Winter quarter 2021 to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, who are members of IFDS.

Kristof Glauninger

Kristof Glauninger (Statistics) works with Zaid Harchaoui (Statistics), Virginia E. Armbrust (Oceanography) and François Ribalet (Oceanography) on statistical modeling for marine ecology. He focuses on statistical inference questions arising from phytoplankton population modeling. He is also interested in optimal transport and machine learning.

Alec Greaves-Tunnell

Alec Greaves-Tunnell (Statistics) works with Zaid Harchaoui (Statistics), Ali Shojaie (Biostatistics), and Azadeh Yazdan (Bioengineering) on distributionally robust learning for brain science and engineering. He is also interested in sequence models and time series in general, with applications to language processing and music analysis.

Adhyyan Narang

Adhyyan Narang (ECE) works with Maryam Fazel (ECE) and Lillian Ratliff (ECE). So far, he has worked to provide theoretical answers to foundational questions in learning from data, such as generalization of overparameterized models and robustness to adversarial examples. More recently, he has become interested in providing guarantees for optimization in uncertain online environments in the presence of other agents.

Swati Padmanabhan

Swati Padmanabhan (ECE) works with Yin Tat Lee (Computer Science and Engineering) and Maryam Fazel (ECE) on designing a faster algorithm for the optimal design problem. She is also interested in semidefinite programming in general. She has developed several semidefinite programming algorithms that are the fastest known in their respective settings.

Omid Sadeghi

Omid Sadeghi (ECE) works with Maryam Fazel (ECE) and Lillian Ratliff (ECE). He is interested in the design and analysis of online optimization algorithms with budget constraints, and with stochastic or adversarial inputs. His work includes online resource allocation with submodular utility functions.

Zhihan Xiong

Zhihan Xiong (Computer Science and Engineering) works with Maryam Fazel (Electrical and Computer Engineering), Kevin Jamieson (Computer Science and Engineering), and Lalit Jain (Business School). His current work addresses optimal experimental design in a streaming setting where the measurement budget is limited and the quality of measurements varies unpredictably over time.

Steve Wright and collaborators win NeurIPS Test of Time Award 2020

Stephen Wright and three colleagues were announced as winners of the Test of Time Award at the 2020 Conference on Neural Information Processing Systems (NeurIPS). The award is for the paper judged most influential from NeurIPS 2009, 2010, and 2011.

Wright and his coauthors received the award for their 2011 paper “Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent.” The paper proposed an alternative way to implement Stochastic Gradient Descent (SGD) without any locking of memory access, that “outperforms alternative schemes that use locking by an order of magnitude.” SGD is the algorithm that drives many machine learning systems.
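To give a feel for the idea, here is a minimal sketch of lock-free SGD in the Hogwild! style: several threads update one shared parameter vector on sparse examples without any locks. It is not the authors' implementation. The function name `hogwild_sgd`, the least-squares objective, the synthetic data, and the step size are assumptions for illustration, and because CPython's global interpreter lock serializes most of this work, the sketch shows the unsynchronized update pattern rather than a real parallel speedup.

```python
import threading
import numpy as np

def hogwild_sgd(X, y, n_threads=4, steps_per_thread=2500, lr=0.1, seed=0):
    """Run SGD for least squares with threads sharing `w` and taking no locks."""
    w = np.zeros(X.shape[1])                  # shared parameter vector

    def worker(thread_id):
        rng = np.random.default_rng(seed + thread_id)
        for _ in range(steps_per_thread):
            i = rng.integers(X.shape[0])      # pick a random training example
            nz = np.flatnonzero(X[i])         # coordinates this example touches
            err = X[i, nz] @ w[nz] - y[i]
            w[nz] -= lr * err * X[i, nz]      # unsynchronized sparse update

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w

# Synthetic sparse problem: each example touches 3 of 50 coordinates.
rng = np.random.default_rng(0)
n, d = 2000, 50
X = np.zeros((n, d))
for i in range(n):
    cols = rng.choice(d, size=3, replace=False)
    X[i, cols] = rng.choice([-1.0, 1.0], size=3)
w_true = rng.standard_normal(d)
y = X @ w_true

initial_mse = np.mean(y ** 2)                 # error of the all-zeros start
w_hat = hogwild_sgd(X, y)
final_mse = np.mean((X @ w_hat - y) ** 2)
```

Sparsity is what makes skipping the locks plausible: when each update touches only a few coordinates, two threads rarely write to the same entries at the same time.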

See this [12-minute talk](https://youtu.be/c5T7600RLPc) by Chris Re about the paper.

Lorenzo Najt

Lorenzo Najt (Mathematics), advised by Jordan Ellenberg (Math), investigates algorithmic and inferential questions related to ensemble analysis of gerrymandering. Related to this, he is interested in sampling connected graph partitions, and is working with Michael Newton (Statistics) to investigate applications to hypothesis testing with spatial data. He is also working on some questions about the fixed parameter tractability of polytope algorithms.

Yuetian Luo

Yuetian Luo is working with Anru Zhang (Statistics) and Yingyu Liang (Computer Science) on non-convex optimization problems involving low-rank matrices and on implicit regularization for non-convex optimization methods. The work develops novel fast algorithms with provably faster convergence guarantees than common first-order methods, based on a new sketching scheme for high-dimensional problems. It also investigates implicit regularization for different non-convex optimization methods and different initialization schemes.

Yingda Li

Yingda Li (Mathematics) is working with Nan Chen (Mathematics) and Sebastien Roch (Mathematics) on uncertainty quantification and data assimilation. She aims to develop statistically accurate algorithms for solving high-dimensional nonlinear and non-Gaussian dynamical systems. She also works on predicting complex nonlinear turbulent dynamical systems with imperfect models and insufficient training data.

Mehmet Furkan Demirel

Mehmet Furkan Demirel (Computer Sciences), advised by Yingyu Liang (Computer Sciences) and Dimitris Papailiopoulos (Electrical and Computer Engineering), focuses on molecule property prediction and construction of proper representations of molecules for learning algorithms. He also works on representation learning for graph-structured data and graph neural networks.

Cora Allen-Savietta

Cora Allen-Savietta (Statistics) works with Cécile Ané (Statistics, Botany) and Sebastien Roch (Mathematics) to develop efficient methods to infer evolutionary history and identify ancient hybridizations. As the challenge of this work is to reconstruct ancestry using only modern data, she explores the identifiability of a network’s topology from sequence data only at the leaves. She is also expanding tree rearrangements and existing search strategies to explore the large and discrete space of semi-directed networks. Her work has applications to understanding the history of eukaryotes, viruses, languages, and more.

Postdoctoral fellows at IFDS@UW-Madison

The IFDS at UW-Madison is seeking applicants for one or two postdoctoral positions. Successful candidates will conduct research with members of the IFDS, both at UW-Madison and its partner sites, and will be based at the Wisconsin Institute for Discovery at UW-Madison (https://wid.wisc.edu/). A unique benefit of this position is the rich set of collaborative opportunities available, allowing truly interdisciplinary training. Initial appointment is for one year, with the possibility of renewal. Travel funds will be provided. The ideal candidate will have a PhD in computer science, statistics, mathematics, engineering or a related field, with expertise in data science. Additional desirable qualities include the ability to work effectively both independently and in a team, good communication skills, and a record of interdisciplinary collaborations.
To apply: https://drive.google.com/file/d/1LgeZ7fZ1cg0icuohCGAWxZlz8T9Jj66M/view?usp=sharing

IFDS Postdoctoral Fellow Positions Now Available at U Wash

The newly-founded NSF Institute for Foundations of Data Science (IFDS) at the University of Washington (UW), Seattle, is seeking applications for one or more Postdoctoral Fellow positions. This is an ideal position for candidates interested in interdisciplinary research under the supervision of at least two faculty members of IFDS, which brings together researchers at the interface of mathematics, statistics, theoretical computer science, and electrical engineering. IFDS at UW operates in partnership with groups at the University of Wisconsin-Madison, the University of California at Santa Cruz, and the University of Chicago, and is supported by the NSF TRIPODS program. A unique benefit is the rich set of collaborative opportunities available, allowing truly interdisciplinary training. Initial appointment is for one year, with the possibility of renewal. Appropriate travel funds will be provided.

To read more and to apply, please see http://ads-institute.uw.edu/IFDS/postdoc.htm

Cluster faculty positions at UW-Madison

The University of Wisconsin-Madison seeks to hire several faculty members with research and teaching interests in the foundations of data science. These positions are part of a campus-wide cluster initiative to expand and broaden expertise in the fundamentals of data science at UW-Madison. Three faculty positions will be hired in the cluster, which will build a world-class core of interdisciplinary strength in the mathematical, statistical, and algorithmic foundations of data science. This hiring initiative involves the Departments of Mathematics, Statistics, Computer Sciences, Electrical and Computer Engineering, Industrial and Systems Engineering, and the Wisconsin Institute for Discovery. It is anticipated that successful candidates will develop strong collaborations with existing data science research programs at UW-Madison — including IFDS — and with external research centers. To apply, follow the links below:
Mathematics: https://www.mathjobs.org/jobs/list/16747
Statistics: https://jobs.hr.wisc.edu/en-us/job/506886/assistant-professor-associate-professor-professor-in-statistics-cluster-hire
Engineering: https://jobs.hr.wisc.edu/en-us/job/506918/professor-cluster-hire

Wisconsin Adds Eight RAs for IFDS

The UW-Madison site of IFDS is funding eight Research Assistants in the Fall semester to collaborate across disciplines on IFDS research. Each one is advised by a primary and a secondary adviser, all of them members of IFDS.

Read about the Fall 2020 RAs at https://ifds.info/fall-20-wisconsin-ras/

Changhun Jo

Changhun Jo (Mathematics), advised by Kangwook Lee (Electrical and Computer Engineering) and Sebastien Roch (Mathematics), is working on the theoretical understanding of machine learning. His recent work focuses on finding an optimal data poisoning algorithm against a fairness-aware learner. He also works on finding the fundamental limit on sample complexity of matrix completion in the presence of graph side information.

Lorenzo Najt

Lorenzo Najt (Mathematics), advised by Jordan Ellenberg (Mathematics), investigates algorithmic and inferential questions related to ensemble analysis of gerrymandering. Relatedly, he is interested in sampling connected graph partitions and is working with Michael Newton (Statistics) to investigate applications to hypothesis testing with spatial data. He is also working on questions about the fixed-parameter tractability of polytope algorithms.

Yuetian Luo

Yuetian Luo is working with Anru Zhang (Statistics) and Yingyu Liang (Computer Science) on non-convex optimization problems involving low-rank matrices and on implicit regularization for non-convex optimization methods. The work develops novel fast algorithms with provably faster convergence guarantees than common first-order methods. The algorithms are based on a new sketching scheme developed for high-dimensional problems. The team is also interested in investigating implicit regularization for different non-convex optimization methods and different initialization schemes.

Yingda Li

Yingda Li (Mathematics) is working with Nan Chen (Mathematics) and Sebastien Roch (Mathematics) on uncertainty quantification and data assimilation. She aims to develop statistically accurate algorithms for solving high-dimensional nonlinear and non-Gaussian dynamical systems. She also works on predicting complex nonlinear turbulent dynamical systems with imperfect models and insufficient training data.

Mehmet Furkan Demirel

Mehmet Furkan Demirel (Computer Sciences), advised by Yingyu Liang (Computer Sciences) and Dimitris Papailiopoulos (Electrical and Computer Engineering), focuses on molecule property prediction and construction of proper representations of molecules for learning algorithms. He also works on representation learning for graph-structured data and graph neural networks.

Cora Allen-Savietta

Cora Allen-Savietta (Statistics) works with Cécile Ané (Statistics, Botany) and Sebastien Roch (Mathematics) to develop efficient methods to infer evolutionary history and identify ancient hybridizations. As the challenge of this work is to reconstruct ancestry using only modern data, she explores the identifiability of a network’s topology from sequence data only at the leaves. She is also expanding tree rearrangements and existing search strategies to explore the large and discrete space of semi-directed networks. Her work has applications to understanding the history of eukaryotes, viruses, languages, and more.

Shashank Rajput

Shashank Rajput (Computer Science), advised by Dimitris Papailiopoulos (Electrical and Computer Engineering), Kangwook Lee (Electrical and Computer Engineering) and Stephen Wright (Computer Science), works on problems in distributed machine learning and optimization. He works on developing techniques which are fast and scalable for practical use, and have theoretical guarantees. Currently, he is working on finding better shuffling mechanisms that beat random reshuffling as well as developing algorithms for training neural networks by only pruning them.

Shuqi Yu

Shuqi Yu (Mathematics), advised by Sebastien Roch (Mathematics), is working with Karl Rohe (Statistics) on large-scale network models. She aims to establish theoretical guarantees for a new estimator of the number of communities in a stochastic blockmodel. She is also interested in phylogenetics questions; in particular, she works on the identifiability of the species phylogeny under a horizontal gene transfer model.

Yin Tat Lee earns Packard Fellowship

Each year, the David and Lucile Packard Foundation bestows this prestigious recognition upon a small number of early-career scientists and engineers who are at the leading edge of their respective disciplines. Lee is among just 20 researchers nationwide — and one of only two in the Computer & Information Sciences category — to be chosen as members of the 2020 class of fellows.

Read the full article at: http://ads-institute.uw.edu/IFDS/news/2020/10/15/packard/

UW launches Institute for Foundations of Data Science

The University of Washington will lead a team of institutions in establishing an interdisciplinary research institute that brings together mathematicians, statisticians, computer scientists and engineers to develop the theoretical foundations of a fast-growing field: data science.

https://www.washington.edu/news/2020/09/01/uw-launches-institute-for-foundations-of-data-science/

UChicago Joins Three Universities in Institute for Foundational Data Science

The Transdisciplinary Research In Principles Of Data Science (TRIPODS) program of the National Science Foundation supports this work by bridging these three disciplines for research and training. In its first round of Phase II funding, the TRIPODS program awarded $12.5 million to the Institute for Foundations of Data Science (IFDS), a four-university collaboration among the Universities of Washington, Wisconsin-Madison, California Santa Cruz, and Chicago.

https://computerscience.uchicago.edu/news/article/tripods-ifds/

NSF TRIPODS Phase II Award

The University of Washington (UW) will lead a team of institutions in establishing an interdisciplinary research institute that brings together mathematicians, statisticians, computer scientists and engineers to develop the theoretical foundations of a fast-growing field: data science.

The Institute for Foundations of Data Science (IFDS) is a collaboration between the UW and the Universities of Wisconsin-Madison, California Santa Cruz, and Chicago, with a mission to develop a principled approach to the analysis of ever-larger, more complex and potentially biased data sets that play an increasingly important role in industry, government and academia.

Support for the IFDS comes from a $12.5 million grant from the National Science Foundation and its Transdisciplinary Research in Principles of Data Science, or TRIPODS, program. Today, the NSF named IFDS as one of two institutes nationwide receiving the first TRIPODS Phase II awards. TRIPODS is tied to the NSF’s Harnessing the Data Revolution (HDR) program, which aims to accelerate discovery and innovation in data science algorithms, data cyberinfrastructure and education and workforce development.

The five-year funding plan for the IFDS Phase II includes support for new research projects, workshops, a partnership across the four research sites, and students and postdoctoral scholars co-advised by faculty from different fields. Plans for education and outreach will draw on previous experience of IFDS members and leverage institutional resources at all four sites.

Maryam Fazel (PI) becomes first recipient of the Moorthy Family Inspiration Career Development Professorship

In recognition of her outstanding, innovative work as a researcher and educator, Maryam Fazel (IFDS PI) was recently named the inaugural recipient of the Moorthy Family Inspiration Career Development Professorship. This generous endowment was established in 2019 by Ganesh and Hema Moorthy for the purposes of recruiting, rewarding and retaining UW ECE faculty members who have demonstrated a significant amount of promise early on in their careers.
