| Author(s) | Software | Description | Language |
| --- | --- | --- | --- |
| Cécile Ané | PhyloNetworks | Inference and manipulation of phylogenetic networks, and their use for trait evolution | Julia |
| Cécile Ané | PhyloPlots | phylogenetic network visualization | |
| Cécile Ané | QuartetNetworkGoodnessFit | phylogenetic networks analyses using four-taxon subsets | |
| Cécile Ané | PhyloCoalSimulations | simulate phylogenies under the coalescent | |
| Cécile Ané | RLinearAlgebra | A platform for developing, comparing, and benchmarking randomized and deterministic solvers for linear systems and least squares problems. | |
| Vivak Patel, Daniel Adrian Maldonado | RLinearAlgebra.jl | Deploy randomized linear solvers | Julia |
| Sameer Deshpande | flexBART | A faster and more flexible implementation of Bayesian Additive Regression Trees | R, C++ |
| Owen Melia, Eric Jonas, Rebecca Willett | Rotation Invariant Random Features | Code for "Rotation-Invariant Random Features Provide a Strong Baseline for Machine Learning on 3D Point Clouds" | |
| Suzanna Parkinson, Greg Ongie, Rebecca Willett | Linear layers in neural networks | Code for | |
| Elena Orlova, Aleksei Ustimenko, Ruoxi Jiang, Peter Y. Lu, Rebecca Willett | Deep Stochastic Mechanics | Code for | |
| Yuming Chen, Daniel Sanz-Alonso, Rebecca Willett | Reduced-Order Autodifferentiable Ensemble Kalman Filters (ROAD-EnKF) | Code for | |
| Yuming Chen, Daniel Sanz-Alonso, Rebecca Willett | Auto-differentiable Ensemble Kalman Filters (AD-EnKF) | Code for | |
| Elena Orlova, Haokun Liu, Raphael Rossellini, Benjamin Cash, Rebecca Willett | Beyond Ensemble Averages: Leveraging Climate Model Ensembles for Subseasonal Forecasting | Code for | |
| Ruoxi Jiang, Rebecca Willett | Embed and Emulate: Learning to estimate parameters of dynamical systems with uncertainty quantification | Code for | |
| Joseph Shenouda, Rahul Parhi, Kangwook Lee, Robert D. Nowak | Vector-Valued Variation Spaces and Width Bounds for DNNs: Insights on Weight Decay Regularization | Code for | |
| Karan Srivastava | Generating large isosceles-free lattice subsets with reinforcement learning | Code for training models and repository maintained for research | Python |
| Karl Rohe | longpca | This package introduces a novel formula syntax for PCA. In modern applications (where data is often in "long format"), the formula syntax helps to fluidly imagine PCA without thinking about matrices. In other words, it provides a layer of abstraction above matrices. | |
| Karl Rohe | gdim | This package estimates graph dimension using cross-validated eigenvalues, via the graph-splitting technique. Theoretically, the method works by computing a special type of cross-validated eigenvalue which follows a simple central limit theorem. This allows users to perform hypothesis tests on the rank of the graph. | |
| Nicholas Henderson/Michael Newton | rvalues | A collection of functions for computing "r-values" from various kinds of user input such as MCMC output or a list of effect size estimates and associated standard errors | |
| Zihao Zheng/Michael A. Newton | MixTwice | Implements large-scale hypothesis testing by variance mixing. It takes two statistics per testing unit, an estimated effect and its associated squared standard error, and fits a nonparametric, shape-constrained mixture separately on two latent parameters | |


Ellenberg, Jordan, Shape: The Hidden Geometry of Information, Biology, Strategy, Democracy, and Everything Else, Penguin Books, 2022.

Wright, Stephen J., and Benjamin Recht, Optimization for Data Analysis, Cambridge University Press, 2022.

Diakonikolas, Ilias, and Daniel M. Kane, Algorithmic High-Dimensional Robust Statistics, Cambridge University Press, 2023.

Chen, Nan, Stochastic Methods for Modeling and Predicting Complex Dynamical Systems, Springer Cham, 2024.

Roch, Sebastien, Modern Discrete Probability: An Essential Toolkit, Cambridge University Press, 2024.

Educational Materials & Tools

Online textbook on “Mathematical Methods in Data Science (with Python)” by Sebastien Roch.

Online tutorial on “Comparative methods on reticulate phylogenies”.

Online textbook on “Causal Inference” by Amy Cochran.

Lecture notes for a year-long course on the “Mathematics of Data Science” by Dmitriy Drusvyatskiy.

Lecture notes and video lectures for a course on the “Mathematical Foundations of Machine Learning” by Rebecca Willett.

Autograder: a server for automatically grading coding assignments.

Quiz Generator: provides a general format for quiz banks that can then be uploaded in a variety of forms (Gradescope, Canvas, PDF, HTML, QTI, etc.); supports LaTeX and a wide variety of question types; and supports collaboration by using standard tools like git to manage questions.

Canvas Tools: a suite of tools and Python interface for Instructure’s Canvas LMS.