Mikhail Yurochkin




I am a research manager at the MIT-IBM Watson AI Lab, where I lead the Statistical Large Language Modeling group. I am interested in a variety of LLM-related problems (evaluation, alignment, routing, fusion, prompt engineering, uncertainty quantification, data quality), and I like to explore statistical modeling approaches to them. I have also worked on out-of-distribution (OOD) generalization, algorithmic fairness, optimal transport, federated learning, and Bayesian nonparametrics.

Before joining IBM, I completed my PhD in Statistics at the University of Michigan, where I worked with Long Nguyen. I received my Bachelor's degree in applied mathematics and physics from the Moscow Institute of Physics and Technology.

Publications


2024

tinyBenchmarks: evaluating LLMs with fewer examples

Felipe Maia Polo, Lucas Weber, Leshem Choshen, Yuekai Sun, Gongjun Xu, Mikhail Yurochkin
ICLR Workshop on Mathematical and Empirical Understanding of Foundation Models, 2024
International Conference on Machine Learning (ICML), 2024
arXiv / Code / Data / Twitter

Asymmetry in Low-Rank Adapters of Foundation Models

Jiacheng Zhu, Kristjan Greenewald, Kimia Nadjahi, Haitz Sáez de Ocáriz Borde, Rickard Brüel Gabrielsson, Leshem Choshen, Marzyeh Ghassemi, Mikhail Yurochkin, Justin Solomon
ICLR Workshop on Mathematical and Empirical Understanding of Foundation Models, 2024
International Conference on Machine Learning (ICML), 2024
arXiv / Code

Risk Assessment and Statistical Significance in the Age of Foundation Models

Apoorva Nitsure, Youssef Mroueh, Mattia Rigotti, Kristjan Greenewald, Brian Belgodere, Mikhail Yurochkin, Jiri Navratil, Igor Melnyk, Jerret Ross
NeurIPS Workshop on Socially Responsible Language Modelling Research (SoLaR), 2023
International Conference on Machine Learning (ICML), 2024
arXiv / Code (see supplementary material)

Fusing Models with Complementary Expertise

Hongyi Wang, Felipe Maia Polo, Yuekai Sun, Souvik Kundu, Eric Xing, Mikhail Yurochkin
NeurIPS Workshop on Distribution Shifts (DistShift), 2023
International Conference on Learning Representations (ICLR), 2024
arXiv

Aligners: Decoupling LLMs and Alignment

Lilian Ngweta, Mayank Agarwal, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin
Tiny Papers at the International Conference on Learning Representations (ICLR), 2024 (Notable)
arXiv

An Investigation of Representation and Allocation Harms in Contrastive Learning

Subha Maity, Mayank Agarwal, Mikhail Yurochkin, Yuekai Sun
International Conference on Learning Representations (ICLR), 2024
arXiv / Code

Uncertainty Quantification via Stable Distribution Propagation

Felix Petersen, Aashwin Ananda Mishra, Hilde Kuehne, Christian Borgelt, Oliver Deussen, Mikhail Yurochkin
International Conference on Learning Representations (ICLR), 2024
arXiv / Code


2023

Large Language Model Routing with Benchmark Datasets

Tal Shnitzer, Anthony Ou, Mírian Silva, Kate Soule, Yuekai Sun, Justin Solomon, Neil Thompson, Mikhail Yurochkin
NeurIPS Workshop on Distribution Shifts (DistShift), 2023 (Oral)
arXiv / Code (see supplementary material)

Outlier-Robust Group Inference via Gradient Space Clustering

Yuchen Zeng, Kristjan Greenewald, Luann Jung, Kangwook Lee, Justin Solomon, Mikhail Yurochkin
NeurIPS Workshop on Distribution Shifts (DistShift), 2023
arXiv / Code

Rewiring with Positional Encodings for Graph Neural Networks

Rickard Gabrielsson, Mikhail Yurochkin, Justin Solomon
Transactions on Machine Learning Research (TMLR), 2023
arXiv

Simple Disentanglement of Style and Content in Visual Representations

Lilian Ngweta, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin
International Conference on Machine Learning (ICML), 2023
arXiv / Code

k-Mixup Regularization for Deep Learning via Optimal Transport

Kristjan Greenewald, Anming Gu, Mikhail Yurochkin, Justin Solomon, Edward Chien
Transactions on Machine Learning Research (TMLR), 2023
arXiv / Code

Understanding new tasks through the lens of training data via exponential tilting

Subha Maity, Mikhail Yurochkin, Moulinath Banerjee, Yuekai Sun
International Conference on Learning Representations (ICLR), 2023
arXiv / Code

Learning Proximal Operators to Discover Multiple Optima

Lingxiao Li, Noam Aigerman, Vladimir Kim, Jiajin Li, Kristjan Greenewald, Mikhail Yurochkin, Justin Solomon
International Conference on Learning Representations (ICLR), 2023
arXiv / Code

Sampling with Mollified Interaction Energy Descent

Lingxiao Li, Qiang Liu, Anna Korba, Mikhail Yurochkin, Justin Solomon
International Conference on Learning Representations (ICLR), 2023
arXiv / Code

Fairness Evaluation in Text Classification: Machine Learning Practitioner Perspectives of Individual and Group Fairness

Zahra Ashktorab, Benjamin Hoover, Mayank Agarwal, Casey Dugan, Werner Geyer, Hao Bang Yang, Mikhail Yurochkin
CHI Conference on Human Factors in Computing Systems, 2023
arXiv


2022

Calibrated Data-Dependent Constraints with Exact Satisfaction Guarantees

Songkai Xue, Yuekai Sun, Mikhail Yurochkin
Neural Information Processing Systems (NeurIPS), 2022 (Oral)
arXiv

Domain Adaptation meets Individual Fairness. And they get along

Debarghya Mukherjee, Felix Petersen, Mikhail Yurochkin, Yuekai Sun
Neural Information Processing Systems (NeurIPS), 2022
arXiv

Communication-Efficient Model Fusion

Mikhail Yurochkin and Yuekai Sun
Chapter 7 of Federated Learning: A Comprehensive Overview of Methods and Applications (edited by Heiko Ludwig and Nathalie Baracaldo), 2022
PDF

Personalization in Federated Learning

Mayank Agarwal, Mikhail Yurochkin, Yuekai Sun
Chapter 4 of Federated Learning: A Comprehensive Overview of Methods and Applications (edited by Heiko Ludwig and Nathalie Baracaldo), 2022
PDF

Log-Euclidean Signatures for Intrinsic Distances Between Unaligned Datasets

Tal Shnitzer, Mikhail Yurochkin, Kristjan Greenewald, Justin Solomon
International Conference on Machine Learning (ICML), 2022
arXiv / Code

Your fairness may vary: Pretrained language model fairness in toxic text classification

Ioana Baldini, Dennis Wei, Karthikeyan Natesan Ramamurthy, Mikhail Yurochkin, Moninder Singh
Findings of ACL, 2022
arXiv

Measuring the sensitivity of Gaussian processes to kernel choice

William Stephenson, Soumya Ghosh, Tin Nguyen, Mikhail Yurochkin, Sameer Deshpande, Tamara Broderick
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
arXiv / Code


2021

On sensitivity of meta-learning to support data

Mayank Agarwal, Mikhail Yurochkin, Yuekai Sun
Neural Information Processing Systems (NeurIPS), 2021
arXiv

Post-processing for Individual Fairness

Felix Petersen, Debarghya Mukherjee, Yuekai Sun, Mikhail Yurochkin
Neural Information Processing Systems (NeurIPS), 2021
arXiv / Code / Video

Does enforcing fairness mitigate biases caused by subpopulation shift?

Subha Maity, Debarghya Mukherjee, Mikhail Yurochkin, Yuekai Sun
Neural Information Processing Systems (NeurIPS), 2021
arXiv

On Efficient Multilevel Clustering via Wasserstein Distances

Viet Huynh, Nhat Ho, Nhan Dam, XuanLong Nguyen, Mikhail Yurochkin, Hung Bui, Dinh Phung
Journal of Machine Learning Research (JMLR), 2021
PDF / Code

Outlier-Robust Optimal Transport

Debarghya Mukherjee, Aritra Guha, Justin Solomon, Yuekai Sun, Mikhail Yurochkin
International Conference on Machine Learning (ICML), 2021
arXiv / Code

SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness

Mikhail Yurochkin and Yuekai Sun
International Conference on Learning Representations (ICLR), 2021 (Oral)
arXiv / Code (see supplementary material) / Video / Blog

Individually Fair Rankings

Amanda Bower, Hamid Eftekhari, Mikhail Yurochkin, Yuekai Sun
International Conference on Learning Representations (ICLR), 2021
arXiv / Code (see supplementary material) / Video

Statistical inference for individual fairness

Subha Maity, Songkai Xue, Mikhail Yurochkin, Yuekai Sun
International Conference on Learning Representations (ICLR), 2021
arXiv / Code / Video

Individually Fair Gradient Boosting

Alexander Vargo, Fan Zhang, Mikhail Yurochkin, Yuekai Sun
International Conference on Learning Representations (ICLR), 2021 (Spotlight)
arXiv / Code (see supplementary material) / Video


2020

Continuous Regularized Wasserstein Barycenters

Lingxiao Li, Aude Genevay, Mikhail Yurochkin, Justin Solomon
Neural Information Processing Systems (NeurIPS), 2020
arXiv / Code

Black Loans Matter: Distributionally Robust Fairness for Fighting Subgroup Discrimination

Mark Weber, Mikhail Yurochkin, Sherif Botros, Vanio Markov
NeurIPS Fair AI in Finance Workshop, 2020 (Spotlight Talk)
arXiv / Blog

Model Fusion with Kullback–Leibler Divergence

Sebastian Claici, Mikhail Yurochkin, Soumya Ghosh, Justin Solomon
International Conference on Machine Learning (ICML), 2020
arXiv / Code

Two Simple Ways to Learn Individual Fairness Metric from Data

Debarghya Mukherjee, Mikhail Yurochkin, Moulinath Banerjee, Yuekai Sun
International Conference on Machine Learning (ICML), 2020
arXiv / Code

Auditing ML models for individual bias and unfairness

Songkai Xue, Mikhail Yurochkin, Yuekai Sun
International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
arXiv

Federated Learning with Matched Averaging

Hongyi Wang, Mikhail Yurochkin, Yuekai Sun, Dimitris Papailiopoulos, Yasaman Khazaeni
International Conference on Learning Representations (ICLR), 2020 (Oral)
arXiv / Code / Video / Blog

Training individually fair ML models with sensitive subspace robustness

Mikhail Yurochkin, Amanda Bower, Yuekai Sun
International Conference on Learning Representations (ICLR), 2020 (Spotlight)
arXiv / Code / Video / Blog


2019

Hierarchical Optimal Transport for Document Representation

Mikhail Yurochkin, Sebastian Claici, Edward Chien, Farzaneh Mirzazadeh, Justin Solomon
Neural Information Processing Systems (NeurIPS), 2019
arXiv / Code / Blog / MIT News

Alleviating Label Switching with Optimal Transport

Pierre Monteiller, Sebastian Claici, Edward Chien, Farzaneh Mirzazadeh, Justin Solomon, Mikhail Yurochkin
Neural Information Processing Systems (NeurIPS), 2019
arXiv / Code / Blog

Statistical Model Aggregation via Parameter Matching

Mikhail Yurochkin, Mayank Agarwal, Soumya Ghosh, Kristjan Greenewald, Trong Nghia Hoang
Neural Information Processing Systems (NeurIPS), 2019
arXiv / Code / Blog

Scalable inference of topic evolution via models for latent geometric structures

Mikhail Yurochkin, Zhiwei Fan, Aritra Guha, Paraschos Koutris, XuanLong Nguyen
Neural Information Processing Systems (NeurIPS), 2019
arXiv / Code / Blog

Dirichlet Simplex Nest and Geometric Inference

Mikhail Yurochkin, Aritra Guha, Yuekai Sun, XuanLong Nguyen
International Conference on Machine Learning (ICML), 2019 (Long Talk)
arXiv / Code / Video

Bayesian Nonparametric Federated Learning of Neural Networks

Mikhail Yurochkin, Mayank Agarwal, Soumya Ghosh, Kristjan Greenewald, Trong Nghia Hoang, Yasaman Khazaeni
International Conference on Machine Learning (ICML), 2019
arXiv / Code / Video

Online Semi-Supervised Learning with Bandit Feedback

Mikhail Yurochkin, Sohini Upadhyay, Djallel Bouneffouf, Mayank Agarwal, Yasaman Khazaeni
ICLR Limited Labeled Data (LLD) Workshop, 2019
PDF


2016-2018

Geometric Inference in Bayesian Hierarchical Models with Applications to Topic Modeling

Mikhail Yurochkin
PhD Thesis, University of Michigan, 2018
PDF / Slides

Conic Scan-and-Cover algorithms for nonparametric topic modeling

Mikhail Yurochkin, Aritra Guha, XuanLong Nguyen
Neural Information Processing Systems (NeurIPS), 2017
arXiv / Code

Multi-way Interacting Regression via Factorization Machines

Mikhail Yurochkin, XuanLong Nguyen, Nikolaos Vasiloglou
Neural Information Processing Systems (NeurIPS), 2017
arXiv / Code

Multilevel Clustering via Wasserstein Means

Nhat Ho, XuanLong Nguyen, Mikhail Yurochkin, Hung Hai Bui, Viet Huynh, Dinh Phung
International Conference on Machine Learning (ICML), 2017
arXiv / Code

Geometric Dirichlet Means algorithm for topic inference

Mikhail Yurochkin and XuanLong Nguyen
Neural Information Processing Systems (NeurIPS), 2016
arXiv / Code