Mikhail Yurochkin

I am a Research Staff Member at IBM Research in Cambridge, Massachusetts. My research interests are:

  • Model fusion and federated learning
  • Algorithmic fairness
  • Applications of optimal transport in machine learning
  • Bayesian (nonparametric) modeling and inference

Before joining IBM, I completed my PhD in Statistics at the University of Michigan, advised by Prof. Long Nguyen. I received my bachelor's degree in applied mathematics and physics from the Moscow Institute of Physics and Technology.


The publication list is outdated and will be updated soon.
Please see my Google Scholar page.

Recasting topic inference as a geometric problem of learning a simplex led to two new algorithms for topic estimation. Both are much faster than a Gibbs sampler and as accurate; the second can also estimate the number of topics, i.e., the number of vertices of the simplex. The first algorithm consists of a k-means clustering step followed by computationally cheap geometric post-processing of the inferred cluster centroids. This work was presented at NIPS 2016.
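To make the post-processing idea concrete, here is a minimal sketch of one plausible form it could take, assuming the cluster centroids have already been computed by k-means. Each centroid is pushed outward along the ray from the data center, and the result is projected back onto the probability simplex if it lands outside. The function names, the extension factor `m`, and the toy numbers are illustrative assumptions, not the exact rule from the paper.

```python
def extend_centroid(center, centroid, m):
    """Move from the data center through a centroid; m > 1 pushes outward.

    The scalar m is a hypothetical tuning parameter for this sketch.
    """
    return [c + m * (mu - c) for c, mu in zip(center, centroid)]


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


# Toy example over a 3-word vocabulary (made-up numbers).
center = [1 / 3, 1 / 3, 1 / 3]   # mean of all normalized documents
centroid = [0.5, 0.3, 0.2]       # one k-means centroid
topic = project_to_simplex(extend_centroid(center, centroid, m=4.0))
print(topic)
```

The extended point leaves the simplex here (one coordinate becomes negative), so the projection step is what returns a valid word distribution.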

Geometric Dirichlet Means algorithm for topic inference

Yurochkin M. & Nguyen X. [Link, PDF (arXiv), Code]

The second algorithm is joint work with Aritra Guha. By defining a cone anchored at the center of the data and scanning the space of documents with this cone, we can both find the topics and estimate their number very accurately. This work was presented at NIPS 2017.
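The scanning idea can be illustrated with a small greedy sketch, under simplifying assumptions: repeatedly take the document farthest from the data center as a topic direction, then discard every document that falls inside a cone around that direction. The number of directions found serves as the estimate of the number of topics. The names `conic_scan`, `omega` (cosine-distance width of the cone), and `min_radius` are hypothetical choices for this illustration and may differ from the published algorithm.

```python
import math


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def _cosine_distance(a, b):
    return 1.0 - sum(x * y for x, y in zip(a, b)) / (_norm(a) * _norm(b))


def conic_scan(points, omega=0.6, min_radius=0.05):
    """Greedy cone scanning; returns the data center and the found directions."""
    dim = len(points[0])
    center = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    # Work with centered documents; ignore those too close to the center,
    # since they do not define a reliable direction.
    active = [[p[d] - center[d] for d in range(dim)] for p in points]
    active = [v for v in active if _norm(v) >= min_radius]
    directions = []
    while active:
        direction = max(active, key=_norm)  # farthest remaining document
        directions.append(direction)
        # Remove everything covered by the cone around this direction.
        active = [v for v in active if _cosine_distance(v, direction) >= omega]
    return center, directions


# Toy documents concentrated near three corners of the simplex (made-up data).
docs = [
    [0.9, 0.05, 0.05], [0.8, 0.1, 0.1],
    [0.05, 0.9, 0.05], [0.1, 0.8, 0.1],
    [0.05, 0.05, 0.9], [0.1, 0.1, 0.8],
]
center, directions = conic_scan(docs)
print(len(directions))
```

Each pass removes at least the chosen document, so the loop always terminates.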

Conic Scan-and-Cover algorithms for nonparametric topic modeling

Yurochkin M., Guha A. & Nguyen X. [Link, PDF (arXiv), Code, NIPS poster]

During a summer 2016 internship at LogicBlox (supervised by Nikolaos Vasiloglou), I worked with Factorization Machines and proposed a new Bayesian model for learning high-order interactions among variables in the data. The key idea is to represent interactions between variables as a hypergraph, which in turn has a corresponding incidence matrix. The incidence matrix is binary, so I used the Indian Buffet Process (IBP) as a prior on it and proposed a new modification of the IBP to better model interactions. The Factorization Machines construction made it possible to estimate coefficients of previously unseen interactions on the fly. This work was presented at NIPS 2017.
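As a small illustration of the hypergraph representation, the sketch below builds a binary incidence matrix from a list of interactions, with rows for variables and columns for interactions. It then evaluates each interaction on one data point, assuming an interaction's feature is the product of its member variables. The helper names and the toy interactions are hypothetical, and the prior and inference are not shown.

```python
def incidence_matrix(num_variables, interactions):
    """Z[i][j] = 1 if variable i takes part in interaction (hyperedge) j."""
    Z = [[0] * len(interactions) for _ in range(num_variables)]
    for j, edge in enumerate(interactions):
        for i in edge:
            Z[i][j] = 1
    return Z


def interaction_features(x, Z):
    """Product of the variables selected by each column of Z."""
    features = []
    for j in range(len(Z[0])):
        product = 1.0
        for i, row in enumerate(Z):
            if row[j]:
                product *= x[i]
        features.append(product)
    return features


# Two interactions among four variables: a pair and a three-way interaction.
Z = incidence_matrix(4, [(0, 1), (1, 2, 3)])
features = interaction_features([2.0, 3.0, 5.0, 7.0], Z)
print(Z)
print(features)
```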

Multi-way Interacting Regression via Factorization Machines

Yurochkin M., Nguyen X. & Vasiloglou N. [Link, PDF (arXiv), Code, NIPS poster]

I am working with Nhat Ho on finding interesting applications of the Wasserstein distance and Wasserstein barycenters. While these are very elegant mathematical tools, I believe that many important use cases remain to be found and that the meaning of the barycenter is not yet fully understood. We published a paper at ICML 2017 on using Wasserstein distances to cluster data with multilevel structure. I was responsible for the implementation and the design of the simulations.

Multilevel clustering via Wasserstein means

Ho N., Nguyen X., Yurochkin M., Bui H., Huynh V. & Phung D. [Link, PDF (arXiv), Code]

Program Committee member: ICLR 2018, 2019, 2020; NeurIPS 2017, 2018 (top 30% reviewer award), 2019 (top 50% reviewer award); ICML 2017, 2018, 2019.