Sam Livingstone: Non-reversible Markovian Monte Carlo from a statistics perspective.

Discussion Chair: Chris Sherlock

I will break down the tutorial into 4 parts. First I will motivate non-reversible MCMC in statistical inference problems, and give some intuition for when and why this might be a good idea. Second I will review discrete time methods and try to give a feel for the general recipes underlying their construction. Third I will introduce and review some more recent approaches based on PDMPs, and discuss some of their features. Finally I will mention some current methodological challenges and recent applications. Along the way I will give references to classic papers from the statistical literature so that those less familiar know where to look to find out more.

Michael Chertkov: Non-reversible Markovian Monte Carlo from a physics perspective.

Tutorial Chair: Marija Vucelja

Discussion Chair: Sam Livingstone

Jianfeng Lu: Ergodicity and convergence of non-reversible Markovian Monte Carlo

Discussion Chair: Manon Michel

Paper presentations, theme discussions and standard presentations

Jonathan Mattingly: Non-reversible MCMC for sampling of district maps

Chair: Greg Pavliotis

Evaluating the degree of partisan districting (Gerrymandering) in a statistical framework typically requires an ensemble of districting plans which are drawn from a prescribed probability distribution that adheres to a realistic and non-partisan criteria. In this article we introduce novel non-reversible Markov chain Monte-Carlo (MCMC) methods for the sampling of such districting plans which have improved mixing properties in comparison to previously used (reversible) MCMC algorithms. In doing so we extend the current framework for construction of non-reversible Markov chains on discrete sampling spaces by considering a generalization of skew detailed balance. We provide a detailed description of the proposed algorithms and evaluate their performance in numerical experiments.

Greg Pavliotis: Optimal Langevin Samplers

Chair: TBA

The use of Langevin-type dynamics for constructing sampling schemes has become a standard methodology in computational statistical mechanics, Bayesian statistics and related fields. There are infinitely many different dynamics that is ergodic with respect to the probability measure from which we want to sample or that has the right marginal distribution, e.g. in the case of underdamped Langevin dynamics. A natural question is how to choose the “optimal” dynamics, in the sense of minimizing the asymptotic variance and/or maximizing the rate of convergence to the target distribution. In this talk I will present a precise mathematical formulation of this optimization problem and I will discuss about recent partial results on identifying the optimal Langevin dynamics.

Andreas Eberle: Convergence bounds for Hamiltonian Monte Carlo in high dimension

Chair: Michela Ottobre

Despite its empirical success, until a few years ago there have been almost no convergence bounds for Hamiltonian Monte Carlo. This has changed in the last years where approaches to quantify convergence to equilibrium based on coupling, conductance and hypocoercivity have been developed. In this talk, I will compare upper and lower bounds on convergence rates and mixing times that follow by these different approaches. I will then focus on the coupling approach and explain how it can be used to obtain a relatively good understanding of the dimension dependence for unadjusted HMC in several high dimensional model classes. Current limitations of the approach will be pointed out as well. In particular, the correct order of the mixing time for Metropolis-adjusted HMC is still widely open even in simple scenarios.

Michela Ottobre with Greg Pavliotis: Frameworks for non-reversibility: GENERIC and hypocoercivity

Chair: Andreas Eberle

The Hypocoercivity theory of Villani (2009) has provided a robust framework for the analysis of non-reversible processes. Such a theory splits the generator of the process into a symmetric and an anti-symmetric part and quantifies the way in which the interaction between the two leads to exponentially fast convergence to equilibrium. GENERIC (General equations for non-equilibrium reversible-irreversible coupling) does a similar job, i.e. it splits the dynamics into a gradient flow (symmetric) and a non gradient flow (anti-symmetric) part. However, unlike the hypocoercivity theory (HT), it is still mostly a physics theory, not completely mathematised, and certainly not exploited by the mathematics community to the same extent as the HT. We will show how these two formalisms are substantially equivalent. However, while the HT puts more emphasis on "macroscopic" description of non-reversibility (i.e. exponentially fast convergence to equilibrium), GENERIC sheds light on "microscopic" understanding of non-reversibility, and on relations with entropy dissipation and large deviation principles. We will discuss to what extent one can exploit the equivalence of these two formalisms. The discussion will be mostly based on works of M. Peletier, A. Mielke and collaborators as well as on a preprint of myself with H. Duong.

Manon Michel and Alain Durmus: Discussion across Mathematics and Physics on using symmetries to design non-reversible MCMC algorithms

Chair: Jianfeng Lu

This session will aim at giving some global perspectives into non-reversible MCMC developments through the exploitation of symmetries.

First, we will present how in the non-reversible event-chain Monte Carlo method, the reversibility symmetry of detailed balance is replaced by ones of the target probability distribution itself, so as to enforce the correct invariant distribution of the underlying piecewise deterministic Markov process. Historically, this interplay between global moves and symmetries has its roots all the way back to cluster algorithms in lattice spin systems, which crucially rely on the spin involution symmetry. Today, the same effort to exploit global symmetries can lead up to efficient implementation even in statistical inference.

Second, we will discuss how to formally frame up this link between sampling and symmetries. Indeed, while standard implementations of the Metropolis-Hastings (MH) kernel are well understood, MH kernels explicitly using symmetries, in particular the non-reversible schemes, have not yet received the same systematic treatment to ensure their validity. We will then introduce general tools to ensure that a class of non-reversible Markov kernels has the desired invariance property and lead to convergent algorithms.

This session is open to the contributions of other participants, who can contact Manon Michel and Alain Durmus beforehand if they please or simply use the discussion time to enhance the discussion about the link between symmetries, non-reversibility and sampling.

Michael Chek Hin Choi: From reversiblizations of non-reversible Markov chains to landscape modification

Chair: TBA

In the first part of the talk, we provide a brief literature survey on various reversiblization techniques for analyzing non-reversible Markov chains. This include the additive, multiplicative and Metropolis-Hastings reversiblizations, their spectral and geometric relationships, and how these reversiblizations offer insights to the analysis of the non-reversible Markov chains. In the second part of the talk, inspired by the reversiblization techniques in the first part, we develop a new acceleration method that we call landscape modification, which can possibly be applied to accelerate non-reversible Markov chain Monte Carlo algorithms to speedup sampling or optimization.

  • L.J. Billera, P. Diaconis. A geometric interpretation of the Metropolis–Hastings algorithm. Stat. Sci. 16(4) (2001) 335–339.

  • M. C.H. Choi. Metropolis-Hastings reversiblizations of non-reversible Markov chains. Stoch. Process. Appl. 130(2) (2020) 1041–1073.

  • M. C.H. Choi. On the convergence of an improved and adaptive kinetic simulated annealing. arXiv preprint arXiv:2009.00195, 2020.

  • M. C.H. Choi. On the convergence of an improved discrete simulated annealing via landscape modification. arXiv preprint arXiv:2011.09680, 2020.

  • H. Fang, M. Qian, and G. Gong. An improved annealing method and its large-time behavior. Stochastic Process. Appl., 71 (1) (1997) 55–74.

  • J.A. Fill. Eigenvalue bounds on convergence to stationarity for nonreversible Markov chains, with an application to the exclusion process. Ann. Appl. Probab. 1 (1) (1991) 62–87.

  • D. Paulin. Concentration inequalities for Markov chains by Marton couplings and spectral methods. Electron. J. Probab. 20 (79) (2015) 1–32.

Luc Rey-Bellet: Concentration and robustness for MCMC

Chair: Michael Choi

On one hand we will present some new concentration inequalities for the empirical distribution for hypocoercive MCMC (e.g. the bouncy particle sampler or Hamiltonian MCMC) and discuss their relations with spectral gaps or exponential ergodicity. On the other hand we will also discuss, in a broader context, the relation between concentration equalities and robustness properties for steady state expectations. By robustness we mean here the robustness of predictions from your model (in terms of expected values) when there are uncertainties in the model itself.

Werner Krauth: Founding papers on non-reversible Markov chain Monte Carlo

Chair: Gareth Roberts

The main speaker will present for around \(35\) minutes; contributions from the floor are then invited for the remaining \(25\) minutes.

This session will discuss the founding papers listed below (1,2,3) and their implications in practice (4).

  1. Diaconis P, Holmes S, Neal RM. Analysis of a nonreversible Markov chain sampler. Ann Appl Probab. (2000) 10:726–52. doi: 10.1214/aoap/1019487508

  2. Chen F, Lovász L, Pak I. Lifting Markov chains to speed up mixing. In: Proceedings of the 17th Annual ACM Symposium on Theory of Computing. Providence, RI (1999). p. 275. doi: 10.1145/301250.301315

  3. Baik J, Liu Z. TASEP on a ring in sub-relaxation time scale. J Stat Phys. (2016)165:1051–85. doi: 10.1007/s10955-016-1665-y

  4. Kapfer SC, Krauth W. Irreversible local Markov chains with rapid convergence towards equilibrium. Phys Rev Lett. (2017) 119:240603. doi: 10.1103/PhysRevLett.119.240603

Gareth Roberts: The Restore Algorithm, practical non-reversibility without lifting

Chair: Werner Krauth

We develop a new class of non-reversible Markov processes comprising local dynamics governed by a fixed Markov process, which are enriched with regenerations from a fixed distribution at a state-dependent rate. We give conditions under which such processes possess a given target distribution as their invariant measures, thus making them amenable for use within Monte Carlo methodologies. Since by construction, the resulting process is a regenerative one, its theoretical study is simpler than many other non-reversible algorithms. We show uniform ergodicity results under simple and easy-to-check conditions, and show that the target density is robust to certain types of perturbation of the state-dependent regeneration rate. A (read-once) coupling from the past implementation of the algorithm yielding independent draws from the target density can also be constructed. This is joint work with Murray Pollock, Andi Wang and David Steinsaltz.

Jesus Maria Sanz-Serna: Wasserstein distance estimates for the distributions of numerical approximations to ergodic stochastic differential equations.

Chair: Moritz Schauer

I shall present a framework for the non-asymptotic study of the \(2\)-Wasserstein distance between the invariant distribution of an ergodic stochastic differential equation and the distribution of its numerical approximation in the strongly log-concave case. This allows to study in a unified way a number of different integrators proposed in the literature for the overdamped and underdamped Langevin dynamics. In addition, I shall analyse a novel splitting method for the underdamped Langevin dynamics which only requires one gradient evaluation per time step.

Under an additional smoothness assumption on a \(d\)--dimensional strongly log-concave distribution with condition number \(\kappa\), the algorithm is shown to produce with an \(\mathcal{O}\big(\kappa^{5/4} d^{1/4}\epsilon^{-1/2} \big)\) complexity samples from a distribution that, in Wasserstein distance, is at most \(\epsilon>0\) away from the target distribution.

Joint work with Kostas Zygalakis, Edinburgh.

Moritz Schauer: Localisation and design of piecewise deterministic Monte Carlo samplers

Chair: Jesus Maria Sanz-Serna

In this talk I want to take a more detailed look at the localisation properties governing Markov Chain Monte Carlo samplers for structured and trans-dimensional targets based on a thinning procedure. These are the factorisation of the flow into (pairs of position and velocity) coordinates on disjoint components, the local dependence of the bounding event rates for reflections on those coordinates and similarly, the time to the boundary of the components. Properly mapping this dependency allows scalable, thread-parallel (asynchronous) sampling of PDMP processes, which I will illustrate with a localised version of the sticky ZigZag sampler. Finally, I want to give a broader perspective about our efforts to make continuous-time Markov Chain Monte Carlo methods based on PDMPs available to the Julia language ecosystem.

Persi Diaconis: Speeding up Markov Chains

Chair: Christophe Andrieu

I will review a host of methods that yield exponential speedups for Markov Chain Monte Carlo. These range from interspersing deterministic steps to adding a few edges to the graphs underlying the chain. Many of these have been developed for uniform stationary distributions. I'll offer an approach for non-uniform distributions as well. Some of the math is surprising (and some is surprisingly difficult).