Novel simulation-based statistical inference with applications to epidemic models

EPSRC Funded Grant - EP/J008443/1



Project summary

Parametric models play a key role in statistical modelling. They assume that the data we observe arise from an underlying model, with the data depending upon certain parameters and random quantities. For example, in a model for the spread of a disease, the parameters determine how infectious the disease is, but who becomes infected depends upon the model setup and chance. In practice we rarely know the parameters of the model, and a key element of statistics is to obtain good estimates of them.

In Bayesian statistics the parameters have a posterior distribution, which quantifies our uncertainty about the parameters given the data. By studying the posterior distribution we can calculate any summary statistics of the parameters that interest us. However, a major drawback of Bayesian statistics is that the posterior distribution is rarely available in a form that is easy to use. There are a number of approaches for obtaining samples from the posterior distribution, the most common of which is Markov chain Monte Carlo (MCMC).

Recently a range of practical problems in statistical genetics has been identified where MCMC either cannot be used or is particularly difficult to apply. A solution has been provided in the form of the approximate Bayesian computation (ABC) algorithm. The ABC algorithm estimates the parameters by simulating from the model with parameter values chosen via an appropriate mechanism, often the prior distribution, which represents our beliefs about the model parameters before the data are observed. It formalises the idea that we simulate from the model with different parameters and accept those parameter values which lead to simulated data in close agreement with the observed data.
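As a rough illustration of the accept/reject idea behind ABC, the following Python sketch applies a basic ABC rejection sampler to a toy epidemic. The Reed-Frost model, the Uniform(0, 1) prior, the observed final size of 12 out of 20 individuals, the tolerance and all function names are assumptions made for this example. They are not taken from the project's own code, and this is the standard rejection scheme rather than the investigator's newer algorithm.

```python
import random


def simulate_final_size(p, n, rng):
    """Simulate a Reed-Frost epidemic with one initial infective among n
    individuals and return the total number ever infected."""
    susceptible, infective, total = n - 1, 1, 1
    while infective > 0 and susceptible > 0:
        # A susceptible avoids each current infective independently
        # with probability 1 - p.
        p_infected = 1.0 - (1.0 - p) ** infective
        new_cases = sum(rng.random() < p_infected for _ in range(susceptible))
        susceptible -= new_cases
        infective = new_cases
        total += new_cases
    return total


def abc_rejection(observed, n, n_samples, tolerance, rng):
    """Collect n_samples prior draws whose simulated final size lies
    within `tolerance` of the observed final size."""
    accepted = []
    while len(accepted) < n_samples:
        p = rng.random()  # draw from the assumed Uniform(0, 1) prior
        if abs(simulate_final_size(p, n, rng) - observed) <= tolerance:
            accepted.append(p)
    return accepted


# Hypothetical data: 12 of 20 individuals were ultimately infected.
rng = random.Random(1)
samples = abc_rejection(observed=12, n=20, n_samples=200, tolerance=1, rng=rng)
posterior_mean = sum(samples) / len(samples)
print(f"approximate posterior mean of p: {posterior_mean:.3f}")
```

The accepted values form an approximate sample from the posterior distribution of the infection probability. A smaller tolerance gives a closer approximation at the cost of more rejected simulations.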

Both MCMC and ABC are iterative algorithms that produce a single parameter value from the posterior distribution at each iteration. Recently the investigator has introduced a new ABC algorithm which produces a set of parameter values from the posterior distribution at each iteration. This new ABC algorithm is shown to be considerably more efficient than standard ABC algorithms, and has been applied in a straightforward manner to the analysis of epidemic models for the spread of infectious diseases. The aim of the proposed research is two-fold. The first aim is to develop more efficient MCMC and ABC algorithms which obtain sets of values from the posterior distribution. This should lead to more robust parameter estimation, with lower uncertainty in parameter estimates based upon a sample of a given size. The second aim is to apply the new methods to a range of epidemic models to gain a better understanding of the spread of infectious diseases, in particular by developing procedures which are easy for non-experts to use and interpret.


Start date: 19 November 2012
End date: 18 November 2015

PDRA: Dr Fei Xiang (November 2012 - April 2015); Dr Clement Lee (May 2015 - November 2015)



Publications
  1. Xiang, F. and Neal, P. (2014) Efficient MCMC for temporal epidemics via parameter reduction. Comp. Stat. and Data Anal. 80, 240-250.

  2. Neal, P. and Huang, C.L.T. (2015) Forward Simulation MCMC with applications to stochastic epidemic models. Scan. J. Statist. 42, 378-396.

  3. Neal, P. (2014) Simulation based sequential Monte Carlo methods for discretely observed Markov processes. Submitted.

  4. Neal, P. and Xiang, F. (2016) Collapsing of non-centered parameterised MCMC algorithms with applications to epidemic models. To appear in Scan. J. Statist.

  5. Lee, C. and Neal, P. (2016) Optimal scaling of the independence sampler: Theory and Practice. To appear in Bernoulli.



Computer code
  1. Xiang, F. and Neal, P. (2014)

  2. Neal, P. and Huang, C.L.T. (2014)

  3. Neal, P. (2014) Available on request.

  4. Neal, P. and Xiang, F. (2016) Available on request.

  5. Lee, T.M.C. and Neal, P. (2015) Available on request.