Inference of Recombination Rates and Detection of Hotspots

Approximate Likelihood

The program sequenceLD analyzes sequence data by approximating the likelihood of a summary of the data, so it can be thought of as a marginal likelihood approach. It does not use all the information in the data, but it can be substantially more efficient computationally than the full-likelihood methods, and can therefore analyze larger data sets. More details can be found in the paper "Approximate likelihood methods for estimating local recombination rates", JRSS Series B, 2002, 64, 657-680.

An extension of this approach, used to detect recombination hotspots, is described in the paper "Application of coalescent methods to reveal fine scale rate variation and recombination hotspots", Genetics, 2004, 167, 2067-2081. The program sequenceLDsr implements this approach (together with a function for the statistical package R). This program can also be used to analyse data via the composite likelihood method described in "Approximate likelihood methods for estimating local recombination rates", JRSS Series B, 2002, 64, 657-680.
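
As a rough illustration of the general idea behind a composite likelihood, the sketch below sums log-likelihood curves from several subregions as if they were independent and picks the maximising recombination rate on a grid. It is only a generic sketch under stated assumptions: the function names, the shared grid of rates, and the toy quadratic curves are all hypothetical, and it does not reproduce the interface, output format, or exact method of sequenceLDsr.

```python
def composite_log_likelihood(curves):
    """Sum per-subregion log-likelihood curves evaluated on one shared grid.

    Assumption: each curve is a list of log-likelihood values, one per
    grid point, and the subregions are treated as independent.
    """
    n = len(curves[0])
    if any(len(c) != n for c in curves):
        raise ValueError("all curves must be evaluated on the same grid")
    return [sum(c[i] for c in curves) for i in range(n)]


def argmax_on_grid(grid, values):
    """Return the grid point at which `values` is largest."""
    best = max(range(len(grid)), key=lambda i: values[i])
    return grid[best]


# Hypothetical grid of scaled recombination rates: 0.0, 0.5, ..., 10.0.
grid = [0.5 * k for k in range(21)]


def toy_curve(peak, width):
    """Toy quadratic log-likelihood curve (illustration only, not real output)."""
    return [-((r - peak) / width) ** 2 for r in grid]


# Three made-up subregion curves standing in for per-subregion results.
curves = [toy_curve(2.0, 1.0), toy_curve(4.0, 1.0), toy_curve(3.0, 2.0)]

total = composite_log_likelihood(curves)
rho_hat = argmax_on_grid(grid, total)
print(rho_hat)
```

In practice the per-subregion curves would come from the program's own output rather than from toy functions, and the grid would be chosen to cover the plausible range of rates.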

Hotspot Detection

Further R code for implementing a penalised likelihood approach for detecting hotspots using the output of sequenceLDsr is available here. This code implements the method in "A novel method with improved power to detect recombination hotspots from polymorphism data reveals multiple hotspots in human genes", American Journal of Human Genetics, 77, 781-794.

See also sequenceLDhot.

Full-likelihood Programs: Finite sites model

The program fins estimates the full-likelihood surface for the scaled mutation and recombination parameters for population genetic data. It uses a finite sites model, with an arbitrary mutation model at each locus. (The program infs is no longer available; we recommend you use sequenceLD for the analysis of sequence data.)