Sampling Workshop




Cambridge/Gatsby Machine Learning Workshop


Thursday 8th January 2004


Format: Each speaker is asked to lead a discussion, starting with a 20-40 minute talk highlighting issues for discussion.

To ensure that our discussions don't run out of juice, everyone who attends is asked to bring one issue or question for discussion.

Recommended reading links are included with some of the talks below. In addition, everyone attending should be familiar with the following background material: importance sampling, rejection sampling, Metropolis method, Gibbs sampling; familiarity with slice sampling and Hamiltonian Monte Carlo may also prove helpful. All these topics are reviewed in chapters 29 and 30 of DJCM's book.

Time Discussion leader Topic Recommended reading
10.00 Andrew Blake Particle filters Reading
11.00 Coffee
11.30 Iain Murray Contrastive divergence Reading
12.30 Discussion
12.45 Lunch
2.00 Yuan Qi Hessian based MCMC Reading
3.00 Zoubin Ghahramani Embedded HMMs, an idea for 'particle smoothing'. Radford Neal paper
4.00 Tea
4.30 David MacKay Exact sampling Reading
5.30 Discussion

Contrastive divergence: Training Products of Experts by Minimizing Contrastive Divergence. Hinton, G.E. Technical Report: GCNU TR 2000-004

Hessian-based MCMC: Slides (postscript) | Report (postscript, draft)

Exact sampling: please read chapter 32 of Information Theory, Inference, and Learning Algorithms. Further reading on exact sampling is available on D B Wilson's page. Our discussion might especially look at the issue of exact sampling for Bayesian inference, for example, in continuous state spaces.

For convenience, I have extracted the relevant abstracts from D B Wilson's page, and list them below. The first Murdoch/Green paper seems to me the best place to start.

Recent research papers: advanced reading on exact sampling

D. J. Murdoch and P. J. Green. Exact sampling from a continuous state space. Scandinavian Journal of Statistics 25(3):483--502, 1998. (postscript)

Abstract: Propp and Wilson (1995) described a protocol, called coupling from the past, for exact sampling from a finite distribution using a coupled Markov chain Monte Carlo algorithm. In this paper we extend coupling from the past to various MCMC samplers on a continuous state space; rather than following the monotone sampling device of Propp and Wilson, our approach uses methods related to gamma-coupling and rejection sampling to simulate the chain, and direct accounting of sample paths to check whether they have coupled.

Remarks: Describes a variety of scenarios for which CFTP may be applied to a (non-monotone) continuous state space, giving a sequence of algorithms that start out simple, but become increasingly sophisticated. The methods used are related to gamma-coupling and rejection sampling, and appear to be applicable to Bayesian parameter estimation. Includes a case study comparing the performance of each technique when used to sample from the posterior distribution of a set of pump reliability parameters.

Jesper Møller. Perfect simulation of conditionally specified models. Journal of the Royal Statistical Society, Ser. B, 61(1):251--264, 1999.

Abstract: We discuss how the ideas of producing perfect simulations based on coupling from the past for finite state space models naturally extend to multivariate distributions with infinite or uncountable state spaces such as auto-gamma, auto-Poisson and autonegative binomial models, using Gibbs sampling in combination with sandwiching methods originally introduced for perfect simulation of point processes.

Remarks: Shows how to apply CFTP to sample from a class of Markov random fields, in which the individual variables (from a finite, countable, or continuous space) depend upon one another in a monotone or anti-monotone way. For the continuous distributions, the user specifies an ε, which controls not the bias (which is 0), but instead the numerical precision of the output. The application to the auto-gamma distribution, which includes the pump-reliability application above, appears to be faster than the approach of Murdoch and Green.

For the continuous distributions there is an even faster method, which also takes the numerical error ε to zero.

The aforementioned sandwiching methods are the ones introduced for antimonotone CFTP by Kendall and Häggström-Nelander.

Morten Fismen. Exact simulation using Markov chains. Technical Report 6/98, Institutt for Matematiske Fag, 1998. Diploma-thesis. (Postscript.)

Abstract: This reports gives a review of the new exact simulation algorithms using Markov chains. The first part covers the discrete case. We consider two different algorithms, Propp and Wilsons ``coupling from the past'' (CFTP) technique and Fills rejection sampler. The algorithms are tested on the Ising model, with and without an external field. The second part covers continuous state spaces. We present several algorithms developed by Murdoch and Green, all based on ``coupling from the past''. We discuss the applicability of these methods on a Bayesian analysis problem of surgical failure rates.

Peter J. Green and Duncan J. Murdoch. Exact sampling for Bayesian inference: towards general purpose algorithms (with discussion). In J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith, editors, Bayesian Statistics 6, pages 301--321. Oxford University Press, 1999. Presented as an invited paper at the 6th Valencia International Meeting on Bayesian Statistics, Alcossebre, Spain, June 1998. Preprint.

Abstract: Propp and Wilson (1996) described a protocol, called coupling from the past, for exact sampling from a target distribution using a coupled Markov chain Monte Carlo algorithm. In this paper we discuss the implementation of coupling from the past for samplers on a continuous state space; our ultimate objective is Bayesian MCMC with guaranteed convergence. We make some progress towards this objective, but our methods are still not automatic or generally applicable.

Per author's description: Describes several coupling techniques aimed at routine application in the context of Bayesian inference using random walk Metropolis: the ``bisection coupler'', calculation of automatic cell bounds and the adaptation of the Kendall (1996)/Haggstrom et al (1996) idea of bounded CFTP. Includes simulations of the posterior in two 2-parameter blood type datasets with a Dirichlet prior, a 3 dimensional Dirichlet distribution, and a N(0,1) distribution.

Michael Harvey. (Supervised by Radford Neal) Monte Carlo inference for belief networks using coupling from the past. Master's thesis, department of computer science, University of Toronto, 1999. (Postscript, PDF.)

Abstract: A common method of inference for belief networks is Gibbs sampling, in which a Markov chain converging to the desired distribution is simulated. In practice, however, the distribution obtained with Gibbs sampling differs from the desired distribution by an unknown error, since the simulation time is finite. Coupling from the past selects states from exactly the desired distribution by starting chains in every state at a time far enough back in the past that they reach the same state at time t= 0. To track every chain is an intractable procedure for large state spaces. The method proposed in this thesis uses a summary chain to approximate the set of chains. Transitions of the summary chain are efficient for noisy-or belief networks, provided that sibling variables of the network are not directly connected, but often require more simulation time steps than would be needed if chains were tracked exactly. Testing shows that the method is a potential alternative to ordinary Gibbs sampling, especially for networks that have poor Gibbs sampling convergence, and when the user has a low error tolerance.

Bas Straatman. Exact sampling and applications to a mite dispersal model, 1998. Master's thesis, Utrecht.

This thesis does not itself come with an abstract, but the abstract for a talk given by Straatman on the topic follows.
Abstract: I have studied a Markov chain which describes the dispersal of mites on cassava fields in West Africa. Instead of obtaining a sample of this model with approximately the stationary distribution by Monte Carlo simulation, it is possible to obtain a result which is exactly distributed according to the stationary distribution. One way to do this is coupling from the past. After a short introduction of this procedure I will explain the changes I have made to the original model in order to be able to use coupling from the past and show some simulations which give exact samples.

> A. Mira, J. Møller, and G. O. Roberts. Perfect slice samplers. Journal of the Royal Statistical Society, Ser. B, 63(3):593--606, 2001. (Postscript.)

Abstract: Perfect sampling allows exact simulation of random variables from the stationary measure of a Markov chain. By exploiting monotonicity properties of the slice sampler we show that a perfect version of the algorithm can be easily implemented, at least when the target distribution is bounded. Various extensions, including perfect product slice samplers, and examples of applications are discussed.

Administrative arrangements

The workshop will be held in the Old Library, Darwin College. (Travel instructions)

In emergencies: Darwin porters lodge = 01223 335660, David MacKay's mobile is 07952 640 466.

There will be a £10 registration charge (for waged people) to cover coffee, tea, and lunch. (Students free.)

At lunch, please write workshop on your chit, and sit at the right-hand table, which will have cutlery and glasses laid out for us

Lunch menu
Fruit juice Scotch Broth
Lamb Goulash
Gnochi Parisienne
Tricolour Pasta
Caesar Salad
Fresh Pineapple Cheesecake
or (vegetarian)
Curried Vegetables
Braised Brown Rice
Salad bar

Site last modified Wed Jan 7 18:25:42 GMT 2004