This HTML presentation is intended to be viewed in a web browser or device which can load external libraries. In particular, it requires slides.js, slides.css and MathJax. \[ \def\set#1{\class{hl}{\mathcal{#1}}} \def\bset#1{\class{hl}{\left\{#1\right\}}} \def\ensuremath#1{#1} \def\civ#1#2{\class{red}{[#1,#2]}} \def\mul{\cdot} \def\assign{\leftarrow} \def\next#1{\text{next}\left(#1\right)} \def\mdots{.\!.\!.} \def\|{\ \middle|\ } \def\p#1{\text{P}\left(#1\right)} \def\cuma#1{\class{hl}{{#1}_{\Sigma}}} \def\dist#1{\class{hl}{\text{#1}}} \def\Uniform#1{\dist{Uniform}\left(#1\right)} \def\one#1{\mathbb{\mathbf{1}}\!\left[{#1}\right]} \def\setsize#1{\ensuremath{\left|#1\right|}} \def\PP#1{P\left(#1\right)} \def\QQ#1{Q\left(#1\right)} \def\txtPP{P} \def\txtQQ{Q} \]

Advanced Sampling
A Slide Presentation on advanced sampling methods.
(Christian Steinruecken, Tomoharu Iwata)

Use arrow keys, PgUp, PgDn to move between slides, or press 'dot' to toggle the toolbar.
Typing 'HELP' + return jumps to a help screen.

click me

A histogram sampled from a negative binomial distribution.
\[ G \sim \text{DP}(\alpha=50, G_0 = \text{Uniform}) \]


Why sampling?

"That's what the brain does."
-- Máté

Looking at samples from a model can help to understand the model's properties.

Probabilistic Inference

\[ \PP{x \| y}\ =\ \frac{\PP{y \| x} \mul \PP{x}}{\PP{y}} \] where: \[ \PP{y}\ =\ \int \PP{y \| x} \PP{x}\ dx\ =\ Z \]
Cunningham (2011)

Approximate Inference Taxonomy

Many problems of interest in inference can be written as an integral of type:

\( \displaystyle \int f(x,y)\ dx \)
In practice, these integrals can rarely be evaluated exactly.
Cunningham (2011)

Approximate Inference Taxonomy

\( \displaystyle \int f(x,y)\ dx \)

\(\displaystyle \approx \sum_i f(x^{(i)},y) \)
  1. Replace hard integrals with summations.
  2. Sampling methods
  3. Central problem: how to sample \(x^{(i)}\)
  4. Monte Carlo, MCMC, Gibbs, etc.
\(\displaystyle \approx \int g(x,y) \ dx \)
  1. Replace hard integrals with easier integrals.
  2. Message passing on factor graph
  3. Central problem: how to find \(g(\cdot) \in \set{G}\)
  4. VB, EP, etc.
\(\displaystyle \approx \int h(x,y; x^*) \ dx \)
  1. Replace hard integrals with estimators.
  2. "Non-Bayesian" methods
  3. Central problem: how to find \(x^*\)
  4. MAP, ML, Laplace, etc.


Samplers are mechanisms for obtaining random samples from a chosen probability distribution.


Where does randomness come from?

Sources of natural randomness:

There are many kinds of hardware random number generators. E.g. the UK government's ERNIE. Thomson (1959)

Random sampling: hardware solution

A radioactive Caesium-137 probe next to a Geiger-Müller detector.
\[ ^{137}\mathrm{Cs}\ \stackrel{\scriptscriptstyle 30.17\mathrm{y}}{\longrightarrow}\ ^{137\mathrm{m}}\mathrm{Ba}+\beta^-+\overline{\nu_e}\ \stackrel{\scriptscriptstyle 156\mathrm{s}}{\longrightarrow}\ ^{137}\mathrm{Ba}+\gamma \] The number of β-particles arriving in fixed time intervals is Poisson-distributed.
Interface this to a serial port and extract the randomness with a computer.
Instructions: Walker (1996).

Randomness extraction

Natural phenomena usually produce weakly random or biased random events.

A randomness extractor attempts to convert weak randomness into strong, unbiased random numbers.

In general, this can be surprisingly difficult. Gifford (1988)


Suppose you're given a series of coin flips of unknown bias \(\theta\).
Can you use these to create fair random bits?
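One classic answer is von Neumann's trick (von Neumann, 1951): consume the flips in pairs, emit a bit for HT or TH, and discard HH and TT. Each surviving pair is unbiased because P(HT) = P(TH) = θ(1−θ). A minimal sketch (class and method names are ours):

```java
import java.util.ArrayList;
import java.util.List;

/** Von Neumann's extractor: turns flips of a coin with unknown
  * bias theta into fair bits.  Flips are consumed in pairs;
  * HT emits 1, TH emits 0, and HH or TT emit nothing. */
public class VonNeumann {
  public static List<Boolean> extract(boolean[] flips) {
    List<Boolean> out = new ArrayList<Boolean>();
    for (int i = 0; i + 1 < flips.length; i += 2) {
      if (flips[i] != flips[i+1]) {
        out.add(flips[i]);   // the first flip of an unequal pair decides the bit
      }                      // equal pairs carry no usable randomness; discard
    }
    return out;
  }
}
```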

Given a supply of fair random bits, can you create fair dice rolls?
\(r \in\) { ⚀ , ⚁ , ⚂ , ⚃ , ⚄ , ⚅ }
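One standard construction uses rejection: draw 3 fair bits to form a number in 0..7, and redraw whenever the result falls outside 0..5. Each accepted outcome then has probability exactly 1/6 (a sketch; the class name is ours):

```java
import java.util.Random;

/** A fair die roll from fair bits: 3 bits give a uniform value in
  * 0..7; values 6 and 7 are rejected and redrawn, leaving each of
  * the remaining 6 outcomes with probability exactly 1/6. */
public class DieFromBits {
  public static int roll(Random rnd) {
    while (true) {
      int v = (rnd.nextInt(2) << 2) | (rnd.nextInt(2) << 1) | rnd.nextInt(2);
      if (v < 6) { return v + 1; }   // report die faces 1..6
    }
  }
}
```

On average the loop draws 8/6 ≈ 1.33 triples of bits per roll.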


Suppose people leave a train one at a time. How do we choose one person uniformly at random, if the total number of people is not known in advance?

Reservoir Sampling

A reservoir algorithm samples \(K\) elements uniformly at random from a sequence \(x_1 \dots x_N\) of unknown length \(N\), using at most 1 pass through the sequence.


Algorithm R, published in Section 3.4.2 of Knuth (1969).
More detailed treatment is given by Vitter (1985) and also Devroye (1986).

Simplest case: \(K=1\).
Vitter (1985)

Reservoir Sampling: Code

  /** Reservoir-samples an element, uniformly at random.
    * Reservoir sampling has linear time complexity, but avoids
    * the need to compute the collection's size in advance.
    * If the specified collection is empty, null is returned.
    * @param rnd random source
    * @param col iterable collection of elements
    * @return a uniformly random element */
  public static <X> X rsvsample(Random rnd, Iterable<X> col) {
    X e = null;
    int k = 0;   /* elements seen so far */
    for (X x : col) {
      k++;
      e = (rnd.nextInt(k) == 0) ? x : e;   /* keep x with probability 1/k */
    }
    return e;
  }

Sources of deterministic randomness

Pseudo-random. Basically "non-random", but good enough in practice.


Sampling algorithms

Metropolis & Ulam (1949)

Monte-Carlo Methods

Area under the curve method

Sample points uniformly at random from the area under the curve.
Discard the \(y\) coordinates, and return the \(x\) coordinates as fair samples.

Inverse CDF method
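The inverse CDF method passes a uniform variate \(u \sim \Uniform{0,1}\) through the inverse of the target CDF \(F\); then \(F^{-1}(u)\) is distributed according to \(F\). A sketch for the exponential distribution, where the inverse is available in closed form (our illustrative choice):

```java
import java.util.Random;

/** Inverse CDF method: if u ~ Uniform(0,1) and F is the target CDF,
  * then F^{-1}(u) has distribution F.  For the exponential
  * distribution F(x) = 1 - exp(-lambda*x), so
  * F^{-1}(u) = -log(1-u)/lambda. */
public class InverseCdf {
  public static double exponential(Random rnd, double lambda) {
    double u = rnd.nextDouble();           // u in [0,1)
    return -Math.log(1.0 - u) / lambda;    // never log(0), since u < 1
  }
}
```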

Sampling from discrete distributions

Suppose we have some discrete distribution:

\(\p{x}\) over elements \(x \in \set{X}\)

and we want millions of samples from it.

Linear sampling

This method implicitly computes the cumulative distribution \(\cuma{P}(x)\).

Time complexity: O(N).
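A minimal sketch of linear sampling (the class name is ours): walk through the probabilities, accumulating the cumulative sum \(\cuma{P}\) until it exceeds a uniform draw.

```java
import java.util.Random;

/** Linear sampling: accumulate the cumulative distribution on the
  * fly until it exceeds a uniform draw.  O(N) per sample. */
public class LinearSampler {
  public static int sample(Random rnd, double[] p) {
    double u = rnd.nextDouble();
    double cum = 0.0;
    for (int i = 0; i < p.length; i++) {
      cum += p[i];
      if (u < cum) { return i; }
    }
    return p.length - 1;   // guard against rounding error at the top end
  }
}
```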

Decision tree

Time complexity: O(log N).
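One way to realise the decision tree is a binary search over a precomputed cumulative array (a sketch; the class name is ours): building the table costs O(N) once, after which each sample is O(log N).

```java
import java.util.Random;

/** Samples in O(log N) by binary search over the precomputed
  * cumulative distribution. */
public class TreeSampler {
  private final double[] cum;

  public TreeSampler(double[] p) {
    cum = new double[p.length];
    double s = 0.0;
    for (int i = 0; i < p.length; i++) { s += p[i]; cum[i] = s; }
  }

  public int sample(Random rnd) {
    double u = rnd.nextDouble() * cum[cum.length-1];
    int lo = 0, hi = cum.length - 1;
    while (lo < hi) {                 // find the first index with cum[i] > u
      int mid = (lo + hi) / 2;
      if (cum[mid] > u) { hi = mid; } else { lo = mid + 1; }
    }
    return lo;
  }
}
```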

Array method

Suppose the following distribution over \(\bset{\text{A,B,C}}\):
To sample from this distribution:
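When all probabilities are multiples of \(1/M\), one can fill an array of length \(M\) with each element repeated proportionally to its probability, then draw a uniform array index in O(1). A sketch (the concrete probabilities below, P(A)=1/2, P(B)=1/4, P(C)=1/4, are an illustrative stand-in for the distribution in the slide's figure):

```java
import java.util.Random;

/** Array method: each element occupies a number of cells
  * proportional to its probability, so a uniform index is an
  * exact sample.  Here P(A)=1/2, P(B)=1/4, P(C)=1/4. */
public class ArrayMethod {
  static final char[] TABLE = {'A', 'A', 'B', 'C'};

  public static char sample(Random rnd) {
    return TABLE[rnd.nextInt(TABLE.length)];
  }
}
```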

Array method

Proof of concept that it's possible to sample faster than O(log N).
We can do better... :-)
Walker (1977)

Alias Sampling

We can build a fast sampler by modulating the amplitude of a uniform distribution.
Walker (1977)

Alias Sampling

Given a distribution \(\dist{P}\) over \(N\) elements \(x_1 \mdots x_N \in \set{X}\):

Walker (1977)
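A sketch of Walker's alias method, here in Vose's formulation (the class name is ours): each of the \(N\) table cells holds an acceptance probability and an "alias" element, so a sample costs one uniform integer plus one biased coin flip, i.e. O(1) regardless of \(N\). Building the table is O(N).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;

/** Alias sampling (Walker 1977; table construction after Vose).
  * Cells with less than average mass borrow the remainder from an
  * "alias" element with more than average mass. */
public class AliasSampler {
  private final double[] prob;   // per-cell acceptance probabilities
  private final int[] alias;     // fallback element per cell

  public AliasSampler(double[] p) {
    int n = p.length;
    prob = new double[n];
    alias = new int[n];
    double[] scaled = new double[n];
    Deque<Integer> small = new ArrayDeque<Integer>();
    Deque<Integer> large = new ArrayDeque<Integer>();
    for (int i = 0; i < n; i++) {
      scaled[i] = p[i] * n;
      if (scaled[i] < 1.0) small.push(i); else large.push(i);
    }
    while (!small.isEmpty() && !large.isEmpty()) {
      int s = small.pop(), l = large.pop();
      prob[s] = scaled[s];
      alias[s] = l;                      // s's leftover mass is taken from l
      scaled[l] -= 1.0 - scaled[s];
      if (scaled[l] < 1.0) small.push(l); else large.push(l);
    }
    while (!large.isEmpty()) prob[large.pop()] = 1.0;
    while (!small.isEmpty()) prob[small.pop()] = 1.0;  // numerical slack
  }

  public int sample(Random rnd) {
    int i = rnd.nextInt(prob.length);                  // pick a cell
    return (rnd.nextDouble() < prob[i]) ? i : alias[i];
  }
}
```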

Alias Sampling: Summary

It's the fastest known method for sampling from discrete distributions.

Further reading:

Arithmetic Coding

Arithmetic coding offers a way of deterministically transducing between probability distributions.

This is used e.g. in data compression to transduce between:
Arithmetic decoding of uniform random bits generates independent random samples of the target distribution.

More about arithmetic coding: data compression slides (2012)

Using a proposal distribution

We want independent samples \(x_1 \dots x_N\) from a distribution \(\txtPP\).

\(\PP{x}\) difficult distribution, cannot sample directly
\(\QQ{x}\) simple distribution, can sample directly

We can use \(\txtQQ\) to help us sample from \(\txtPP\).

Rejection Sampling

  1. Sample \(x \sim \txtQQ\)
  2. Sample \(u\) uniformly from the interval \([0 , c \mul \QQ{x} ]\)
  3. If \(u \lt \PP{x}\), accept. Otherwise, reject and go back to step 1.
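The three steps can be sketched in Java; the target density \(p(x) = 2x\) on \([0,1]\), the proposal \(\txtQQ = \Uniform{0,1}\), and the envelope constant \(c = 2\) are our illustrative choices (so that \(c \mul \QQ{x} \ge \PP{x}\) everywhere):

```java
import java.util.Random;

/** Rejection sampling: propose x ~ Q, draw u uniformly on
  * [0, c*Q(x)], accept if u < P(x).  Target here: p(x) = 2x on
  * [0,1]; proposal: Uniform(0,1) with density q(x) = 1; c = 2. */
public class RejectSampler {
  public static double sample(Random rnd) {
    while (true) {
      double x = rnd.nextDouble();           // step 1: x ~ Q
      double u = rnd.nextDouble() * 2.0;     // step 2: u ~ Uniform[0, c*q(x)]
      if (u < 2.0 * x) { return x; }         // step 3: accept if u < p(x)
    }
  }
}
```

For this pair the acceptance rate is 1/c = 1/2; the expectation of the target is \(\int_0^1 x \mul 2x\ dx = 2/3\).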

Importance Sampling

Rejecting samples seems wasteful, if we only want an expectation. Importance sampling solves this as follows: \[ \Phi = \int \PP{x}\ f(x)\ dx = \int \class{red}{\QQ{x}} \frac{\PP{x}} {\class{red}{\QQ{x}}} f(x)\ dx \] Method:
  1. Draw samples \(x_1 \dots x_N \sim \txtQQ\)
  2. Compute:
    \(\displaystyle \hat{\Phi} = \frac1{\class{blue}{W}} \mul \sum_{n=1}^N f(x_n) \mul \frac{\PP{x_n}}{\QQ{x_n}} \) where \(\displaystyle \class{blue}{W = \sum_{n=1}^N \frac{\PP{x_n}}{\QQ{x_n}}} \)
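The two steps above can be sketched in Java; the target \(\txtPP = \dist{Normal}(0,1)\), proposal \(\txtQQ = \Uniform{-5,5}\), and \(f(x) = x^2\) are our illustrative choices, for which the true value is \(\Phi = 1\):

```java
import java.util.Random;

/** Importance sampling of Phi = E_P[f(x)] with self-normalised
  * weights W.  Target P: standard normal; proposal Q: Uniform(-5,5)
  * with density 0.1; f(x) = x^2, so the true value is Var(x) = 1. */
public class ImportanceSampler {
  static double p(double x) { return Math.exp(-x*x/2) / Math.sqrt(2*Math.PI); }
  static double q(double x) { return 0.1; }        // Uniform(-5,5) density
  static double f(double x) { return x*x; }

  public static double estimate(Random rnd, int n) {
    double sum = 0.0, W = 0.0;
    for (int i = 0; i < n; i++) {
      double x = -5.0 + 10.0 * rnd.nextDouble();   // step 1: x ~ Q
      double w = p(x) / q(x);                      // importance weight
      sum += f(x) * w;                             // step 2: accumulate f*w
      W += w;
    }
    return sum / W;   // Phi-hat = (1/W) * sum_n f(x_n) P(x_n)/Q(x_n)
  }
}
```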

Importance Sampling

The distributions \(\txtPP\) or \(\txtQQ\) need not be normalised for importance sampling to work. \[ \hat{\Phi}\ =\ \frac{1}{N} \sum_{n=1}^N f(x_n) \mul \frac{Z_Q}{Z_P} \mul \frac{\PP{x_n}^*}{\QQ{x_n}^*} \] But note that: \[ \begin{eqnarray*} \frac{Z_P}{Z_Q} &\ =\ & \int \QQ{x} \frac{\PP{x}^*}{\QQ{x}^*} \ dx \\ &\ \approx\ & \frac{1}{N} \mul \sum_{n=1}^N \frac{\PP{x_n}^*}{\QQ{x_n}^*} \end{eqnarray*} \]
Marsaglia & Tsang (1984)

The Ziggurat algorithm

\(\def\xold{x_{\text{old}}} \def\xnew{x_{\text{new}}} \def\xprop{x} \) Metropolis (1953)

The Metropolis Method

The Metropolis Method uses a conditional proposal distribution \(\QQ{ x \|x'}\) which depends on a previous sample \(x'\).

  1. Draw \( x \sim \QQ{ x \| \xold } \).
  2. Compute acceptance factor \(\displaystyle a = \frac{\PP{x}}{\PP{\xold}}\)
  3. If \(a \ge 1\), then accept \( \xnew \assign x \).
    If \(a \lt 1\), then accept with probability \(a\),
    otherwise reject by setting \( \xnew \assign \xold \).
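The steps above can be sketched for a standard normal target with a symmetric Gaussian random-walk proposal (our illustrative choices); since \(\txtQQ\) is symmetric, the acceptance factor is just \(\PP{x}/\PP{\xold}\), and an unnormalised density suffices:

```java
import java.util.Random;

/** Metropolis sampling of a standard normal target using the
  * symmetric random-walk proposal Q(x|x') = Normal(x', 1). */
public class MetropolisDemo {
  static double pStar(double x) { return Math.exp(-x*x/2); }  // unnormalised target

  public static double[] chain(Random rnd, double x0, int n) {
    double[] xs = new double[n];
    double x = x0;
    for (int i = 0; i < n; i++) {
      double prop = x + rnd.nextGaussian();      // step 1: x ~ Q(.|x_old)
      double a = pStar(prop) / pStar(x);         // step 2: acceptance factor
      if (a >= 1 || rnd.nextDouble() < a) {      // step 3: accept or reject
        x = prop;
      }                                          // on rejection, x_new = x_old
      xs[i] = x;
    }
    return xs;
  }
}
```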

Hastings (1970) Metropolis (1953)


For the Metropolis method to work, the proposal distribution must be symmetric:

\[ \QQ{x \| x'} \ =\ \QQ{x' \| x} \]
However, for non-symmetric \(\txtQQ\), we can correct for the bias by adjusting the acceptance factor:
\[ a \ =\ \frac{\PP{\xprop}}{\PP{\xold}} \mul \class{hl}{ \frac{\QQ{\xold \| \xprop}}{\QQ{\xprop \| \xold}} } \]

Hastings (1970) Metropolis (1953)


Some remarkable properties:
Important caveats:
Hastings (1970) Metropolis (1953)


Samples from a Metropolis-MCMC chain are not independent, but it is not necessary to thin them if we want an expectation.

Simply compute: \[ \hat{\Phi} \ =\ \frac{1}{N} \sum_{n=1}^N f(x_n) \] over all samples.

(Still, the chain must run long enough.)
Geman & Geman (1984)

Gibbs Sampling

Algorithm:\(\def\rr#1{\class{magenta}{#1}} \def\gg#1{\class{hl}{#1}}\)
  1. \(\gg{x_{1}^{(\tau+1)}} \sim \PP{ x_{1} \| \rr{ x_{2}^{(\tau)}}, \rr{ x_{3}^{(\tau)} }, \rr{\cdots}, \rr{ x_{M}^{(\tau)}} } \)
  2. \(\gg{x_{2}^{(\tau+1)}} \sim \PP{ x_{2} \| \gg{ x_{1}^{(\tau+1)}}, \rr{x_{3}^{(\tau)}}, \rr{\cdots}, \rr{x_{M}^{(\tau)}} } \)
  3. \(\vdots\)
  4. \(\gg{x_{M}^{(\tau+1)}} \sim \PP{ x_{M} \| \gg{x_{1}^{(\tau+1)}}, \gg{x_{2}^{(\tau+1)}}, \gg{\cdots}, \gg{x_{M-1}^{(\tau+1)}} } \)
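The sweep above can be sketched for a bivariate Gaussian (our illustrative choice), with zero means, unit variances and correlation \(\rho\), whose full conditionals are known in closed form: \(x_i \| x_j \sim \dist{Normal}(\rho x_j,\, 1-\rho^2)\).

```java
import java.util.Random;

/** Gibbs sampling for a bivariate Gaussian with correlation rho:
  * each step resamples one coordinate from its full conditional,
  * always conditioning on the freshest value of the other. */
public class GibbsDemo {
  public static double[][] chain(Random rnd, double rho, int n) {
    double[][] xs = new double[n][2];
    double x1 = 0.0, x2 = 0.0;
    double sd = Math.sqrt(1.0 - rho*rho);
    for (int t = 0; t < n; t++) {
      x1 = rho * x2 + sd * rnd.nextGaussian();  // x1 ~ P(x1 | x2)
      x2 = rho * x1 + sd * rnd.nextGaussian();  // x2 ~ P(x2 | new x1)
      xs[t][0] = x1; xs[t][1] = x2;
    }
    return xs;
  }
}
```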

Geman & Geman (1984)

Gibbs Sampling

Hamiltonian Monte Carlo

Problems in Metropolis sampling

MacKay (2003)

HMC (Hamiltonian or Hybrid Monte Carlo)

Hamiltonian Monte Carlo

\(E(x)\) is the potential energy

Hamiltonian Monte Carlo: Procedures

  1. Draw a momentum from a Gaussian: \(r\sim\exp(-K(r))/Z_K\)
  2. Simulate Hamiltonian dynamics by \(L\) leapfrog steps \[ r(t+\epsilon/2)=r(t)-\frac{\epsilon}{2}\frac{\partial E(x(t))}{\partial x} \] \[ x(t+\epsilon)=x(t)+\epsilon r(t+\epsilon/2) \] \[ r(t+\epsilon)=r(t+\epsilon/2)-\frac{\epsilon}{2}\frac{\partial E(x(t+\epsilon))}{\partial x} \]
  3. Accept the state \((x',r')\) reached after the leapfrog steps with probability \[ \min(1,\exp[H(x,r)-H(x',r')]) \]
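The three steps can be sketched for a 1-D standard normal target (our illustrative choice), i.e. potential energy \(E(x)=x^2/2\) with gradient \(\partial E/\partial x = x\), kinetic energy \(K(r)=r^2/2\), and \(H = E + K\):

```java
import java.util.Random;

/** One HMC transition for a 1-D standard normal target:
  * draw a momentum, run L leapfrog steps of size eps, then
  * Metropolis-accept on the total energy H = E + K. */
public class HmcDemo {
  static double dE(double x) { return x; }                  // gradient of E
  static double H(double x, double r) { return x*x/2 + r*r/2; }

  public static double step(Random rnd, double x, double eps, int L) {
    double r = rnd.nextGaussian();                // 1. momentum ~ exp(-K)/Z_K
    double x0 = x, r0 = r;
    for (int i = 0; i < L; i++) {                 // 2. leapfrog steps
      r -= eps/2 * dE(x);
      x += eps * r;
      r -= eps/2 * dE(x);
    }
    double a = Math.exp(H(x0, r0) - H(x, r));     // 3. accept/reject
    return (rnd.nextDouble() < Math.min(1.0, a)) ? x : x0;
  }
}
```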

Hamiltonian Monte Carlo: Summary

Parallel Tempering

Sampling from a multi-modal distribution

[Hukushima, 2003]
Swendsen and Wang (1986)

Parallel Tempering

Parallel Tempering

Parallel Tempering: Summary

Particle Filter

Caesarendra et al. 2010

Particle Filter: Procedure

  1. Draw \(S\) particles from the proposal \[ x_k^{(s)} \sim \pi(x_k|x_{1:k-1}^{(s)},y_{1:k-1}) \]
  2. Update the importance weights, and normalize them \[ w_{k}^{(s)}=w_{k-1}^{(s)}\frac{p(y_k|x_{k}^{(s)})p(x_{k}^{(s)}|x_{k-1}^{(s)})}{\pi(x_k|x_{1:k-1}^{(s)},y_{1:k-1})} \]
  3. Resample particles from the current particle set proportional to their weights
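The three steps can be sketched for a toy linear-Gaussian state-space model (our illustrative choice): \(x_k = 0.9\,x_{k-1} + v\), \(y_k = x_k + e\) with unit noise. Using the transition prior as the proposal \(\pi\) (a "bootstrap" filter), the weight update reduces to \(w \propto p(y_k \| x_k)\):

```java
import java.util.Random;

/** One bootstrap particle filter step for the model
  * x_k = 0.9*x_{k-1} + v, y_k = x_k + e, with v, e ~ Normal(0,1):
  * propose from the transition, weight by the likelihood, resample. */
public class BootstrapFilter {
  static double likelihood(double y, double x) {
    double d = y - x;
    return Math.exp(-d*d/2);   // Gaussian likelihood, constants cancel
  }

  public static double[] step(Random rnd, double[] particles, double y) {
    int S = particles.length;
    double[] prop = new double[S];
    double[] w = new double[S];
    double wsum = 0.0;
    for (int s = 0; s < S; s++) {
      prop[s] = 0.9 * particles[s] + rnd.nextGaussian();  // 1. propose
      w[s] = likelihood(y, prop[s]);                      // 2. weight
      wsum += w[s];
    }
    double[] next = new double[S];                        // 3. resample
    for (int s = 0; s < S; s++) {
      double u = rnd.nextDouble() * wsum;
      double cum = 0.0;
      int j = 0;
      while (j < S-1 && (cum += w[j]) < u) { j++; }
      next[s] = prop[j];
    }
    return next;
  }
}
```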

Particle Filter: Summary

Neal (2000)

Slice Sampling

An MCMC instance of the area under the curve method:

Given a point \(x\), sample a \(y\)-coordinate uniformly between \(0\) and \(\PP{x}\). Then sample a new point \(x'\) from the slice through the distribution at height \(y\).
Neal (2000)

Slice Sampling: Code

/* $Id: sampling.html,v 1.59 2013/08/13 21:16:23 chris Exp $ */
/* Author: Christian Steinruecken */

import java.util.Random;
import java.util.Vector;
import java.util.Collection;

/** Implements a slice sampler.
  * <p>Slice sampling is a Markov chain Monte Carlo method which
  * produces samples from a probability density <var>f</var>.
  * This is done by constructing a uniform random walk over the
  * <i>area</i> under <var>f</var>.
  * (For details, please see the techreport cited below.)</p>
  * <p>This implementation works for distributions on the 1-dimensional
  * real line.  Bounding intervals for the slices are found by starting
  * from an initial interval of length <var>w</var>, which is repeatedly
  * doubled until both ends of the interval lie outside the area under
  * <var>f</var>.</p>
  * <b>References:</b><ul>
  * <li><a name="neal2000b">Radford M. Neal.&nbsp; <i>Slice Sampling,</i>
  * 2000-08-29. Technical Report, No. 2005. Departments of Statistics and
  * Computer Science, University of Toronto.
  * [<a href="">PDF</a>]
  * </a></li></ul>
  * @see Metropolis
  * @see RejectSampler */
public class SliceSampler implements Sampler<Double> {

  /** Target distribution. */
  Distribution<Double> dist = null;

  /** Step size. */
  double w = 1.0;

  /** Maximal number of times a slice's bounding interval may be doubled.
    * This setting limits the maximum slice width to
    * <var>w</var>*2^<var>maxk</var>. */
  int maxk = 64;

  /** Current sample. */
  Double x = null;

  /** Current slice. */
  double y = -1;

  /** Lower end of bounding interval. */
  double a = -w/2;

  /** Upper end of bounding interval. */
  double z = +w/2;

  /** Constructs a new interval doubling slice sampler for
    * distribution <i>dist</i>, starting point <var>x</var>
    * and initial interval size <var>w</var>.
    * @param dist distribution to be sampled from
    * @param w initial interval size (typical slice width)
    * @param x starting point */
  public SliceSampler(Distribution<Double> dist, Double x, double w) {
    assert (w > 0.0);
    this.dist = dist;
    this.x = x;
    this.w = w;
  }

  /** Constructs a new interval doubling slice sampler for
    * distribution <i>dist</i> and starting point <var>x</var>.
    * @param dist distribution to be sampled from
    * @param x starting point */
  public SliceSampler(Distribution<Double> dist, Double x) {
    this.dist = dist;
    this.x = x;
  }

  /** Checks acceptance of a proposed point <var>newx</var>
    * when interval (<var>a</var>,<var>z</var>) was found by
    * doubling procedure. */
  protected boolean acceptable(double newx) {
    // current point is still x
    boolean det = false;
    while (z-a > 1.1*w) {
      double m = (a+z)/2;
      if ((x < m && newx >= m) || (x >= m && newx < m)) {
        det = true;
      }
      if (newx < m) {
        a = m;
      } else {
        z = m;
      }
      if (det && y >= dist.p(z) && y >= dist.p(a)) {
        return false;
      }
    }
    return true;
  }

  /** This method finds a bounding interval of the current
    * slice by Neal's "doubling" procedure. */
  protected void doubleBounder(Random rnd) {
    // find slice bounds (by "doubling" procedure)
    // position a window [a,z] of size w randomly around x
    a = x - Samplers.uniform(rnd,w);
    z = a+w;
    // probability density at the endpoints
    double ap = dist.p(a);
    double zp = dist.p(z);
    // expand until slice [a,z] is acceptable
    int k = maxk;
    while (k > 0 && (y < ap || y < zp)) {
      if (Samplers.flip(rnd)) {
        a -= z-a; ap = dist.p(a);
      } else {
        z += z-a; zp = dist.p(z);
      }
      k--;
    }
  }

  /** This method finds a bounding interval of the current
    * slice by Neal's "stepping" procedure. */
  protected void stepBounder(Random rnd) {
    // position a window [a,z] of size w randomly around x
    a = x - Samplers.uniform(rnd,w);
    z = a+w;
    // probability density at the endpoints
    int m = 1000;
    int j = rnd.nextInt(m);
    int k = m-1-j;
    while (j > 0 && y < dist.p(a)) { a -= w; j--; }
    while (k > 0 && y < dist.p(z)) { z += w; k--; }
  }

  /** Get a sample using slice sampling. */
  public Double sample(Random rnd) {
    // current point: x
    // sample y uniformly between 0 and p(x)
    y = Samplers.uniform(rnd)*dist.p(x);
    // find slice bounds
    //stepBounder(rnd);
    doubleBounder(rnd);
    //a = Double.NEGATIVE_INFINITY;
    //z = Double.POSITIVE_INFINITY;
    // now sample a point on slice [a,z]
    Double newx = Samplers.uniform(rnd,a,z);
    while (dist.p(newx) <= y || !acceptable(newx)) {
      //while (dist.p(newx) < y) {
      //System.out.println("["+a+","+z+"] x="+x+" newx="+newx);
      // shrink interval
      if (newx < x) { a = newx; } else { z = newx; }
      // resample newx
      newx = Samplers.uniform(rnd,a,z);
    }
    //System.out.println("["+a+","+z+"] newx="+newx
    //                  +", shrinks: "+shrinks+", grows: "+grows);
    x = newx;
    return newx;
  }

  /** Get samples using <var>n</var> steps of slice sampling.
    * This method runs for <var>n</var> iterations and adds
    * all obtained samples to the specified collection. */
  public void sample(Random rnd, int n, Collection<Double> samples) {
    for (int j=0; j<n; j++) {
      samples.add(sample(rnd));
    }
  }

  /** Runs a simple slice sampling demo. */
  public static void main(String[] args) {
    Random r = new Random();
    Gaussian g1 = new Gaussian(-2.0, 1);
    Gaussian g2 = new Gaussian(2.0, 1);
    Mixture<Double> d = new Mixture<Double>(0.2,g1,g2);
    //Beta dist = new Beta(4.5,1.75);
    //LogitBeta dist = new LogitBeta(0.5,0.5);
    double range_a = -5.0;   // the interval we're histogramming
    double range_z = +5.0;
    int n = 100000;          // number of samples
    System.out.println("SLICE SAMPLING DEMO");
    System.out.println("Distribution: "+d);
    System.out.println();
    Histogram hd = Histogram.fromDoubles(d, range_a, range_z, 70, n);
    System.out.println("Sampled directly ("+n+" samples):");
    hd.print(System.out,8);
    System.out.println();
    SliceSampler ss = new SliceSampler(d, -2.0, 10);
    System.out.println("Sampled with slice sampling ("+n+" steps):");
    Histogram hr = Histogram.fromDoubles(ss, range_a, range_z, 70, n);
    hr.print(System.out,8);
  }
}

Perfect Sampling

Given a source of random bits, a perfect sampler constructs provably independent samples.

Methods for getting perfect samples:
Propp & Wilson (1996)

Coupling from the past

A method for getting an exact sample from an ergodic Markov chain (e.g. Metropolis MCMC).

Propp & Wilson (1996)

Coupling from the past

How to simulate a chain from the infinite past:

Propp & Wilson (1996)

Coupling from the past

Mira & al. (2001)

Perfect Slice Samplers

Combines coupling from the past with slice sampling.

Estimating Z

Skilling (2006) Skilling (2004)

Nested Sampling

See also MacKay (2006) for a seminar presentation on Nested Sampling.


Tutorial Summary:

Further reading

Freely available text books:
Other presentations:


Luc Devroye.  Non-Uniform Random Variate Generation, 1986. Book. School of Computer Science, McGill University. Springer. [Website] [PDF]
Luc Devroye.  Random Sampling, 1986. Chapter 12 in Non-Uniform Random Variate Generation. School of Computer Science, McGill University. Springer. [PDF]
David J. C. MacKay.  Information Theory, Inference, and Learning Algorithms, 2003. Book. Cambridge University Press. ISBN 978-0-521-64298-9. [Website] [PDF]
Nicholas Metropolis; Stanislaw Ulam.  The Monte-Carlo Method, 1949. In Journal of the American Statistical Association, Vol. 44, pp. 335-341. ASA. ISSN 0162-1459. [PDF]
John von Neumann.  Various Techniques used in Connection with Random Digits, 1951. In National Bureau of Standards, Applied Math Series, Vol. 11, pp. 36-38. ISSN 1049-4685.
David K. Gifford.  Natural Random Numbers, 1988-08. No. MIT-LCS-TM-371. MIT. Lecture Notes. [PDF]
Lenore Blum; Manuel Blum; Michael Shub.  A Simple Unpredictable Pseudo-Random Number Generator, 1986-05. In SIAM Journal on Computing, Vol. 15, No. 2, pp. 364-383. SIAM. ISSN 0097-5397.
Makoto Matsumoto; Takuji Nishimura.  Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator, 1998-01. In ACM Transactions on Modeling and Computer Simulation, Vol. 8, No. 1, pp. 3-30. ACM. ISSN 1049-3301. [PDF]
Yadolah Dodge.  A Natural Random Number Generator, 1996-12. In International Statistical Review / Revue Internationale de Statistique, Vol. 64, No. 3, pp. 329-344. International Statistical Institute (ISI). [download]
George Marsaglia; Wai Wan Tsang.  A fast, easily implemented method for sampling from decreasing or symmetric unimodal density functions, 1984. In SIAM Journal on Scientific and Statistical Computing, Vol. 5, No. 2, pp. 349-359. Superseded by Marsaglia & Tsang (2000).
George Marsaglia; Wai Wan Tsang.  The Ziggurat Method for Generating Random Variables, 2000. In Journal of Statistical Software, Vol. 5, No. 8. [download]
Alastair J. Walker.  An Efficient Method for Generating Discrete Random Variables with General Distributions, 1977-09. In ACM Transactions on Mathematical Software, Vol. 3, No. 3, pp. 253-256. ISSN 0098-3500. [download]
J. H. Ahrens; U. Dieter.  An alias method for sampling from the normal distribution, 1989. In Computing, Vol. 42, No. 2, pp. 159-170. Springer Wien. ISSN 0010-485X.
George Marsaglia; Wai Wan Tsang; Jingbo Wang.  Fast Generation of Discrete Random Variables, 2004-07. In Journal of Statistical Software, Vol. 11. [download]
George S. Fishman; L. Stephen Yarberry.  Generating a sample from a k-cell table with changing probabilities in O(log₂k) time, 1993-06. In ACM Transactions on Mathematical Software (TOMS), Vol. 19, pp. 257-261. ACM, New York, NY, USA. ISSN 0098-3500. [URL]
Donald Ervin Knuth.  The Art of Computer Programming, Volume II: Seminumerical Algorithms, 1969. Book, edition 1. Addison Wesley Longman. ISBN 978-0-201-03802-6. Superseded by Knuth (1981).
Donald Ervin Knuth.  The Art of Computer Programming, Volume II: Seminumerical Algorithms, 1981. Book, edition 2. Addison Wesley Longman. ISBN 978-0-201-03822-4. Superseded by Knuth (1997).
Donald Ervin Knuth.  The Art of Computer Programming, Volume II: Seminumerical Algorithms, 1997. Book, edition 3. Addison Wesley Longman. ISBN 978-0-201-89684-8.
Jeffrey Scott Vitter.  Random Sampling with a Reservoir, 1985-03. In ACM Transactions on Mathematical Software (TOMS), Vol. 11, No. 1, pp. 37-57. ACM, New York, NY, USA. [PDF]
Nicholas Metropolis; Arianna W. Rosenbluth; Marshall N. Rosenbluth; Augusta H. Teller; Edward Teller.  Equation of State Calculations by Fast Computing Machines, 1953-06. In The Journal of Chemical Physics, Vol. 21, No. 6, pp. 1087-1092. ISSN 0021-9606. [PDF]
W. K. Hastings.  Monte Carlo sampling methods using Markov chains and their applications, 1970. In Biometrika, Vol. 57, No. 1, pp. 97-109. Biometrika Trust. ISSN 0006-3444. [PDF]
Stuart Geman; Donald Geman.  Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images, 1984-11. In IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-6, No. 6, pp. 721-741. ISSN 0162-8828. [PDF]
W. E. Thomson.  ERNIE -- A Mathematical and Statistical Analysis, 1959. In Journal of the Royal Statistical Society, Series A (General), Vol. 122, No. 3, pp. 301-333. Wiley. ISSN 0035-9238. [download]
Radford M. Neal.  Slice Sampling, 2000-08-29. Technical Report, No. 2005. Departments of Statistics and Computer Science, University of Toronto. [PDF]
James G. Propp; David B. Wilson.  Exact sampling with coupled Markov chains and applications to statistical mechanics, 1996. In Random Structures and Algorithms, Vol. 9, No. 1-2, pp. 223-252. John Wiley & Sons. ISSN 1098-2418. [PS] See also: Propp & Wilson (1997).
James G. Propp; David B. Wilson.  Coupling from the Past: a User's Guide, 1997. In Microsurveys in Discrete Probability, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 41, pp. 181-192, ed. David Aldous; James Propp. AMS. ISBN 978-0-8218-0827-6. [PS]
James G. Propp; David B. Wilson.  How to Get a Perfectly Random Sample from a Generic Markov Chain and Generate a Random Spanning Tree of a Directed Graph, 1998-05. In Journal of Algorithms, Vol. 27, No. 2, pp. 170-217. Academic Press. [download]
John Skilling.  Nested Sampling, 2004-11. In AIP Conference Proceedings, Vol. 735, No. 1, pp. 395-405, ed. Rainer Fischer; Roland Preuss; Udo von Toussaint. AIP.
John Skilling.  Nested Sampling for Bayesian Computations, 2006-06-05. In Bayesian Analysis, Vol. 1, No. 4, pp. 833-860. International Society for Bayesian Analysis. ISSN 1931-6690. [PDF]
David J. C. MacKay; Iain Murray; John Skilling.  Nested Sampling, slide presentation, 2006-11-02. Newton Institute, CMS, University of Cambridge. [PDF] [MP3]
A. Mira; J. Møller; G. O. Roberts.  Perfect Slice Samplers, 2001. In Journal of the Royal Statistical Society, Series B (Statistical Methodology), Vol. 63, No. 3, pp. 593-606. ISSN 1369-7412. [PDF]
John Walker.  HotBits, 1996. Instructions on how to build your own radioactive random number generator.
Wahyu Caesarendra; Gang Niu; Bo-Suk Yang.  Machine condition prognosis based on sequential Monte Carlo method, 2010-03-15. In Expert Systems with Applications, Vol. 37, No. 3, pp. 2412-2420.
Radford M. Neal.  MCMC Using Hamiltonian Dynamics, 2011. In Handbook of Markov Chain Monte Carlo, pp. 113-162, ed. Steve Brooks; Andrew Gelman; Galin Jones; Xiao-Li Meng. Chapman & Hall, Boca Raton, FL.
Robert H. Swendsen; J. S. Wang.  Replica Monte Carlo simulation of spin-glasses, 1986. In Physical Review Letters, Vol. 57, No. 21, pp. 2607-2609. APS.
Robert H. Swendsen; J. S. Wang.  Nonuniversal critical dynamics in Monte Carlo simulations, 1987. In Physical Review Letters, Vol. 58, pp. 86-88.
Iain Murray.  Advanced MCMC methods, 2006. Machine Learning, Advanced Tutorial Lecture Series, Department of Engineering. [PDF]
John Cunningham; David Knowles.  Approximate Inference, 2011-12-08. Slide presentation, Machine Learning RCC. CBL Lab, Engineering Department, University of Cambridge. [PDF]
Cristina Savin.  Extended ensemble Monte Carlo, 2012-03-15. Slide Presentation, Machine Learning RCC. [PDF]

Technical Information