Definition

Any random process can, in principle, be represented by a computer program using pseudorandom numbers. If near-perfectly independent random numbers between 0 and 1 can be produced, appropriate functions of these numbers can simulate realizations of practically any probability distribution needed in technical and scientific applications. The development of Monte Carlo methods grew out of our ability to find those functions and algorithms.

The literature on Monte Carlo methods is vast, and the methods are many. We cannot cover everything here, but the intention is to provide a brief description of some of the most important techniques and their principles. After having reviewed basic techniques for random simulation, we shall move on to three high-level categories of Monte Carlo methods: testing, inference, and optimization. We shall also present some hybrid methods, as there is often no sharp borderline between the categories.

In the following, unless otherwise stated, we will consider continuous random variables over bounded (measurable) subsets of ℝ^N and their probability densities, although many of the methods described are applicable more generally.

Methods

Basic Simulation Methods

Exact sampling of probability distributions is an important component of all Monte Carlo algorithms. Most important are the following methods (formulated here for a one-dimensional space):

Algorithm 1 The “Inverse Method” for a Continuous Random Variable with a Given, Everywhere Nonzero Probability Density

Let p be a probability density, and P its distribution function:

$$ P(x)={\int}_{-\infty}^x p(s)\, ds, $$
(1)

and let r be a random number chosen uniformly at random between 0 and 1. Then the random number x generated through the formula

$$ x={P}^{-1}(r) $$
(2)

has probability density p.
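As an illustration, the following Python code is a minimal sketch of Algorithm 1 for an exponential density p(x) = λ exp(−λx); the target and the rate λ are choices made for this example, exploiting that P(x) = 1 − exp(−λx) has the closed-form inverse P^{−1}(r) = −ln(1 − r)/λ:

```python
# Minimal sketch of the inverse method (Algorithm 1).
# Assumed example target: exponential density with rate lam.
import numpy as np

rng = np.random.default_rng(0)

def sample_exponential(lam, size):
    r = rng.uniform(0.0, 1.0, size)   # uniform random numbers r
    return -np.log(1.0 - r) / lam     # x = P^{-1}(r)

x = sample_exponential(lam=2.0, size=100_000)
print(x.mean())                       # close to the exact mean 1/lam = 0.5
```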

Algorithm 2 The Box-Muller Method for Gaussian Distributions

Let r1 and r2 be random numbers chosen uniformly at random between 0 and 1. Then the random numbers

$$ {x}_1=\sqrt{-2\ln {r}_2}\cos\;\left(2\pi {r}_1\right) $$
(3)
$$ {x}_2=\sqrt{-2\ln {r}_2}\sin\;\left(2\pi {r}_1\right) $$
(4)

are independent and Gaussian distributed with zero mean and unit variance.
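The method translates directly into code; the following sketch is a plain transcription of Eqs. (3) and (4), with numpy's uniform generator standing in for r1 and r2:

```python
# Minimal sketch of the Box-Muller method (Algorithm 2).
import numpy as np

rng = np.random.default_rng(1)

def box_muller(size):
    r1 = rng.uniform(size=size)
    r2 = 1.0 - rng.uniform(size=size)       # shifted into (0, 1] so log(r2) is finite
    radius = np.sqrt(-2.0 * np.log(r2))
    x1 = radius * np.cos(2.0 * np.pi * r1)  # Eq. (3)
    x2 = radius * np.sin(2.0 * np.pi * r1)  # Eq. (4)
    return x1, x2

x1, x2 = box_muller(100_000)
print(x1.mean(), x1.std(), np.corrcoef(x1, x2)[0, 1])  # approx 0, 1, and 0
```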

As building blocks for some of the more sophisticated algorithms we shall discuss in the coming sections, we need to mention the following two algorithms for sampling in high-dimensional spaces:

Algorithm 3 Sequential Simulation

Consider a probability density p over an N-dimensional space decomposed into a product of N univariate conditional probability densities:

$$ p\left({x}_1,{x}_2,\dots, {x}_N\right)=p\left({x}_1\right)p\left({x}_2|{x}_1\right)p\left({x}_3|{x}_2,{x}_1\right)\dots p\left({x}_N|{x}_{N-1},\dots, {x}_1\right). $$
(5)

Exact simulation of realizations from p(x1, x2, …, xN) can now be performed by first generating a realization \( {\hat{x}}_1 \) from p(x1), then a realization \( {\hat{x}}_2 \) from \( p\left({x}_2|{\hat{x}}_1\right) \), then a realization \( {\hat{x}}_3 \) from \( p\left({x}_3|{\hat{x}}_2,{\hat{x}}_1\right) \), and so on until we have a realization \( {\hat{x}}_N \) from \( p\left({x}_N|{\hat{x}}_{N-1},\dots, {\hat{x}}_1\right) \). The point \( \left({\hat{x}}_1,\dots, {\hat{x}}_N\right) \) is now a realization from \( p\left({x}_1,\dots, {x}_N\right) \).

The above type of algorithm is widely used for simulation of image realizations where the conditional probabilities are obtained from training images (see Strebelle 2002).
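As a concrete sketch, consider a case where each conditional in Eq. (5) depends only on the preceding variable (an assumption made for this example): p(x1) = N(0, 1) and p(xk|xk−1) = N(a·xk−1, 1). Sequential simulation then reads:

```python
# Minimal sketch of sequential simulation (Algorithm 3) for an assumed
# Gaussian chain where each conditional depends only on the previous value.
import numpy as np

rng = np.random.default_rng(2)

def sequential_gaussian_chain(N, a=0.8):
    x = np.empty(N)
    x[0] = rng.normal(0.0, 1.0)               # realization from p(x_1)
    for k in range(1, N):
        x[k] = rng.normal(a * x[k - 1], 1.0)  # realization from p(x_k | x_{k-1})
    return x

realization = sequential_gaussian_chain(N=100)  # one exact realization from p
```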

The simplest way to sample a constant probability density is a direct application of Algorithm (3) in which all 1D conditional probability densities are constant functions. However, in the Markov Chain Monte Carlo algorithms we shall describe below, we need to perform uniform sampling via random walks whose design can be adapted to our sampling problem. This can be done in the following way:

Algorithm 4 Sampling a Constant Probability Density by a Random Walk

Given a uniform (constant) probability density, a sequence of random numbers Un (n = 1, 2, …) in [0,1], and a proposal distribution q(x(n+1)|x(n)) defining the probability that the random walk moves to x(n+1) given that it is in x(n) (possibly with q(x(n+1)|x(n)) ≠ q(x(n)|x(n+1))), the points visited by the iterative random function V:

$$ V\left({\mathbf{x}}^{(n)}\right)=\left\{\begin{array}{ll}{\mathbf{x}}^{\left(n+1\right)} & \mathrm{if}\ {U}_n\le \min \left[1,\frac{q\left({\mathbf{x}}^{(n)}|{\mathbf{x}}^{\left(n+1\right)}\right)}{q\left({\mathbf{x}}^{\left(n+1\right)}|{\mathbf{x}}^{(n)}\right)}\right]\\ {\mathbf{x}}^{(n)} & \mathrm{otherwise}\end{array}\right., $$
(6)

asymptotically form a sample from the uniform distribution as n goes to infinity.
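A minimal sketch of Algorithm 4 on the interval [0,1] follows; the truncated-Gaussian proposal is an assumption made for the example, chosen because its x-dependent normalizer makes q(x(n+1)|x(n)) ≠ q(x(n)|x(n+1)), so the correction in Eq. (6) is actually exercised:

```python
# Minimal sketch of Algorithm 4: a random walk sampling the uniform
# density on [0, 1] with an asymmetric (truncated-Gaussian) proposal.
import math
import numpy as np

rng = np.random.default_rng(3)

def Phi(z):                       # standard normal distribution function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def Z(x, s):                      # normalizer of the proposal truncated to [0, 1]
    return Phi((1.0 - x) / s) - Phi(-x / s)

def uniform_random_walk(n_steps, s=0.2, x0=0.5):
    x = x0
    chain = np.empty(n_steps)
    for n in range(n_steps):
        while True:               # draw a truncated-Gaussian candidate
            x_prop = x + s * rng.normal()
            if 0.0 <= x_prop <= 1.0:
                break
        # q(x|x') / q(x'|x) = Z(x) / Z(x'), since the Gaussian kernel is symmetric
        if rng.uniform() <= min(1.0, Z(x, s) / Z(x_prop, s)):
            x = x_prop
        chain[n] = x
    return chain

chain = uniform_random_walk(50_000)   # its histogram is flat on [0, 1]
```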

The above algorithms (1)–(4) are easily demonstrated and straightforward to use in practice. Algorithms (1) and (2) can without difficulty be generalized to higher dimensions.

Monte Carlo Testing

If we are interested in testing the validity of a parametric statistical model of, say, an inhomogeneous earth structure, we can select an appropriate function (a “statistic”) summarizing important properties of the structure, for instance, a variogram. We can now generate Monte Carlo realizations of the structure and compute the value of the statistic for each realization. By comparing the distribution of these simulated values with the observed (or theoretical) value of the statistic, we can decide if our choice of model is acceptable (Mrkvicka et al. 2016). The decision relies on a statistical test measuring to what extent the observed statistic falls in a high-probability region of the histogram of simulated values.
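The following Python code is a minimal sketch of this idea in a deliberately simple setting; the hypothesized model N(0, 1), the sample mean as statistic, and the synthetic data are all assumptions made for the example:

```python
# Minimal sketch of a Monte Carlo test.
import numpy as np

rng = np.random.default_rng(4)

observed = rng.normal(0.3, 1.0, size=100)  # data whose hypothesized model we test
stat_obs = observed.mean()                 # the chosen statistic

# Monte Carlo realizations of the statistic under the hypothesized model N(0, 1)
stats = rng.normal(0.0, 1.0, size=(10_000, 100)).mean(axis=1)

# two-sided Monte Carlo p-value: how extreme is the observed statistic?
p_value = np.mean(np.abs(stats) >= np.abs(stat_obs))
print(p_value)                             # a small value speaks against the model
```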

The above testing approach can be seen as a tool to determine acceptable models satisfying given observations. This is related to, but in principle more general than, the Monte Carlo inference methods described in the following, where the statistical model is fixed beforehand.

Monte Carlo Inference

Inference problems are those in which we seek parameters of a given, underlying stochastic model designed to explain observable data d. In physical sciences, technology, and economics, inference problems are often inverse problems which, in a probabilistic formulation, result in the definition of a posterior probability distribution p(x|d) describing our state of information about model parameters x after inference. Integrals of p over the parameter space can then provide estimates of event probabilities, expectations, covariances, etc. Such integrals are of the form

$$ I={\int}_{\chi }h\left(\mathbf{x}\right)p\left(\mathbf{x}\right)d\mathbf{x}, $$
(7)

where p(x) is the posterior probability density. They can be evaluated numerically by generating a large number of random, independent realizations x(1), …, x(N) from p(x). The sum

$$ I\approx \frac{1}{N}\sum \limits_{n=1}^Nh\left({\mathbf{x}}^{(n)}\right) $$
(8)

is then an estimate of the integral. The fact that integrals like (7) over high-dimensional spaces are often calculated more efficiently by Monte Carlo algorithms than by deterministic quadrature has made Monte Carlo integration important in numerical analysis.
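A minimal sketch of the estimator (8) follows, assuming for the example that p(x) is a standard two-dimensional Gaussian we can sample directly and that h(x) = x1² + x2², whose exact integral under p is 2:

```python
# Minimal sketch of Monte Carlo integration, Eqs. (7)-(8).
import numpy as np

rng = np.random.default_rng(5)

N = 100_000
x = rng.normal(size=(N, 2))   # realizations x^(1), ..., x^(N) from p
h = (x ** 2).sum(axis=1)      # h evaluated at each realization
I_hat = h.mean()              # the estimator (8)
print(I_hat)                  # close to the exact value 2.0
```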

For many practical inference problems, the model space is so vast, and evaluation of the probability density p(x) is so computer intensive, that full information about p(x) remains unavailable. In this case, we need algorithms that can make do with values of p(x) at one point at a time. The simplest of these algorithms is the rejection algorithm (von Neumann 1951):

The Rejection Algorithm

Algorithm 5 The Rejection Algorithm

Assume that p(x) is a probability density with an upper bound M ≥ max(p(x)). In the nth step of the algorithm, choose with uniform probability a random candidate point xc. Accept xc with probability

$$ {p}_{accept}=\frac{p\left({\mathbf{x}}_c\right)}{M}. $$
(9)

The set of accepted candidate points is then a sample from the probability distribution p(x).
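A minimal sketch of Algorithm 5 on a bounded domain follows; the unnormalized bell-shaped target on [0, 1] and the bound M = 1 are assumptions made for the example:

```python
# Minimal sketch of the rejection algorithm (Algorithm 5).
import numpy as np

rng = np.random.default_rng(6)

def p(x):
    return np.exp(-0.5 * ((x - 0.5) / 0.1) ** 2)  # unnormalized target, max 1 at x = 0.5

M = 1.0                                  # upper bound on p
x_c = rng.uniform(0.0, 1.0, 100_000)     # uniform candidate points
u = rng.uniform(0.0, 1.0, 100_000)
sample = x_c[u <= p(x_c) / M]            # keep each candidate with probability p(x_c)/M
print(sample.size / 100_000)             # acceptance rate (low in high dimensions)
```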

The rejection algorithm is ideal in the sense that it generates independent realizations from bounded probability densities, but for distributions in high-dimensional spaces with vast areas of negligible probability, the number of accepted points is small. This makes the algorithm very inefficient, and for this reason random-walk-based algorithms were developed. The original version of this was the Metropolis Algorithm (Metropolis et al. 1953), but it was later improved by Hastings (1970):

Markov-Chain Monte Carlo (MCMC)

Algorithm 6 (Metropolis-Hastings)

Given a (possibly unnormalized) probability density p(x) > 0, a sequence of random numbers Un (n = 1, 2, …) in [0,1], and an iterative random function V(x) sampling a constant probability density using Algorithm (4):

$$ {\mathbf{x}}^{\left(n+1\right)}=V\left({\mathbf{x}}^{(n)}\right). $$
(10)

Then the distribution of samples produced by the iterative random function W:

$$ {\mathbf{x}}^{\left(n+1\right)}=W\left({\mathbf{x}}^{(n)}\right)=\left\{\begin{array}{ll}V\left({\mathbf{x}}^{(n)}\right) & \mathrm{if}\ {U}_n\le \min \left[1,\frac{p\left(V\left({\mathbf{x}}^{(n)}\right)\right)}{p\left({\mathbf{x}}^{(n)}\right)}\right]\\ {\mathbf{x}}^{(n)} & \mathrm{otherwise}\end{array}\right., $$
(11)

will asymptotically converge to p(x) as n goes to infinity.
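A minimal sketch of Algorithm 6 follows, with a symmetric Gaussian proposal (so the q-ratio of Algorithm 4 equals 1) and an assumed unnormalized bimodal target p(x):

```python
# Minimal sketch of the Metropolis-Hastings algorithm (Algorithm 6).
import numpy as np

rng = np.random.default_rng(7)

def p(x):                                  # unnormalized bimodal target
    return np.exp(-0.5 * (x - 2.0) ** 2) + np.exp(-0.5 * (x + 2.0) ** 2)

def metropolis_hastings(n_steps, step=1.0, x0=0.0):
    x = x0
    chain = np.empty(n_steps)
    for n in range(n_steps):
        x_prop = x + step * rng.normal()   # V(x^(n)) with a symmetric proposal
        if rng.uniform() <= min(1.0, p(x_prop) / p(x)):
            x = x_prop                     # accept the move
        chain[n] = x                       # otherwise stay at x^(n)
    return chain

chain = metropolis_hastings(50_000)        # approximates samples from p after burn-in
```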

An extended form of this algorithm is obtained if V(x) samples an arbitrary probability density r(x), in which case the algorithm will sample the product distribution p(x)r(x). This was suggested by Mosegaard and Tarantola (1995) as a way of sampling the posterior distribution in Bayesian/probabilistic inverse problems, where r(x) is the prior probability density and p(x) is the likelihood. In this way, no closed-form expression for the prior is needed; V may be available only as a computer algorithm.

It is important to realize that the design of the proposal algorithm (4) is critical for the performance of the Metropolis-Hastings algorithm. The proposal distribution q(x(n+1)|x(n)) will work efficiently only if it locally resembles the sampling distribution p(x) (up to a constant), since this minimizes the number of rejected moves.

In practice, any sampling with the Metropolis-Hastings algorithm needs to run for some time to reach equilibrium sampling with statistically stationary output. The time (number of iterations) needed to reach this equilibrium is called the burn-in time, and it is usually found by experimentation. Another problem is that the algorithm is based on a random walk, and hence sample points will not be independent when sampled closely in time. To determine the time spacing between approximately independent samples, a correlation analysis may be needed. Ways of avoiding poor sampling results were discussed by Hastings (1970).

Geman and Geman (1984) introduced the Gibbs Sampler, another variant of the Metropolis algorithm where sample points are picked along coordinate axes in the parameter space according to the conditional distribution p(xj|x1, …, xj−1, xj+1, …, xN), xj being the parameter to be perturbed. Although this strategy gives acceptance probabilities equal to 1, it may not be practical when calculation of the conditional probability is computationally expensive.
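For a case where the conditionals are cheap, the Gibbs sampler is easily written down. The following sketch assumes a bivariate Gaussian target with correlation ρ, whose full conditionals are the known one-dimensional Gaussians N(ρ·xother, 1 − ρ²):

```python
# Minimal sketch of the Gibbs sampler for an assumed bivariate Gaussian target.
import numpy as np

rng = np.random.default_rng(8)

def gibbs_bivariate_gaussian(n_steps, rho=0.9):
    x1, x2 = 0.0, 0.0
    chain = np.empty((n_steps, 2))
    sd = np.sqrt(1.0 - rho ** 2)        # conditional standard deviation
    for n in range(n_steps):
        x1 = rng.normal(rho * x2, sd)   # draw from p(x1 | x2); always accepted
        x2 = rng.normal(rho * x1, sd)   # draw from p(x2 | x1)
        chain[n] = (x1, x2)
    return chain

chain = gibbs_bivariate_gaussian(20_000)
```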

Monte Carlo Inference for Sparse Models

The Metropolis-Hastings algorithm (6) can be formulated to allow variable dimension of the parameter vector x (Green 1995). This version of the algorithm is used in Bayesian inference when the number of parameters N is treated as an unknown (Sambridge et al. 2006), making it possible to work with sparse models, and thereby reducing the computational burden:

Algorithm 7 Reversible-Jump Monte Carlo

Let p(x, N) be a probability density over a subset of ℝ^N, augmented with its dimension parameter N. The Metropolis-Hastings algorithm (6) operating in the extended space with

$$ V\left({\mathbf{x}}^{(n)},{N}^{(n)}\right)=\left\{\begin{array}{ll}\left({\mathbf{x}}^{\left(n+1\right)},{N}^{\left(n+1\right)}\right) & \mathrm{if}\ {U}_n\le \min \left[1,\frac{q\left({\mathbf{x}}^{(n)},{N}^{(n)}|{\mathbf{x}}^{\left(n+1\right)},{N}^{\left(n+1\right)}\right)}{q\left({\mathbf{x}}^{\left(n+1\right)},{N}^{\left(n+1\right)}|{\mathbf{x}}^{(n)},{N}^{(n)}\right)}\left|J\right|\right]\\ \left({\mathbf{x}}^{(n)},{N}^{(n)}\right) & \mathrm{otherwise}\end{array}\right., $$
(12)

where the proposal distributions q(x(q)|x(p)) and q(x(p)|x(q)) are expressed through the invertible, differentiable mappings:

$$ {\displaystyle \begin{array}{l}{\mathbf{x}}^{(q)}=h\left({\mathbf{x}}^{(p)},{\mathbf{u}}^{(p)}\right)\\ {}{\mathbf{x}}^{(p)}={h}^{\prime}\left({\mathbf{x}}^{(q)},{\mathbf{u}}^{(q)}\right),\end{array}} $$
(13)

u(p) and u(q) being random vectors with generally nonuniform distributions, and where ∣J∣ is the Jacobian

$$ \left|J\right|=\left|\frac{\partial \left({\mathbf{x}}^{(p)},{\mathbf{u}}^{(p)}\right)}{\partial \left({\mathbf{x}}^{(q)},{\mathbf{u}}^{(q)}\right)}\right|, $$
(14)

will sample the extended space according to p(x, N).

This algorithm samples not only the space of models x (with varying dimension), but also the distribution of the dimension N itself.

Sequential Monte Carlo Methods

Estimating the evolution of a dynamic system is important in science and technology, for instance, in predicting the movement of an airplane in motion, or in numerical weather prediction. An important analytical tool in this field is the Kalman filter (Kalman 1960), which iteratively provides deterministic, Bayesian, and linear updates to past predictions of system parameters. For large systems, however, the many parameters make the analytical handling of large covariance matrices computationally intractable. This led to the development of the ensemble Kalman filter (EnKF) (Evensen 1994, 2003), a Monte Carlo algorithm that avoids the covariance matrices and instead represents the Gaussian distributions of the noise, the prior, and the posterior by ensembles of realizations. A basic formulation is the following:

Algorithm 8 Ensemble Kalman Filter (EnKF)

Assume that data d at any time are related to the system parameters x through the linear equation d = Hx. Let X be an n × N matrix whose N columns are realizations from the initial (prior) Gaussian distribution of system parameters, and let D be an m × N matrix whose columns are d + em, where d is the noise-free data and each em is a realization of the Gaussian noise with covariance matrix R. It can then be shown that, with C denoting the empirical covariance matrix of the ensemble X and K = CH^T(HCH^T + R)^{−1}, the columns of the matrix

$$ \hat{\mathbf{X}}=\mathbf{X}+\mathbf{K}\left(\mathbf{D}-\mathbf{HX}\right) $$
(15)

are realizations from the posterior probability density. Iterative application of these rules, where the posterior realizations in one time step are used as prior realizations in the next step, will approximately evolve the system in time according to the model equations and the uncertainties.

For more details about the implementation, see Evensen (2003).
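A minimal sketch of the analysis step (15) follows; the two-parameter state, the observation operator H, and the noise covariance R are assumptions made for the example, and C is taken as the empirical covariance of the prior ensemble:

```python
# Minimal sketch of one EnKF analysis step, Eq. (15).
import numpy as np

rng = np.random.default_rng(9)

def enkf_update(X, H, d, R):
    N = X.shape[1]                                 # ensemble size
    C = np.atleast_2d(np.cov(X))                   # empirical covariance of the ensemble
    K = C @ H.T @ np.linalg.inv(H @ C @ H.T + R)   # Kalman gain
    E = rng.multivariate_normal(np.zeros(len(d)), R, size=N).T
    D = d[:, None] + E                             # N perturbed copies of the data
    return X + K @ (D - H @ X)                     # posterior ensemble

# usage: two parameters, one noisy observation of their sum
X = rng.normal(size=(2, 50))                       # prior ensemble
H = np.array([[1.0, 1.0]])                         # linear observation operator
d = np.array([1.2])                                # observed data
R = np.array([[0.1]])                              # noise covariance
X_post = enkf_update(X, H, d, R)
```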

A more recent, nonlinear, and non-Gaussian alternative to EnKF is the Particle Filter method (Reich and Cotter 2015; Nakamura and Potthast 2015; van Leeuwen et al. 2015). The standard particle filter is a bootstrapping technique (see below) operating in the following way:

Algorithm 9 Particle Filter

Assume that data d at any time are related to the system parameters x through the equation d = h(x). After n time steps, let \( {\mathbf{x}}_1^{(n)},{\mathbf{x}}_2^{(n)},\dots, {\mathbf{x}}_N^{(n)} \) be N realizations (“particles”) of the system parameters, representing the prior probability density

$$ p\left({\mathbf{x}}^{(n)}\right)=\sum \limits_{i=1}^N\frac{1}{N}\delta \left({\mathbf{x}}^{(n)}-{\mathbf{x}}_i^{(n)}\right). $$
(16)

Assume that the system is propagated forward in time by the generally nonlinear model f :

$$ {\mathbf{x}}^{(n)}=f\left({\mathbf{x}}^{\left(n-1\right)}\right)+{\mathbf{e}}^{(n)} $$
(17)

where e(n) is modelization noise introduced in the nth time step. If the likelihood function for time step n is

$$ p\left({\mathbf{d}}^{(n)}\mid {\mathbf{x}}^{(n)}\right)={p}_{\varepsilon}\left({\mathbf{d}}^{(n)}-h\left({\mathbf{x}}^{(n)}\right)\right), $$
(18)

we can use Bayes’ rule to obtain

$$ p\left({\mathbf{x}}^{(n)}|{\mathbf{d}}^{(n)}\right)=p\left({\mathbf{d}}^{(n)}|{\mathbf{x}}^{(n)}\right)\frac{p\left({\mathbf{x}}^{(n)}\right)}{p\left({\mathbf{d}}^{(n)}\right)}\approx \sum \limits_{i=1}^N{w}_i^{(n)}\delta \left({\mathbf{x}}^{(n)}-{\mathbf{x}}_i^{(n)}\right) $$
(19)

where

$$ {w}_i^{(n)}=\frac{p\left({\mathbf{d}}^{(n)}|{\mathbf{x}}_i^{(n)}\right)}{\sum_kp\left({\mathbf{d}}^{(n)}|{\mathbf{x}}_k^{(n)}\right)}. $$
(20)

Iterative application of these rules, where the posterior realizations in one time step are used as prior realizations in the next step, will approximately evolve the system in time according to the model equations and the uncertainties.

Resampling (sampling with replacement from the ensemble) is often used before each time step to obtain more equally weighted, but possibly duplicated particles.
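A minimal sketch of one complete particle-filter time step follows; the scalar system, the forward model f(x) = 0.9x, the observation operator h(x) = x, and the noise levels are all assumptions made for the example:

```python
# Minimal sketch of the bootstrap particle filter (Algorithm 9).
import numpy as np

rng = np.random.default_rng(10)

def particle_filter_step(particles, d, sigma_model=0.5, sigma_data=0.3):
    # forward propagation, Eq. (17), with f(x) = 0.9 x as an example
    particles = 0.9 * particles + sigma_model * rng.normal(size=particles.size)
    # likelihood weights, Eq. (20), with h(x) = x and Gaussian data noise
    w = np.exp(-0.5 * ((d - particles) / sigma_data) ** 2)
    w /= w.sum()
    # resampling with replacement returns equally weighted particles
    idx = rng.choice(particles.size, size=particles.size, p=w)
    return particles[idx]

particles = rng.normal(size=1000)     # initial prior ensemble
for d in [0.4, 0.1, -0.2]:            # stream of observations
    particles = particle_filter_step(particles, d)
```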

Empirical Monte Carlo

Assume that N realizations from an unknown probability distribution p(x) are available, and that we, for some reason, are unable to obtain more samples. We are interested in inferring information about a parameter of p(x), for example, the distribution of its mean. This can be accomplished by an empirical Monte Carlo technique, for example, the resampling method Bootstrapping (Efron 1993):

Algorithm 10 Bootstrapping

To infer information about a parameter η of the distribution p(x) from a set of realizations A = {x1, …, xN}, draw N elements from A with replacement (thereby allowing elements to be repeated) and compute η for these N elements. This process is repeated many times, each time generally obtaining a new value of η (if N is large). The normalized histogram of η over all these resampling experiments is an approximation to the distribution of η.

Bootstrapping can in this way be used to evaluate biases, variances, confidence intervals, prediction errors, etc.
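A minimal sketch follows, taking the mean as the parameter η and an arbitrary stand-in for the given sample A:

```python
# Minimal sketch of bootstrapping (Algorithm 10).
import numpy as np

rng = np.random.default_rng(11)

data = rng.exponential(scale=2.0, size=200)   # stand-in for the given sample A

def bootstrap(data, statistic, n_resamples=10_000):
    etas = np.empty(n_resamples)
    for i in range(n_resamples):
        resample = rng.choice(data, size=data.size, replace=True)
        etas[i] = statistic(resample)         # eta computed on the resampled set
    return etas

etas = bootstrap(data, np.mean)
print(np.percentile(etas, [2.5, 97.5]))       # approximate 95% confidence interval
```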

Monte Carlo Optimization

An important application of Monte Carlo algorithms is in the solution of complex optimization problems, for instance, locating the global minimum of an objective function E(x). One of the first examples of a Monte Carlo optimizer was the following modification of the Metropolis-Hastings algorithm (6) (Kirkpatrick et al. 1983):

Simulated Annealing

Algorithm 11 (Simulated Annealing)

Given an objective function E(x), a sequence of random numbers Un (n = 1, 2, …) in [0,1], a decreasing sequence of positive numbers (“temperature” parameters) Tn → 0 for n → ∞, and an iterative random function V(x) sampling a constant probability density:

$$ {\mathbf{x}}^{\left(n+1\right)}=V\left({\mathbf{x}}^{(n)}\right). $$
(21)

Then the sample points produced by the iterative random function A:

$$ {\mathbf{x}}^{\left(n+1\right)}=A\left({\mathbf{x}}^{(n)}\right)=\left\{\begin{array}{ll}V\left({\mathbf{x}}^{(n)}\right) & \mathrm{if}\ {U}_n\le \min \left[1,\frac{\exp \left(-E\left(V\left({\mathbf{x}}^{(n)}\right)\right)/{T}_n\right)}{\exp \left(-E\left({\mathbf{x}}^{(n)}\right)/{T}_n\right)}\right]\\ {\mathbf{x}}^{(n)} & \mathrm{otherwise}\end{array}\right., $$
(22)

asymptotically converge to the global minimum of E(x) as n goes to infinity, provided the temperature Tn decreases sufficiently slowly.
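A minimal sketch of Algorithm 11 follows, with a symmetric Gaussian proposal and a geometric cooling schedule Tn = T0·c^n (one common choice among many); the double-well objective is an assumption made for the example:

```python
# Minimal sketch of simulated annealing (Algorithm 11).
import numpy as np

rng = np.random.default_rng(12)

def E(x):
    return (x ** 2 - 1.0) ** 2 + 0.3 * x       # double well, global minimum near x = -1

def simulated_annealing(n_steps, T0=1.0, cooling=0.999, step=0.5, x0=2.0):
    x, T = x0, T0
    for n in range(n_steps):
        x_prop = x + step * rng.normal()        # V(x^(n)) with a symmetric proposal
        if rng.uniform() <= min(1.0, np.exp(-(E(x_prop) - E(x)) / T)):
            x = x_prop
        T *= cooling                            # T_n -> 0 as n grows
    return x

print(simulated_annealing(20_000))              # close to the global minimum
```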

This algorithm was inspired by the process of chemical annealing, where a crystalline material is slowly cooled (T → 0) from a high temperature through its melting point, resulting in the formation of highly ordered crystals with low lattice energy E. In each step of the algorithm, thermal fluctuations in the system are simulated by the Monte Carlo algorithm (for applications, see Mosegaard and Vestergaard (1991) and Deutsch and Journel (1994)).

It is seen that, for constant temperature parameter T, the above algorithm is actually a Metropolis-Hastings algorithm designed to sample the Gibbs-Boltzmann distribution (see Mosegaard and Vestergaard 1991)

$$ {P}_B\left(\mathbf{x}\right)=\frac{\exp \left(-\frac{E\left(\mathbf{x}\right)}{T}\right)}{Z(T)}, $$
(23)

where Z(T) is a normalizing constant (the partition function).

Monte Carlo Optimization-Sampling Hybrids

The use of MCMC algorithms for inference often runs into problems in high-dimensional parameter spaces. When the target probability/objective function has multiple optima, MCMC algorithms suffer from a critical slowing-down, often making the analysis computationally intractable. A few methods exist to alleviate this problem. Common to them is the use of a “temperature” parameter that artificially increases (or varies) the noise, broadening the sampling density to hinder entrapment in local optima during sampling.

Parallel Tempering

One method is a simple modification of simulated annealing, where E(x) = −ln(p(x)) and the decrease of T is stopped at T = 1 (see Salamon et al. 2002). An inherent problem with simulated annealing is, however, the risk of entrapment in local minima of E(x). The parallel tempering algorithm (Geyer 1991; Sambridge 2013), an improvement of the older simulated tempering algorithm (Marinari and Parisi 1992), seeks to avoid the entrapment problem by abandoning the forced decrease of the “temperature” parameter T. The idea is to operate on an ensemble of models and to include T among the parameters to be sampled:

Algorithm 12 (Parallel Tempering)

Given an ensemble x = (x1, x2, …, xK) of K models and their temperatures T1, T2, …, TK, all distributed according to the same probability density p over the augmented spaces (xk, Tk), k = 1, 2, …, K. The Metropolis-Hastings algorithm (6), with the special rule for V that any perturbation of T consists of swapping the values of T for two random ensemble members, asymptotically samples the joint distribution ∏k p(xk, Tk), and the sample of ensemble members with Tk = 1 approaches a sample from the original target distribution p(x).

If small positive temperatures T* ≈ 0 are allowed in this algorithm, the ensemble members with Tk = T* will be located close to the maxima of p(x). In this way, parallel tempering can be used for optimization in a way similar to simulated annealing.
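A minimal sketch follows, in which K chains run Metropolis steps at fixed temperatures and neighboring chains occasionally swap states; the bimodal target, the temperature ladder, and the swap schedule are assumptions made for the example:

```python
# Minimal sketch of parallel tempering (Algorithm 12).
import numpy as np

rng = np.random.default_rng(13)

def log_p(x):                                    # log of a bimodal target density
    return np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

def parallel_tempering(n_steps, temps=(1.0, 2.0, 4.0, 8.0), step=1.0):
    K = len(temps)
    x = np.zeros(K)
    samples = np.empty(n_steps)
    for n in range(n_steps):
        for k in range(K):                       # within-chain Metropolis moves
            x_prop = x[k] + step * rng.normal()
            if np.log(rng.uniform()) <= (log_p(x_prop) - log_p(x[k])) / temps[k]:
                x[k] = x_prop
        k = rng.integers(K - 1)                  # attempt one neighbor swap
        log_a = (log_p(x[k + 1]) - log_p(x[k])) * (1.0 / temps[k] - 1.0 / temps[k + 1])
        if np.log(rng.uniform()) <= log_a:
            x[k], x[k + 1] = x[k + 1], x[k]
        samples[n] = x[0]                        # the T = 1 chain samples p
    return samples

chain = parallel_tempering(20_000)               # visits both modes of p
```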

Other Approaches

Biological systems evolve by using a large population to explore many options in parallel, and this is the source of inspiration for evolutionary algorithms. An example is the genetic algorithm (Holland 1975), where an ensemble x1, …, xK of K sample points is assigned a probability p(xk) of survival (“fitness”). In a simple implementation, the population is initially generated randomly, but in each iteration it is altered by the action of three operators: selection, crossover, and mutation. Selection randomly resamples the ensemble according to p(xk), thereby typically duplicating ensemble members with high fitness at the expense of members with low fitness, improving the average fitness of the ensemble. The crossover step exchanges parameter values between two random members, and the mutation step randomly changes a parameter of one member. Crossover and mutation attempts are accepted only with predefined probabilities. Iterative application of these three steps (sketched below) allows the algorithm to asymptotically converge to a population with high fitness values p(xk).
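A minimal sketch of the three operators follows; the fitness function, the population size, and the crossover/mutation rates are assumptions made for the example:

```python
# Minimal sketch of a genetic algorithm with selection, crossover, and mutation.
import numpy as np

rng = np.random.default_rng(14)

def fitness(pop):                                 # higher is better; example choice
    return np.exp(-((pop - 1.5) ** 2).sum(axis=1))

def genetic_algorithm(n_gen=200, K=50, n_params=4, p_cross=0.7, p_mut=0.1):
    pop = rng.normal(size=(K, n_params))          # random initial population
    for g in range(n_gen):
        f = fitness(pop)
        pop = pop[rng.choice(K, size=K, p=f / f.sum())]    # selection (resampling)
        for i in range(0, K - 1, 2):                       # crossover of pairs
            if rng.uniform() < p_cross:
                j = rng.integers(1, n_params)              # crossover point
                pop[[i, i + 1], j:] = pop[[i + 1, i], j:]  # exchange parameter tails
        mask = rng.uniform(size=pop.shape) < p_mut         # mutation
        pop[mask] += 0.1 * rng.normal(size=mask.sum())
    return pop

population = genetic_algorithm()    # concentrates near the fitness maximum at 1.5
```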

Another widely used technique is the neighborhood algorithm (Sambridge 1999a, b) where sample points are iteratively generated from a neighborhood approximation to the target distribution p(x). The approximation is a piecewise constant function over Voronoi cells, centered at each of the previous sample points. The approximation to the target distribution is updated in each iteration, concentrating the sampling in multiple high-probability regions. The method will generate a distribution of points biased toward the maxima of p(x).

Neither of the two algorithms above produces samples of the probability distribution p, but if supplemented with appropriate resampling of the output, an approximate sample from p may be produced.

Summary

We have given an overview of the use of Monte Carlo techniques for inference and optimization. A main theme behind the development of many methods has been to reduce the computational workload required to solve large-scale problems. Several ways of dealing with this challenge were discussed, including Markov-Chain Monte Carlo strategies, the use of sparse models, improving numerical forecast schemes through nonparametric representations of probability distributions, and even improving ideas inherited from early work on simulated annealing. Research in Monte Carlo methods is still a prolific area, and there is no doubt that many interesting developments are waiting ahead.

Cross-References