Monte Carlo Simulation — a practical guide

A versatile method for parameters estimation. Exemplary implementation in Python programming language.

Published in

Towards Data Science

7 min readJan 31, 2022

Image by 15299 from Pixabay modified by author

Monte Carlo Simulation (or Method) is a probabilistic numerical technique used to estimate the outcome of a given, uncertain (stochastic) process. This means it’s a method for simulating events that cannot be modelled implicitly. This is usually a case when we have a random variables in our processes.

In this article, I give you a brief background of this technique, I show what steps you have to follow to implement it and, at the end, there will be two examples of a problems solved using Monte Carlo in Python programming language.

A bit of history

Monte Carlo Simulation, as many other numerical methods, was invented before the advent of modern computers — it was developed during World War II — by two mathematicians: Stanisław Ulam and John von Neumann. At that time, they both were involved in the Manhattan project, and they came up with this technique to simulate a chain reaction in highly enriched uranium. Simply speaking they were simulating an atomic explosion.

Solving the “neutron diffusion” model was too complex to describe and to solve explicitly, especially keeping in mind they had only IBM punch-card machines or later a computer called ENIAC. During his stay in hospital Stanisław Ulam was killing boredom by playing cards and then a new idea struck him. After returning to work he shared his novel idea with a colleague from the laboratory John von Neumann. The development of a new solving method got a codename “Monte Carlo”. Method this was based on random sampling and statistics. Thanks to it both mathematicians were able to speed up the calculation process, make incredibly good predictions and deliver useful and highly needed at that time results to the project.

Stanisław Ulam (left) and John von Neumann (right) worked together during the Manhattan project; Images by AlexAntropov86 from Pixabay and from Wikimedia (credits: LANL) modified by author

While working in Los Alamos National Laboratory Stanisław Ulam published in 1949 the first unclassified document describing Monte Carlo Simulation.

2. Applications now

Currently, due to the ease of implementation and available high computing power this technique is widely used across various industries. Let us look on some documented use cases.

Health:

Finance:

“Sensitivity estimation of conditional value at risk using randomized quasi-Monte Carlo”
“Complex system in finance: Monte Carlo evaluation of first passage time density function”

Production:

Transport:

“Three-dimensional (3D) Monte-Carlo modeling for UAS collision risk management in restricted airport airspace”

Engineering/Science:

Design and optimization of graphene quantum dot-based luminescent solar concentrator using Monte-Carlo simulation
“Risk analysis of an underground gas storage facility using a physics-based system performance model and Monte Carlo simulation”
“Nonlinear robust fault diagnosis of power plant gas turbine using Monte Carlo-based adaptive threshold approach”

3. Backbone of the Monte Carlo Method

The core concept behind the Monte Carlo Simulation is a multiple random sampling from a given set of probability distributions. These can be of any type, e.g.: normal, continuous, triangular, Beta, Gamma, you name it.

To use this technique, you have to follow four main steps:

Identify all input components of the process and how do they interact e.g., do they sum up or subtract?
Define parameters of the distributions.
Sample from each of the distributions and integrate the results based on point 1.
Repeat the process as many times as you want.

During the running of this simulation, your resultant parameter (e.g. cost or risk) will converge toward the Normal Distribution even the source distributions can be different. This is the effect of the Central Limit Theorem and that is one of the reasons why this technique became immensely popular in various industries.

4. Implementation in Python — basics

Monte Carlo Simulation can be easily implemented using any programming language. In this case we will use Python. NumPy library will be very handy here as it has multiple most popular probability distributions implemented. For example:

numpy.random.normal — the Normal Distribution
numpy.random.triangular — a triangular distribution
numpy.random.uniform — a uniform distribution
numpy.random.weibull — a Weibull distribution

5. Example 1

Let’s assume we have a process constructed from 3 stages (X1, X2, X3). Each one has an average duration (5, 10 and 15 minutes) which vary following the normal distribution and we know their variance (all 1 minute).

Notation of the normal distribution with parameters

We want to know what is the probability that the process will exceed 34 minutes?

This example is so trivial that it can be solved manually what we do later to validate the Monte Carlo result.

We know all the individual components so let’s define the relationship between them (it’s additive):

Now we can start coding. The single-component can be represented with a short function:

The Monte Carlo simulation code shown below uses this function as a basic block. The number of iterations for this use case is set at 10 000 but you can change it. The last section of a code checks the probability of exiting the limit of 34 minutes (once again it uses the sampling technique).

After running the following code we get the following answer but it will vary every time you run the code:

Probability of exceeding the time limit:  1.035 %

Now we can plot the histogram of the estimated parameter (time). We clearly see it follows the normal distribution.

Let’s verify our results with hand calculations.

As a result the total time follows the normal equation with parameters:

In order to calculate the probability we have to find the z-score first.

P-value we read now from the z-score table. A right tailed probability we calculate as:

As you remember our Monte Carlo simulation gave us the result of 1.05% which is pretty close.

6. Example 2

In this example let’s assume we want to assemble three blocks inside a container of a given width. Nominal dimensions are shown on the picture below. We see that by design there is a nominal gap of 0.5mm.

Three blocks inside a container; Image by author

However, the real dimensions of these three blocks and a container can vary due to technological reasons. For the sake of demonstration let’s assume that none of these variations follow the normal distribution. Three blocks will follow triangular distributions shown below and a container’s dimensions spread will follow an uniform distribution in a range of +/-0.1 mm.

Dimensions distributions for three blocks X1, X2 and X3; Image by author

Now, by simply calculating the extreme values we can see that in the worst scenario blocks have 17mm and a container has a width of only 16.4mm meaning, in this case, we cannot fit them all together.

The question is: what is the probability that we won’t be able to fit all the blocks into a container?

In this case relationships between blocks look like this:

By modifying the previous code we obtain a function to sample the triangular distribution. The same we can do to obtain the uniform distribution sampling function.

A modified core code for the MC simulation:

After running above code we get the answer in order of:

Probability of not fitting the blocks:  5.601 %

After checking the mean and standard deviation we can say even more about the expected gap dimension:

The mean gap width will be 0.33mm with a standard deviation of 0.2mm.

Now let’s plot the histogram of the estimated parameter (gap width) to show that it follows the normal distribution even though none of the input distribution is of this type.

Gap width histogram following the normal distribution; Image by author

You may now wonder how the result will vary with increase number of samples. To check that see the graph below showing the gap width estimation with 95% confidence interval, depending on the sample size (from 100 to 7000 samples):

An estimated gap width with 95% confidence interval; Image by author

From this graph it’s evident that the mean of the estimated value doesn’t change significantly but the spread decreases with the number of samples. This means that with a new run of the simulation bigger samples give you smaller results spread. However, at some point adding more samples doesn’t help anymore.

7. Summary

As you’ve seen Monte Carlo is basically a very simple idea yet very powerful. After reading this article I hope you understand the core concept of the this method, when to use it and how to implement it in Python programming language.

Monte Carlo Simulation — a practical guide

A versatile method for parameters estimation. Exemplary implementation in Python programming language.

Written by Robert Kwiatkowski

More from Robert Kwiatkowski and Towards Data Science

Gradient Descent Algorithm — a deep dive

The Gradient Descent method lays the foundation for machine learning and deep learning techniques.

A New Coefficient of Correlation

What if you were told there exists a new way to measure the relationship between two variables just like correlation except possibly…

The Math Behind Neural Networks

Dive into Neural Networks, the backbone of modern AI, understand its mathematics, implement it from scratch, and explore its applications

Backpropagation — chain rule and PyTorch in action

A simple guide from a theory to the implementation in Pytorch.

Recommended from Medium

An Introduction and Step-by-Step Guide to Monte Carlo Simulations

Since I started using Monte Carlo Simulations for forecasting instead of using estimations with our teams, I’ve gotten several questions…

Mastering Gradient Descent: Math, Python, and the Magic Behind Machine Learning

Implementing a Python class to auto-compute gradient values like PyTorch tensor

Lists

Coding & Development

Predictive Modeling w/ Python

Practical Guides to Machine Learning

ChatGPT

Monte Carlo Simulation: Basics

In the realm of decision-making, uncertainty is a constant companion. Whether you’re a financial analyst predicting stock prices, an…

Mastering Monte Carlo Simulation for Data cience

Monte Carlo Simulation (or Method) is a powerful numerical technique used in data science to estimate the outcome of uncertain processes…

Resampling Distributions in python, Bootstrap, Jackknife, Monte Carlo

A probability distribution shows us the probabilities of different outcomes occurring. This distribution is useful because it describes the…

Intro to Reinforcement Learning: Monte Carlo to Policy Gradient

This post is an intro to reinforcement learning, in particular, Monte Carlo methods, Temporal Difference Learning, Deep Q-learning, Policy…