Introduction
I first came across Monte Carlo simulations in my work planning cellular networks using radio planning software. These tools are specialised systems used to aid cellular network design, site selection and capacity modelling. The latter is often implemented using Monte Carlo simulations to model the behaviour of users, their geographic locations within the radio cell and the services they use. Simplistically speaking, when using these tools you generally tick a few boxes, enter some values and hit ‘run’. Whilst I have used a number of radio planning systems, my knowledge of how these statistical models worked below the surface was sketchy. I could tell someone what a Monte Carlo simulation was used for, but I had no idea how it worked, so I thought it might be interesting to try some examples using R.
Definition
Monte Carlo methods cover a broad range of applications and refer to the use of algorithms that employ repeated random sampling to obtain numerical results. They can be used to solve problems that are deterministic in principle (i.e. those in which randomness plays no role), as well as to estimate the outcome of problems whose input variables are uncertain. By the law of large numbers, the average of the results of repeated simulations should converge on the expected value as the number of iterations increases.
I read a number of definitions online and in several textbooks and frankly I wasn’t much the wiser. It wasn’t until I played around with a few examples that the topic made a little more sense. The example provided in the main Wikipedia entry seemed like a fairly interesting application, so I tried my hand at reproducing it using R.
Calculating the Value of Pi
In this example, a Monte Carlo simulation is used to calculate the value of pi, using simple geometry and randomly generated points. Consider a unit square (i.e. a square with sides of length one and an area equal to one) containing a quadrant of a circle with radius $r = 1$ and an area equal to $\frac{\pi r^2}{4} = \frac{\pi}{4}$. The ratio of the quadrant’s area to the square’s area is therefore $\frac{\pi}{4}$.
There are several ways of plotting a circle in R; I chose to do it using some familiar equations and plotly. First I created a sequence for the angles in the quadrant, and then calculated the values of $x$ and $y$.
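In base R this amounts to little more than the following sketch (the object names and the number of points along the arc are arbitrary choices):

```r
# Angles covering the first quadrant, from 0 to pi/2 radians
theta <- seq(0, pi / 2, length.out = 100)

# Parametric form of a circle of radius 1 centred at the origin
circle <- data.frame(x = cos(theta), y = sin(theta))
```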
Once I had created the data frame, the plot was simple enough. The circle is drawn as a scatter trace and the square is added using the shapes argument of layout() with appropriate boundary values.
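A sketch of the plotting call, reusing the circle data frame from above (the trace mode, axis ranges and styling are illustrative choices rather than anything prescriptive):

```r
library(plotly)

plot_ly(circle, x = ~x, y = ~y, type = "scatter", mode = "lines",
        name = "quadrant") %>%
  layout(
    # the unit square, drawn as a rectangle from (0, 0) to (1, 1)
    shapes = list(
      list(type = "rect", x0 = 0, y0 = 0, x1 = 1, y1 = 1,
           line = list(color = "black"))
    ),
    xaxis = list(range = c(0, 1.05)),
    yaxis = list(range = c(0, 1.05))
  )
```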
If points are uniformly plotted within the unit square, then the proportion of the points inside the quadrant compared to the total number of points allows one to approximate the value of $\pi$, by observing that:
$$\frac{\Sigma points_{inside}}{\Sigma points_{total}} \approx \frac{\pi}{4}$$
Distributing the points randomly within the unit square is where the Monte Carlo method comes into play. Points are allocated an $(x,y)$ coordinate with $0 < x \leq 1$ and $0 < y \leq 1$. In R, this can be done using the runif() function. Points inside the quadrant are coloured blue and those outside are coloured red. The approximate value of $\pi$ can then be determined for the number of samples $n$ by multiplying the proportion of points inside the quadrant by four.
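A minimal sketch of the sampling step (the seed is fixed here purely for reproducibility and is not part of the method):

```r
set.seed(1)          # fixed seed, only so the run can be reproduced
n <- 1000            # number of random points

x <- runif(n)        # uniform samples in (0, 1)
y <- runif(n)

# a point falls inside the quadrant if its distance from the origin is at most 1
inside <- x^2 + y^2 <= 1

# blue for points inside the quadrant, red for those outside
colour <- ifelse(inside, "blue", "red")

# quick base-graphics check of the point cloud (the post's figures use plotly)
plot(x, y, col = colour, asp = 1, pch = 16, cex = 0.5)

# the proportion of points inside approximates pi / 4, so scale by 4
pi_estimate <- 4 * sum(inside) / n
pi_estimate
```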
Even with only 1000 points, the approximation of $\pi$ is correct to two significant figures. Increasing the number of samples $n$ generally improves the accuracy of the estimate, although any individual run may still be worse. The real value of the method comes from running the simulation many times and averaging the results.
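Wrapping the estimate in a small helper function makes it easy to repeat; the sketch below uses an illustrative function name, estimate_pi(), together with base R’s replicate():

```r
# one Monte Carlo estimate of pi from n uniformly distributed points
estimate_pi <- function(n) {
  x <- runif(n)
  y <- runif(n)
  4 * sum(x^2 + y^2 <= 1) / n
}

# 100 independent simulations of 1000 points each
estimates <- replicate(100, estimate_pi(1000))

mean(estimates)   # the averaged estimate of pi
```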
Running the simulation 100 times and plotting the result for each simulation shows how the estimates can vary considerably. I tabulated the results for the first 10 simulations to demonstrate this effect.
| Simulation | Approximation of $\pi$ |
|---|---|
| 1 | 3.180 |
| 2 | 3.128 |
| 3 | 3.132 |
| 4 | 3.176 |
| 5 | 3.144 |
| 6 | 3.064 |
| 7 | 3.176 |
| 8 | 3.152 |
| 9 | 3.156 |
| 10 | 3.208 |
Figure 3: Estimations of $\pi$ (n = 1000, 100 Simulations)
In the preceding plot, the mean estimate and the value of $\pi$ overlap, so plotting the rolling mean of the estimations is a bit more helpful. It shows how the approximation converges towards the value of $\pi$ as the simulation count increases.
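The rolling mean here is simply the cumulative mean of the estimates after each successive simulation; a sketch of the calculation and plot, reusing the estimates vector from above, might look like this:

```r
# cumulative ("rolling") mean of the estimates after each successive simulation
rolling_mean <- cumsum(estimates) / seq_along(estimates)

# line plot of the rolling mean with a dashed reference line at pi
plot_ly(x = seq_along(rolling_mean), y = rolling_mean,
        type = "scatter", mode = "lines", name = "rolling mean") %>%
  layout(shapes = list(
    list(type = "line", x0 = 1, x1 = length(rolling_mean),
         y0 = pi, y1 = pi, line = list(dash = "dash"))
  ))
```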
Figure 4: Rolling Mean of Estimations of $\pi$ (n = 1000, 100 Simulations)
Finally, I set $n$ = 10,000 and the number of simulations $s$ = 10,000 to see how this improved the result. The final approximation was 3.141523, which agrees with $\pi$ to four significant figures.
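That final run is just the earlier sketch scaled up, reusing the illustrative estimate_pi() helper; note that 10,000 simulations of 10,000 points is 10⁸ random points in total, so it takes noticeably longer to complete:

```r
# 10,000 simulations of 10,000 points each
estimates_large <- replicate(10000, estimate_pi(10000))
mean(estimates_large)
```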
Figure 5: Rolling Mean of Estimations of $\pi$ (n = 10000, 10000 Simulations)
Conclusion
This post has provided a simple example of how to perform and visualise a Monte Carlo simulation using R. Monte Carlo simulations are used in many domains of engineering and science, and I hope to revisit the topic with some other applications.
Other Reading
Wikipedia: https://en.wikipedia.org/wiki/Monte_Carlo_method
Multiple Examples: https://www.countbayesie.com/blog/2015/3/3/6-amazing-trick-with-monte-carlo-simulations
Manufacturing Example: https://www.programmingr.com/monte-carlo-simulation-in-r/