
What is a Monte Carlo Simulation? (Part 2)

By Zachariah Miller • January 30, 2018

This is part 2 of a series by Metis Sr. Data Scientist Zach Miller dedicated to investigating how Monte Carlo can be a great tool. Part 1 introduces the concept of Monte Carlo, and in part 3 (coming soon), he'll try to outsmart a casino using Monte Carlo techniques.


How do we work with Monte Carlo in Python?

A great tool for doing Monte Carlo simulations in Python is the numpy library. Today we'll focus on using its random number generators, as well as some traditional Python, to set up two sample problems. These problems will lay out the best way for us to think about building our simulations in the future. Since I plan to spend the next blog post talking in detail about how we can use MC to solve much more complicated problems, let's start with two simple ones:

  1. If I know that 70% of the time I eat chicken after I eat beef, what percentage of my overall meals are beef?
  2. If there really was a drunk guy randomly walking around a bar, how often would he make it to the bathroom?

To make this easy to follow along with, I've uploaded some Python notebooks where the entirety of the code is available to view and there are notes throughout to help you see exactly what's going on. So click on over to those for a walk-through of the problem, the code, and a solution. After seeing how we can set up simple problems, we'll move on to trying to defeat video poker, a much more complicated problem, in part 3. After that, in part 4 (also coming soon), we'll investigate how physicists can use MC to figure out how particles will behave by building our own particle simulator.

What is my average dinner?

The Average Dinner Notebook will introduce you to the idea of a transition matrix, how we can use weighted sampling, and the idea of using a large number of samples to be sure we're getting a consistent answer.
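To give a flavor of what that looks like, here is a minimal sketch (not the notebook's actual code) of weighted sampling from a transition matrix with numpy. Only the 70% beef-to-chicken number comes from the question above; the three-item menu, every other probability, and the names MEALS, TRANSITIONS, and simulate_meals are assumptions made up for illustration.

```python
import numpy as np

# Hypothetical menu. Row i of TRANSITIONS holds the probabilities of the next
# meal given that the current meal is MEALS[i]. Only the 0.7 beef -> chicken
# entry comes from the post; every other number is invented for this sketch.
MEALS = ["beef", "chicken", "veggie"]
TRANSITIONS = np.array([
    [0.1, 0.7, 0.2],  # after beef
    [0.5, 0.2, 0.3],  # after chicken
    [0.4, 0.4, 0.2],  # after veggie
])


def simulate_meals(n_meals, seed=0):
    """Simulate a chain of meals and return the fraction of each meal type."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(len(MEALS), dtype=int)
    state = 0  # arbitrary starting meal (beef)
    for _ in range(n_meals):
        # Weighted sampling: pick the next meal using the current meal's row.
        state = rng.choice(len(MEALS), p=TRANSITIONS[state])
        counts[state] += 1
    return counts / n_meals


fractions = simulate_meals(50_000)
for meal, fraction in zip(MEALS, fractions):
    print(f"{meal}: {fraction:.3f}")
```

Running simulate_meals with a small n_meals and then a large one is a quick way to see why we want a lot of samples: the fractions bounce around for short chains and settle down as the chain gets longer.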

Will our drunk friend make it to the bathroom?

The Random Walk Notebook will get into deeper territory of using a detailed set of rules to lay out the conditions for success and failure. It will teach you how to break down a big chain of motions into single calculable actions, and how to keep track of winning and losing in a Monte Carlo simulation so that you can find statistically interesting results.
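As a rough illustration of that idea (again, a sketch and not the notebook's code), we can turn each random number into an angle, take one step at a time, and count how many walks end in success. The room size, starting point, bathroom location, step length, step limit, and wall behavior below are all assumptions invented for this example, as are the names reaches_bathroom and success_rate.

```python
import math

import numpy as np

# Every number here is hypothetical; the post does not specify the bar layout.
ROOM_SIZE = 10.0          # the bar is a ROOM_SIZE x ROOM_SIZE square
START = (5.0, 5.0)        # our friend starts in the middle of the room
BATHROOM = (9.0, 9.0)     # center of the bathroom
BATHROOM_RADIUS = 1.0     # getting this close counts as success
STEP_LENGTH = 0.5
MAX_STEPS = 500           # after this many steps we call the walk a failure


def reaches_bathroom(rng):
    """Simulate one walk; return True if the walker reaches the bathroom."""
    x, y = START
    # Cast uniform random numbers into directions (angles in radians).
    angles = rng.uniform(0.0, 2.0 * math.pi, size=MAX_STEPS)
    for angle in angles:
        # One single, calculable action: a step in the chosen direction.
        x += STEP_LENGTH * math.cos(angle)
        y += STEP_LENGTH * math.sin(angle)
        # Assumed rule: walls stop the walker at the edge of the room.
        x = min(max(x, 0.0), ROOM_SIZE)
        y = min(max(y, 0.0), ROOM_SIZE)
        if math.hypot(x - BATHROOM[0], y - BATHROOM[1]) <= BATHROOM_RADIUS:
            return True
    return False


def success_rate(n_walks, seed=0):
    """Run many independent walks and return the fraction that succeed."""
    rng = np.random.default_rng(seed)
    wins = 0
    for _ in range(n_walks):
        if reaches_bathroom(rng):
            wins += 1
    return wins / n_walks


rate = success_rate(1_000)
print(f"Made it to the bathroom in {rate:.1%} of walks")
```

The important design choice is that success or failure is never sampled directly. We only sample individual steps, and the rules decide the outcome of each walk.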

So what did we learn?

We've gained the ability to use numpy's random number generator to extract statistically significant results! That's a huge first step. We've also learned how to frame Monte Carlo problems such that we can use a transition matrix if the problem calls for it. Notice that in the random walk the random number generator didn't just choose some state that corresponded to win-or-not. It was instead a chain of steps that we simulated to see whether we win or not. On top of that, we also were able to convert our random numbers into whatever form we needed, casting them into angles that informed our chain of motions. That's another big part of why Monte Carlo is such a flexible and powerful technique: you don't have to just pick states, but can instead pick individual motions that lead to different possible outcomes.
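One way to judge whether a simulated win rate can be trusted is to attach a standard error to it. The sketch below uses the usual binomial formula sqrt(p(1 - p) / n) for a fraction p measured over n independent trials. The 30% win probability is a made-up stand-in for a real simulation, and estimate_with_error is a hypothetical helper name, not something from the notebooks.

```python
import numpy as np


def estimate_with_error(outcomes):
    """Return (win fraction, standard error) for a sequence of 0/1 outcomes."""
    outcomes = np.asarray(outcomes, dtype=float)
    n = outcomes.size
    p_hat = outcomes.mean()
    # Binomial standard error of a proportion from n independent trials.
    std_err = np.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, std_err


rng = np.random.default_rng(0)
TRUE_WIN_PROB = 0.3  # hypothetical; stands in for a full simulation

for n in (100, 10_000, 1_000_000):
    # Each trial wins with probability TRUE_WIN_PROB.
    outcomes = rng.random(n) < TRUE_WIN_PROB
    p_hat, std_err = estimate_with_error(outcomes)
    print(f"n={n:>9,}: {p_hat:.4f} +/- {std_err:.4f}")
```

Because the error shrinks like one over the square root of n, getting one more digit of precision costs roughly a hundred times more samples.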

In the next installment, we'll take everything we've learned from these problems and work on applying them to a more complicated problem. In particular, we'll focus on trying to beat the casino in video poker.


