
*This is part 2 of a series by Metis Sr. Data Scientist Zach Miller dedicated to investigating how Monte Carlo can be a great tool. Part 1 introduces the concept of Monte Carlo, and in part 3 (coming soon), he'll try to outsmart a casino using Monte Carlo techniques.*

_____

A great tool for doing Monte Carlo simulations in Python is the numpy library. Today we'll focus on using its random number generators, along with some plain Python, to set up two sample problems. These problems will lay out the best way for us to think about building our simulations in the future. Since I plan to spend the next post talking in detail about how we can use MC to solve much more complicated problems, let's start with two simple ones:

- If I know that 70% of the time I eat chicken after I eat beef, what percentage of my overall meals are beef?
- If there really was a drunk guy randomly walking around a bar, how often would he make it to the bathroom?

To make this easy to follow, I've uploaded some Python notebooks where the full code is available to view, with notes throughout to help you see exactly what's going on. So click on over to those for a walk-through of the problem, the code, and a solution. After seeing how we can set up simple problems, we'll move on to trying to defeat video poker, a much more complicated problem, in part 3. After that, in part 4 (also coming soon), we'll investigate how physicists can use MC to figure out how particles will behave by building our own particle simulator.

The Average Dinner Notebook will introduce you to the idea of a transition matrix, how we can use weighted sampling, and the idea of using a large number of samples to be sure we're getting a consistent answer.
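To give a flavor of that idea before you open the notebook, here is a minimal sketch of weighted sampling from a transition matrix. It is not the notebook's code: the three meal types, the function name `simulate_meals`, and every probability except the 70% beef-to-chicken figure are assumptions made up for illustration.

```python
import numpy as np

# Hypothetical meal states; only the 0.7 beef -> chicken entry comes from the post.
MEALS = ["beef", "chicken", "vegetarian"]

# Row i holds the probabilities for the next meal, given the previous meal was MEALS[i].
# Every number except the 0.7 is an invented placeholder. Each row must sum to 1.
TRANSITIONS = np.array([
    [0.1, 0.7, 0.2],  # after beef
    [0.5, 0.2, 0.3],  # after chicken
    [0.4, 0.4, 0.2],  # after vegetarian
])


def simulate_meals(n_meals, seed=0):
    """Simulate a chain of meals and return the fraction of each meal type."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(len(MEALS), dtype=int)
    state = 0  # arbitrary starting point: the previous meal was beef
    for _ in range(n_meals):
        # Weighted sampling: pick the next meal using the current meal's row.
        state = rng.choice(len(MEALS), p=TRANSITIONS[state])
        counts[state] += 1
    return counts / n_meals


fractions = simulate_meals(50_000)
for meal, fraction in zip(MEALS, fractions):
    print(f"{meal}: {fraction:.3f}")
```

Rerunning with a different `seed` or a larger `n_meals` is a quick way to check that the fractions have settled down, which is the "large number of samples" point above.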

The Random Walk Notebook will get into deeper territory of using a detailed set of rules to lay out the conditions for success and failure. It will teach you how to break down a big chain of motions into single calculable actions, and how to keep track of winning and losing in a Monte Carlo simulation so that you can find statistically interesting results.
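As a rough sketch of that structure (again, not the notebook's code), the walk can be broken into single steps, each drawn from a random angle, with a win/lose rule checked after every step. The room size, door position, step length, step limit, and the names `walk_reaches_bathroom` and `estimate_success_rate` are all assumptions chosen for illustration.

```python
import numpy as np

# All geometry here is hypothetical: a 10 x 10 room with the bathroom door
# occupying part of the right-hand wall.
ROOM_SIZE = 10.0
DOOR_Y_MIN, DOOR_Y_MAX = 4.0, 6.0
STEP_LENGTH = 1.0
MAX_STEPS = 200  # assumed limit: after this many steps the walker gives up


def walk_reaches_bathroom(rng):
    """Simulate one walk; return True if the walker gets through the door."""
    x, y = ROOM_SIZE / 2, ROOM_SIZE / 2  # start in the middle of the room
    for _ in range(MAX_STEPS):
        # Turn a uniform random number into a direction of travel.
        angle = rng.uniform(0.0, 2.0 * np.pi)
        new_x = x + STEP_LENGTH * np.cos(angle)
        new_y = y + STEP_LENGTH * np.sin(angle)
        # Success rule (a simplification): the step ends at or past the right
        # wall, and its end point lines up with the doorway.
        if new_x >= ROOM_SIZE and DOOR_Y_MIN <= new_y <= DOOR_Y_MAX:
            return True
        # Otherwise the walls block the walker: clip the position to the room.
        x = min(max(new_x, 0.0), ROOM_SIZE)
        y = min(max(new_y, 0.0), ROOM_SIZE)
    return False  # ran out of steps


def estimate_success_rate(n_trials, seed=0):
    """Run many independent walks and return the fraction that succeed."""
    rng = np.random.default_rng(seed)
    wins = sum(walk_reaches_bathroom(rng) for _ in range(n_trials))
    return wins / n_trials


rate = estimate_success_rate(1_000)
print(f"Estimated chance of reaching the bathroom: {rate:.3f}")
```

Keeping the single-walk logic in its own function means the rules for success and failure live in one place, and the tallying of wins stays a one-liner.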

We've gained the ability to use numpy's random number generator to extract statistically meaningful results! That's a huge first step. We've also learned how to frame Monte Carlo problems so that we can use a transition matrix if the problem calls for it. Notice that in the random walk, the random number generator didn't just choose a state that corresponded to winning or losing. Instead, we simulated a chain of steps to see whether we won or not. On top of that, we were also able to convert our random numbers into whatever form we needed, casting them into angles that informed our chain of motions. That's another big part of why Monte Carlo is such a flexible and powerful technique: you don't have to just pick states, but can instead pick individual motions that lead to different possible outcomes.

In the next installment, we'll take everything we've learned from these problems and apply it to a more complicated one. In particular, we'll focus on trying to beat the casino in video poker.

_____

*Read Part 1 of Zach's series on Monte Carlo here. Want to hear Zach discuss data science live? He's hosting an online event February 15th on Recommendation Engines and How They Work. To receive an invitation, apply to our data science bootcamp by February 12th.*
