
Simulating Business Outcomes With Monte Carlo Simulations

By Tony Yiu • February 20, 2020

Before transitioning into data science, Tony Yiu spent nine years in the investments industry as a quantitative researcher, where he worked on portfolio optimization and economic simulation and built numerous forecasting models to predict everything from emerging market equity returns to household spending in retirement. He now works as a data scientist at Solovis, where he uses his experience in statistics, finance, and machine learning to design and build risk analytics software for financial institutions. Tony is also a Metis Bootcamp graduate, and we’re excited to have him back with us as a contributor to the blog, where he’ll write about data science and analytics in business and industry.

Too often in the business world, we think deterministically. We plan our finances, inventories, etc. for the base case – and it usually works out alright. After all, the base case is the scenario that is most likely to unfold. But it’s also an expected value, an average, and wrapped around it is an entire distribution of alternative realities that can impact your business in radical fashion if they happen to unfold.

A better way to forecast and think about the future is probabilistically. We want to imagine the full distribution of outcomes that might unfold so that we can plan accordingly. An intuitive way to do this is via Monte Carlo simulations. In a Monte Carlo simulation, the variable of interest is decomposed into a set of factors. Each factor is then allowed to vary randomly according to its assumed statistical distribution, and the variable of interest is recalculated across many random trials, producing a histogram of outcomes.
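To make the mechanics concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under assumed inputs, not the code from any particular project: the campaign budget, the impression count, and the Beta distributions chosen for the click-through and conversion rates are all hypothetical.

```python
import random
import statistics
from collections import Counter


def simulate_cac(n_trials=10_000, seed=42):
    """Simulate customer acquisition cost (CAC) for a hypothetical ad campaign."""
    rng = random.Random(seed)
    spend = 10_000.0         # assumed fixed campaign budget, in dollars
    impressions = 1_000_000  # assumed fixed number of ad impressions
    outcomes = []
    for _ in range(n_trials):
        # Each factor varies randomly according to its assumed distribution.
        ctr = rng.betavariate(20, 980)       # click-through rate, mean about 2%
        conversion = rng.betavariate(5, 95)  # click-to-customer rate, mean about 5%
        customers = impressions * ctr * conversion
        outcomes.append(spend / customers)   # the variable of interest
    return outcomes


def histogram(values, bin_width=5.0):
    """Count outcomes per bin, keyed by each bin's lower edge."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


cac = simulate_cac()
hist = histogram(cac)
print(f"mean CAC:   {statistics.fmean(cac):.2f}")
print(f"median CAC: {statistics.median(cac):.2f}")
for lower_edge, count in hist.items():
    print(f"{lower_edge:>6.0f}+  {count}")
```

Because CAC divides by two random rates, the simulated distribution has a long right tail, so the mean sits above the median. Swapping in different distributions for each factor is a one-line change, which is part of what makes this approach easy to experiment with.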

This histogram is an estimate of the probability distribution of our possible outcomes. It allows us to ask and answer interesting questions such as:

- What range of values will my outcome fall within 80% of the time?

- How bad is my outcome on average in the worst 10% of scenarios?
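Both questions can be answered directly from a list of simulated outcomes. The sketch below is one simple way to do it, assuming the outcomes are costs (so higher is worse) and using a hypothetical lognormal sample as a stand-in for real simulation output. The function names `central_range` and `worst_case_average` are my own, chosen for illustration.

```python
import random
import statistics


def central_range(outcomes, coverage=0.80):
    """Return (low, high) bounds that contain the central `coverage` share of outcomes."""
    ordered = sorted(outcomes)
    tail = (1.0 - coverage) / 2.0
    low_idx = round(tail * len(ordered))
    high_idx = round((1.0 - tail) * len(ordered)) - 1
    return ordered[low_idx], ordered[high_idx]


def worst_case_average(outcomes, worst_share=0.10, higher_is_worse=True):
    """Average outcome within the worst `worst_share` of scenarios."""
    ordered = sorted(outcomes, reverse=higher_is_worse)
    k = max(1, round(worst_share * len(ordered)))
    return statistics.fmean(ordered[:k])


# Hypothetical stand-in for simulation output: right-skewed costs.
rng = random.Random(7)
simulated_costs = [rng.lognormvariate(3.0, 0.5) for _ in range(10_000)]

low, high = central_range(simulated_costs)
tail_mean = worst_case_average(simulated_costs)
print(f"80% of outcomes fall between {low:.2f} and {high:.2f}")
print(f"average outcome in the worst 10% of scenarios: {tail_mean:.2f}")
```

For an outcome where lower is worse, such as revenue, pass `higher_is_worse=False` so that the average is taken over the bottom of the distribution instead.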

I’ve plotted a hypothetical histogram for Customer Acquisition Cost (CAC) below. We can see from the plot that while most customers can be acquired at a reasonable cost, some will cost us many times the average to acquire. If we considered only simple statistics like the mean, we would probably miss the negative impact of these outliers. But with the full distribution in hand, we can design a marketing strategy that is aligned with its properties. For example, we might build a simple model to predict who is likely to be a high-CAC outlier so that we avoid wasting our scarce resources on them.

By modeling each driver of variance as a separate random variable, we can better understand how each factor individually impacts the ultimate outcome. Yes, something like linear regression can do this as well, but often, the outputs of a simulation are more intuitive, less constrained by statistical assumptions, and easier to visualize for someone not trained in statistics.
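One rough way to see how much each factor contributes is to rerun the simulation while holding one driver fixed at its base-case value and compare the spread of outcomes. The sketch below does this for a hypothetical CAC model with two drivers. The budget, the impression count, the Beta distributions, and the base-case values are all assumptions made for illustration, and freezing one factor at a time is only one of several ways to measure sensitivity.

```python
import random
import statistics


def make_drivers(rng):
    """Hypothetical drivers: name -> (random sampler, fixed base-case value)."""
    return {
        "ctr": (lambda: rng.betavariate(100, 4900), 0.02),        # click-through rate
        "conversion": (lambda: rng.betavariate(50, 950), 0.05),   # click-to-customer rate
    }


def cac(ctr, conversion, spend=10_000.0, impressions=1_000_000):
    """Customer acquisition cost for an assumed budget and impression count."""
    return spend / (impressions * ctr * conversion)


def outcome_spread(frozen=None, n_trials=5_000, seed=0):
    """Std dev of simulated CAC, optionally holding one driver at its base case."""
    rng = random.Random(seed)
    drivers = make_drivers(rng)
    results = []
    for _ in range(n_trials):
        draws = {
            name: (base if name == frozen else sample())
            for name, (sample, base) in drivers.items()
        }
        results.append(cac(**draws))
    return statistics.pstdev(results)


baseline = outcome_spread()
print(f"spread with all drivers random: {baseline:.2f}")
for name in ("ctr", "conversion"):
    reduced = outcome_spread(frozen=name)
    print(f"spread with {name} held fixed: {reduced:.2f}")
```

The driver whose freezing shrinks the spread the most is the one contributing the most uncertainty under these assumptions. Here that is the conversion rate, because its assumed distribution is wider relative to its mean.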

If you want to learn more about how simulations can help your business refine its strategy and operations, you can read my article here. In it, I show how you can create a Monte Carlo simulation (along with example Python code) of a digital marketing campaign. And don’t worry if you’re not in marketing; Monte Carlo simulations are extremely versatile and can be used in a broad range of industries. For example:

- Banks and investment firms can use it to model and measure the risks in their portfolios.

- Entertainment firms can use it to understand the range of potential outcomes for a new movie or television show.

- Insurance companies can use it to model the possible paths that a hurricane or tropical storm might take to better understand their risk of loss.

- A credit card firm might use it to model the probable range of call volume for its call centers so that it knows how many employees to staff each day and at what times.

The future is uncertain, and with uncertainty comes volatility. Monte Carlo simulations allow us to estimate and visualize this volatility so that our plans can account for it.


Read more of Tony's work on his Medium blog here.
