Before transitioning into data science, Tony Yiu spent nine years in the investments industry as a quantitative researcher, where he worked on portfolio optimization and economic simulation and built numerous forecasting models to predict everything from emerging market equity returns to household spending in retirement. He now works as a data scientist at Solovis, where he uses his experience in statistics, finance, and machine learning to design and build risk analytics software for financial institutions. Tony is also a Metis Bootcamp graduate, and we’re excited to have him back with us as a contributor to the blog, where he’ll write about data science and analytics in business and industry.
Too often in the business world, we think deterministically. We plan our finances, inventories, etc. for the base case – and it usually works out alright. After all, the base case is the scenario that is most likely to unfold. But it’s also an expected value, an average, and wrapped around it is an entire distribution of alternative realities that can impact your business in radical fashion if they happen to unfold.
A better way to forecast and think about the future is probabilistically. We want to imagine the full distribution of outcomes that might unfold so that we can plan accordingly. An intuitive way to do this is via Monte Carlo simulations. In a Monte Carlo simulation, the variable of interest is decomposed into a set of factors. Each of these factors is then allowed to vary randomly according to its assumed statistical distribution, and the outcome is recomputed over many random draws, producing a histogram of outcomes for the variable of interest.
This histogram is an estimate of the probability distribution of our possible outcomes. It allows us to ask and answer interesting questions such as:
- What range of values will my outcome fall within 80% of the time?
- How bad is my outcome on average in the worst 10% of scenarios?
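As a minimal sketch of how such a simulation can answer both questions, the Python below models a hypothetical customer acquisition cost as cost per click divided by conversion rate. The decomposition, the distributions, and their parameters are illustrative assumptions, not figures from any real campaign, and the function names are made up for this example.

```python
import random
import statistics


def simulate_cac(n_trials=10_000, seed=42):
    """Return sorted simulated outcomes for a hypothetical CAC model."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_trials):
        # Assumed factor distributions (illustrative only):
        cost_per_click = rng.lognormvariate(0.0, 0.25)  # median of $1.00
        conversion_rate = rng.betavariate(5, 95)        # mean of 5%
        outcomes.append(cost_per_click / conversion_rate)
    return sorted(outcomes)


def central_interval(sorted_outcomes, coverage=0.80):
    """Range that contains the central `coverage` share of outcomes."""
    n = len(sorted_outcomes)
    tail = (1.0 - coverage) / 2.0
    low = sorted_outcomes[round(tail * n)]
    high = sorted_outcomes[round((1.0 - tail) * n) - 1]
    return low, high


def worst_tail_mean(sorted_outcomes, tail=0.10):
    """Average outcome in the worst `tail` share (here: the highest costs)."""
    k = max(1, round(tail * len(sorted_outcomes)))
    return statistics.fmean(sorted_outcomes[-k:])


if __name__ == "__main__":
    cac = simulate_cac()
    low, high = central_interval(cac)
    print(f"80% of simulated CAC values fall between {low:.2f} and {high:.2f}")
    print(f"Average CAC in the worst 10% of scenarios: {worst_tail_mean(cac):.2f}")
```

Because a higher acquisition cost is the bad outcome here, the "worst 10%" is the upper tail. For a variable like revenue, where low values are bad, you would average the lowest outcomes instead.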
I’ve plotted a hypothetical histogram for Customer Acquisition Cost (CAC) below. We can see from the plot that while most customers can be acquired at a reasonable cost, there are some that will cost us many times the average to acquire. If we only considered simple statistics like the mean, we would probably miss the negative impact of these outliers. But with this data, we can design a marketing strategy that is aligned with the properties of the distribution. For example, we might build a simple model to predict which customers are likely to be high CAC outliers in order to avoid wasting our scarce resources on them.
By modeling each driver of variance as a separate random variable, we can better understand how each factor individually impacts the ultimate outcome. Yes, something like linear regression can do this as well, but often, the outputs of a simulation are more intuitive, less constrained by statistical assumptions, and easier to visualize for someone not trained in statistics.
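One simple way to see each factor's individual impact is to let a single factor vary while holding the others at a baseline value, then compare the spread of the results. The sketch below does this for a hypothetical two-factor CAC model; the factor names, distributions, and baseline values are assumptions chosen for illustration. This one-at-a-time approach ignores interactions between factors, so treat it as a first look rather than a full sensitivity analysis.

```python
import random
import statistics

# Hypothetical baseline ("base case") values for each factor.
BASELINE = {"cost_per_click": 1.0, "conversion_rate": 0.05}


def draw_factors(rng):
    """Draw one random value per factor from its assumed distribution."""
    return {
        "cost_per_click": rng.lognormvariate(0.0, 0.25),
        "conversion_rate": rng.betavariate(5, 95),
    }


def cac(factors):
    """Customer acquisition cost under the assumed decomposition."""
    return factors["cost_per_click"] / factors["conversion_rate"]


def spread_when_only_varying(factor, n_trials=10_000, seed=0):
    """Standard deviation of CAC when only `factor` is random."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_trials):
        scenario = dict(BASELINE)                  # everything at base case
        scenario[factor] = draw_factors(rng)[factor]  # except this factor
        outcomes.append(cac(scenario))
    return statistics.pstdev(outcomes)


if __name__ == "__main__":
    for name in BASELINE:
        spread = spread_when_only_varying(name)
        print(f"Varying only {name}: CAC standard deviation = {spread:.2f}")
```

Under these assumed distributions, uncertainty in the conversion rate produces a wider spread in CAC than uncertainty in the cost per click, which is the kind of finding that tells you where better data or tighter control would pay off most.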
If you want to learn more about how simulations can help your business refine its strategy and operations, you can read my article here. In it, I show how you can create a Monte Carlo simulation (along with example Python code) of a digital marketing campaign. And don’t worry if you’re not in marketing; Monte Carlo simulations are extremely versatile and can be used in a broad range of industries. For example:
- Banks and investment firms can use them to model and measure the risks in their portfolios.
- Entertainment firms can use them to understand the range of potential outcomes for a new movie or television show.
- Insurance companies can use them to model the possible paths that a hurricane or tropical storm might take to better understand their risk of loss.
- A credit card firm might use them to model the probable range of call volume for its call centers so that it knows how many employees to staff each day and at what times.
The future is uncertain, and with uncertainty comes volatility. Monte Carlo simulations allow us to estimate and visualize this volatility so that our plans can account for it.
Read more of Tony's work on his Medium blog here.