Understanding the Limitations of Your Model and Its Assumptions

By Tony Yiu • April 14, 2020

This post by Data Scientist Tony Yiu is a summary of a longer blog he published on his Medium account, which you can read in full here.

Models provide necessary simplifications to a complex world. They reduce real-world phenomena into a set of key features and relationships that allow us to explain, analyze, and sometimes even predict.  But there is a cost to these powerful benefits. Every model comes with several key assumptions, and if these assumptions are not met, the output of the model can become unreliable or even downright dangerous.

In the finance industry, an investment’s returns are famously assumed to be normally distributed, and the volatility (aka standard deviation) of those returns is assumed to be a good approximation of the investment’s risk. These assumptions flow into nearly every metric that portfolio managers use to structure their portfolios or that banks use to measure and hedge their risk. Even after the recession in 2008, when supposed six-sigma-type losses (based on the risk measures of the time) on real estate-related investments and loans wreaked havoc across the industry and almost brought down the global economy, we continue to equate volatility with risk and to assume that asset returns are normally distributed.
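To put "six-sigma" in perspective: if daily returns really were normal, a loss six standard deviations below the mean would be almost impossibly rare. Here is a quick back-of-the-envelope sketch using only Python's standard library (the 252 trading days per year is a conventional assumption):

```python
import math

def normal_tail_prob(sigmas: float) -> float:
    """P(Z <= -sigmas) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(sigmas / math.sqrt(2))

p = normal_tail_prob(6)   # chance of a six-sigma (or worse) day under normality
years = 1 / p / 252       # expected wait between such days, at 252 trading days/year
print(f"P(six-sigma daily loss) = {p:.2e}")
print(f"Expected roughly once every {years:,.0f} years")
```

Under the normal model, a six-sigma day should arrive about once every four million years, yet 2008 served up several such losses. That gap is the first hint that the normality assumption breaks exactly when it matters.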

The truth is that when markets are calm and returns are smooth, which is most of the time, asset returns (whether measured daily, weekly, or monthly) do look approximately normally distributed. So it’s easy to be fooled into thinking, “It works most of the time, so it’s good enough for me.”

What this type of thinking misses is a crucial question: “Does the model work when I really need it to work?”

A risk model needs to work during times of great stress because it’s supposed to answer the following questions:

  • In a worst-case scenario, how much will I lose?
  • Where am I likely to get hit the hardest?

It’s in attempting to answer these questions that the assumption of normality (and the reliance on historical data) fails us.  We end up understating both the frequency and magnitude of the worst-case scenario.
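One way to see this understatement concretely is to simulate fat-tailed returns and compare the extreme loss a normal model predicts against what the simulated distribution actually produces. The sketch below is purely illustrative, not real market data: it uses Student-t draws with 3 degrees of freedom as a stand-in for fat-tailed returns.

```python
import math
import random
import statistics

random.seed(42)
N = 100_000
DF = 3  # degrees of freedom; low values give heavy tails (illustrative choice)

def student_t(df: int) -> float:
    """Draw from a Student-t distribution: normal draw divided by sqrt(chi-square/df)."""
    z = random.gauss(0, 1)
    chi2 = sum(random.gauss(0, 1) ** 2 for _ in range(df))
    return z / math.sqrt(chi2 / df)

returns = [student_t(DF) for _ in range(N)]
mu = statistics.fmean(returns)
sigma = statistics.pstdev(returns)

# 99.9% worst-case loss if we assume normality: mu - 3.09 * sigma
normal_var = mu - 3.09 * sigma

# 99.9% worst-case loss actually observed: the 0.1st percentile of the sample
empirical_var = sorted(returns)[N // 1000]

print(f"Normal-model 99.9% loss threshold: {normal_var:.2f}")
print(f"Empirical 99.9% loss threshold:    {empirical_var:.2f}")
```

The normal model, fitted to the very same data, predicts a far milder worst case than the fat-tailed distribution actually delivers; that gap is the understatement of frequency and magnitude described above.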

This is just one example of why it’s important to understand your model’s assumptions, as well as the consequences when those assumptions are violated. A violated assumption doesn’t necessarily render your model unusable, but it means you need to build contingencies, or even alternative models, to hedge against these shortcomings.


Before transitioning into data science, Tony Yiu spent nine years in the investments industry as a quantitative researcher, where he worked on portfolio optimization and economic simulation and built numerous forecasting models to predict everything from emerging market equity returns to household spending in retirement. He now works as a data scientist at Solovis, where he uses his experience in statistics, finance, and machine learning to design and build risk analytics software for financial institutions. Tony is also a Metis Bootcamp graduate, and we’re excited to have him back with us as a contributor to the blog, where he’ll write about data science and analytics in business and industry.
