
Understanding the Limitations of Your Model and Its Assumptions

By Tony Yiu • April 14, 2020

This post by Data Scientist Tony Yiu is a summary of a longer blog he published on his Medium account, which you can read in full here.

Models provide necessary simplifications to a complex world. They reduce real-world phenomena into a set of key features and relationships that allow us to explain, analyze, and sometimes even predict.  But there is a cost to these powerful benefits. Every model comes with several key assumptions, and if these assumptions are not met, the output of the model can become unreliable or even downright dangerous.

In the finance industry, an investment’s returns are famously assumed to be normally distributed, and the volatility (aka standard deviation) of those returns is assumed to be a good approximation of the investment’s risk.  These assumptions flow into nearly every metric that portfolio managers use to structure their portfolios or that banks use to measure and hedge their risk. Even after the recession in 2008, when supposedly six-sigma-type losses (based on the risk measures at the time) on real estate-related investments and loans wreaked havoc across the industry and almost brought down the global economy, we continue to equate volatility with risk and assume that asset returns are normally distributed.
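To get a feel for what "six-sigma" means under the normality assumption, here is a minimal sketch using only Python's standard library. The function names are my own, and the 252 trading days per year and the independence of daily returns are illustrative assumptions, not something taken from any particular risk system.

```python
import math


def normal_tail_probability(k_sigma: float) -> float:
    """Probability that a normally distributed return lands more than
    k_sigma standard deviations below its mean (one-sided tail)."""
    return 0.5 * math.erfc(k_sigma / math.sqrt(2.0))


def expected_wait_in_years(k_sigma: float, periods_per_year: int = 252) -> float:
    """Average number of years between such losses, assuming independent
    daily returns and 252 trading days per year (both are assumptions)."""
    return 1.0 / (normal_tail_probability(k_sigma) * periods_per_year)


if __name__ == "__main__":
    for k in (2, 4, 6):
        p = normal_tail_probability(k)
        years = expected_wait_in_years(k)
        print(f"{k}-sigma daily loss: probability {p:.3e}, "
              f"roughly once every {years:,.1f} years")
```

If returns really were normal, a six-sigma daily loss would have a probability of about one in a billion, so observing one in practice says more about the assumption than about bad luck.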

The truth is that when markets are calm and returns are smooth, which is most of the time, asset returns (whether you measure them daily, weekly, monthly, etc.) are approximately normally distributed.  So it’s easy to be fooled into thinking, “it works most of the time, so it’s good enough for me.”

What this type of thinking is missing is the following question: “Does the model work when I really need it to work?”  

A risk model needs to work during times of great stress because it’s supposed to answer the following questions:

  • In a worst-case scenario, how much will I lose?
  • Where am I likely to get hit the hardest?

It’s in attempting to answer these questions that the assumption of normality (and the reliance on historical data) fails us.  We end up understating both the frequency and magnitude of the worst-case scenario.
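The understatement can be illustrated with a small simulation. This is only a sketch under stated assumptions: the "true" returns are drawn from a Student-t distribution with 3 degrees of freedom (my stand-in for a fat-tailed market, not a claim about real asset returns), and a normal distribution is then fitted to the sample mean and standard deviation. All function names and parameters here are hypothetical.

```python
import math
import random
import statistics


def simulate_fat_tailed_returns(n, df=3, seed=0):
    """Draw n Student-t returns (df degrees of freedom) as a standard
    normal divided by the square root of a scaled chi-square variable."""
    rng = random.Random(seed)
    returns = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
        returns.append(z / math.sqrt(chi2 / df))
    return returns


def compare_tail_risk(returns, k_sigma=4.0, tail_prob=0.001):
    """Compare what a fitted normal model predicts with what the sample shows."""
    mu = statistics.fmean(returns)
    sigma = math.sqrt(statistics.fmean((r - mu) ** 2 for r in returns))
    fitted = statistics.NormalDist(mu, sigma)

    # Frequency: how often do losses exceed k_sigma standard deviations?
    threshold = mu - k_sigma * sigma
    observed_freq = sum(r < threshold for r in returns) / len(returns)
    normal_freq = fitted.cdf(threshold)

    # Magnitude: how large is the loss at the tail_prob quantile?
    ordered = sorted(returns)
    observed_var = -ordered[int(tail_prob * len(ordered))]
    normal_var = -fitted.inv_cdf(tail_prob)

    return {
        "observed_freq": observed_freq,
        "normal_freq": normal_freq,
        "observed_var": observed_var,
        "normal_var": normal_var,
    }


if __name__ == "__main__":
    result = compare_tail_risk(simulate_fat_tailed_returns(100_000))
    for name, value in result.items():
        print(f"{name}: {value:.6f}")
```

In this setup the fitted normal model should report both far fewer 4-sigma losses and a smaller 0.1% worst-case loss than the simulated data actually contains, which is the same failure described above: frequency and magnitude are both understated.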

This is just one example of why it’s important to understand the assumptions of your model as well as the implications of violating those assumptions.  A violated assumption doesn’t necessarily render your model unusable, but it means you need to build contingencies or even alternative models to hedge against these shortcomings.

_____

Before transitioning into data science, Tony Yiu spent nine years in the investments industry as a quantitative researcher, where he worked on portfolio optimization and economic simulation, and built numerous forecasting models to predict everything from emerging market equity returns to household spending in retirement. He now works as a data scientist at Solovis, where he uses his experience in statistics, finance, and machine learning to design and build risk analytics software for financial institutions. Tony is also a Metis Bootcamp graduate and we’re excited to have him back with us as a contributor to the blog, where he’ll write about data science and analytics in business and industry.

