This post was written by Metis Sr. Data Scientist Kevin Birnbaum.
Five years ago, I was on the verge of graduating with a bachelor’s degree in applied mathematics, and I was uncertain about my future career path. When I asked around about additional skills I should acquire to launch my career, a friend encouraged me to hone my skills in Python, predicting (correctly) just how widespread and impactful the language would become.
A recent survey from Burtch Works confirms that my friend was ahead of his time. Whereas in past studies, those with Bachelor’s and Master’s degrees showed stronger support for SAS (while Ph.D. holders routinely favored R, and increasingly, Python), this year’s survey shows that all groups now prefer Python.
My path to Python wasn’t necessarily a direct one, but my learning curve is on trend with others in the industry, according to Burtch Works, which highlights the use and popularity of Python, SAS, and R over a 6-year period.
“Python has more than doubled its share of the vote since its introduction [to the study] in 2016 – from 20% to 41%. R squeaked by SAS to narrowly claim second place with 30% of the vote over SAS’ 29%.”
Now, I'm immersed in Python every day as a Sr. Data Scientist at Metis, where I teach data science-related topics using Python to companies around the world, many of which hire us to transition their analytics teams from SAS to Python. Python is user-friendly and flexible, which allows for greater productivity and faster innovation, and it is backed by a large community and fantastic libraries.
My Transition from R to Python
As the Burtch Works survey shows, SAS and R are still preferred (even if by slight margins) by some working in particular industries like economics and social sciences. Personally, R helped me launch a career in forecasting at Munchkin, a global baby products company. At the time, my primary function was to provide product forecasts for future sales. Historically, most of the work had been done in Excel without any modeling beyond moving averages, and R had some fantastic packages for forecasting and time series modeling that got me started.
Not long after, I got my first exposure to data science through a master’s program in Data Science and Information that I was attending at night. The program was taught in both Python and R. For a class in advanced time series analysis, I was taught in R, and this reinforced my use of R for creating forecasts at the time.
However, as my fluency in Python increased, I found myself drawn to Python for most tasks. I found it much easier to write and, even more importantly as I was steadily improving my skill set, to read, review, and adjust. I found that the support community I was surrounding myself with was increasingly sticking with Python as well, and outside of pure statistics, it was much easier for me to use Python to automate much of my work.
Eventually, Python came to dominate my day-to-day. To incorporate the forecasts I made using R, I had Python scripts that would take the preprocessed data, pass it to R as input, and then bring the output back into Python for final touches. In time, I dropped even this step and translated my forecasts into Python as I became familiar with the packages available to accomplish similar goals.
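For readers curious what a Python-to-R handoff of this kind can look like, here is a minimal sketch. It is not the script I used at the time. It assumes a hypothetical R script, forecast.R, that reads an input CSV path and an output CSV path from its command-line arguments, and it assumes that Rscript (R's command-line front end) is installed and on the PATH. The function names and file names are made up for illustration.

```python
import csv
import subprocess
import tempfile
from pathlib import Path


def write_rows(path, rows, fieldnames):
    """Write preprocessed rows (a list of dicts) to a CSV file that R can read."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_rows(path):
    """Read a CSV file back into a list of dicts (all values come back as strings)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def build_r_command(r_script, input_csv, output_csv):
    """Build the Rscript call; arguments after the script path are passed to the script."""
    return ["Rscript", str(r_script), str(input_csv), str(output_csv)]


def forecast_with_r(rows, fieldnames, r_script="forecast.R"):
    """Hand preprocessed data to R, then pull the forecast back into Python.

    Assumption: r_script (hypothetical) writes its forecast to the output path
    it receives as its second argument.
    """
    with tempfile.TemporaryDirectory() as tmp:
        input_csv = Path(tmp) / "input.csv"
        output_csv = Path(tmp) / "forecast.csv"
        write_rows(input_csv, rows, fieldnames)
        # check=True raises CalledProcessError if the R script exits with an error.
        subprocess.run(build_r_command(r_script, input_csv, output_csv), check=True)
        return read_rows(output_csv)
```

Passing data through CSV files keeps the two languages loosely coupled: the R script can be developed and tested on its own, and the Python side only needs to know where the input and output files live.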
My Python knowledge helped me rise through the ranks at Munchkin, where I eventually established the first data and analytics team and strongly encouraged my employees who only knew R to learn Python.
Python Now Used More Within Financial Services
From personal experience, the results of this Burtch Works study are a no-brainer. Python’s ease of use, readability, and strong community align with the continued upward trajectory of its use. Additionally, Python’s wide adoption among data scientists has made it one of the leading languages for working with data at scale.
One aspect of the study that I found particularly interesting was the shift in language preference within Financial Services, driven largely by the fact that Financial Services companies now allow the use of open source tools like R and Python.
“We’ve...seen several major financial services employers, which began to transition away from SAS a few years ago, complete their transition to open source tools over the past year, which has made the surge towards Python even more apparent than before,” writes Burtch Works. The report goes on to detail that Python surged from 19% of the vote in 2017 to 28% in 2018, and that this year it leaped again to 41% to capture first place among professionals in financial services firms.
Heidi Kalish, a Burtch Works recruiter who specializes in data science and analytics roles within financial services, said, “There has definitely been a huge shift with all of my financial services clients. Currently, all of the roles I’m working on are looking for experience with Python or R. Even if these organizations still have legacy tools, I think they understand that in order to keep up with the market and land top talent, they’d be wise to look at open-source tools. When talking to my candidates, especially those that deal with unstructured data, their tool of choice is most commonly Python.”
As the overall survey results indicate, no matter the industry or sector, the most important aspect of data science and predictive analytics is that the field is constantly evolving. This should not be intimidating, but rather, very exciting! It means the consistent introduction of new specialties and opportunities for growth, and it means that collaborating and staying plugged into blogs, webinars, conferences, and meetups are a big part of thriving in the profession.
With much of the field so clearly shifting towards Python, it’s time to take my friend’s clairvoyant advice from years ago and jump on the Python bandwagon with career growth and opportunities in mind.