Q&A with Nathan Grossman, Data Scientist at Wells Fargo and Metis Intro to Data Science Instructor
By Metis • September 06, 2019
On Wednesday evening, we hosted a live Ask Me Anything session on Zoom, during which attendees asked questions of Nathan Grossman, Data Scientist at Wells Fargo and Instructor of our upcoming Introduction to Data Science course. He answered questions about his career, what it takes to be a successful data scientist, and what students should expect of his upcoming course.
Below, read a recap of the hour-long chat – and if you're interested in taking Nathan's course, be sure to enroll soon! It kicks off Monday, September 9th and runs on Monday and Thursday evenings (6:30 - 9:30 PT) through October 21st. Learn more and enroll here.
Do you think this course is a good way to prepare for the Metis Data Science Bootcamp?
Yes, I think this course is excellent preparation if you're thinking about applying to the bootcamp. Number one, it will give you some vision as to what data science really is. You may have some ideas, but you'll have a better idea after taking the course. It will also give you some vision as to what Metis is like as a teaching organization. So if you like the teaching style, and you think that data science is an interesting field, then this course will help inform your decision as to whether or not to move on to the bootcamp.
Number two, I think the course is a good predecessor to the bootcamp because it can act as a bit of pre-work for you. This course is basically a snapshot of what the bootcamp itself is going to look like. It's for someone who's already seen some Python and has some level of coding background. The bootcamp is a large investment, so this course allows you to see what it's going to look like, at least in terms of some of the basics covered in the bootcamp.
How can I break into data science and get that first job in the field?
First of all, you need to learn something about how to do data science, so taking a course like this one is a great way to get started. There are other ways, too. There are now Masters of Data Science programs at universities, or data science bootcamps (like the one offered at Metis) that you can choose from.
Data science really just comes down to analyzing data and using mathematical tools and knowledge and programming tools and knowledge. That's very generic, but it can be applied to many different fields, so if you've been in the workforce for some time already, you can add on to the skills and knowledge you already have.
For example, data science can be applied to medicine or pharmacy in terms of sequencing DNA. It can be applied to finance or insurance in terms of predicting the likelihood of certain outcomes that may require an insurance payment. It can be applied to marketing in terms of trying to understand what product is likely to sell well and to what subset of the population. There are a lot of different applications of data science. Fusing what you already know with all your new knowledge will result in what's called domain expertise, which is just another way of saying that you know the nuts and bolts of a field, and that is extremely valued by employers.
So when you think about wanting to break into data science, try and leverage your prior experience even though that experience probably doesn't relate to data science per se. If you try to get into the same industry where you actually know something about the nuts and bolts of the business, you have a much better chance of breaking in.
Did you go into data science with some sort of domain expertise?
I started out my career as a more traditional electrical engineer doing signal processing, and that really comes down to doing applied statistics and a bit of programming. I really loved the experience I had as an electrical engineer but eventually wanted to move on.
I did not go directly from doing signal processing to doing data science at a bank. Instead, I went from doing signal processing to using that same math skillset for data science with a telecommunications twist, because as a signal processing engineer, I had been doing work for telecommunications applications.
So for me, telecom applications were the continuity of domain or field, and that helped me bridge the gap from being an electrical engineer to being a data scientist. Once I broke into the data science area, I was able to move to another domain within data science, and that's where I am now, in FinTech.
Why is data science such a sought-after field right now?
There are multiple causes that came together at the same time to drive this. The most basic thing is that there's more need for data science now because there's just more data – by which I mean that almost everything we do now is online or in some way electronic. If you go back 50 years, people went to the grocery store, generally paid in cash, and the cash register was a mechanical object. There wasn't much of a record of that transaction. Now, whether you buy online from Amazon or buy at your local store, there's a lot of information being recorded about every transaction. When you go online, every single thing that you click on is recorded as a signal of your interests.
Those are examples of marketing data, but you can apply that same idea to FinTech, where I work, or medicine, or many other fields. There are reams and reams of data that did not exist, and were not collectible, just a few years ago.
So the challenge is how do you extract some insight from that data, and it just so happens that a lot of the tools for extracting that insight have matured to a point where they're much more useful today than they were just a few years ago.
When you mention tools, which ones do you mean?
Some of the tools are what I would call computer science-ish or software engineering-ish, meaning that computers are getting better at dealing with large quantities of data. That's one side of it. The other tools are a little bit more mathematics or statistics or artificial intelligence-ish.
For example, in fraud detection in FinTech, algorithms can differentiate between a normal transaction and a fraudulent transaction. These types of algorithms have become a lot more effective and usable in the last few years. And incidentally, some of those algorithms are part of what we teach in the Intro to Data Science course.
Speaking of the course, do you learn how to scrape data in the Intro course?
It's not a topic explicitly covered in the lectures. However, that doesn't mean you can't learn to do the basics. Data scraping in principle is not that complicated. It simply means getting data from a source where it's not laid out in a convenient fashion and putting it into some convenient format.
What do I mean by convenient format? Let's say we're talking about an Excel table with rows and columns, where each column has a name. Let's say it's data about a bunch of Olympic athletes. It might have height, weight, age, etc. That's a very convenient format.
An inconvenient format would be just a bunch of biographies written in plain text about all these different athletes. The text might say that Jane Doe is an Olympic volleyball player, she's female, she's 107 years old (truly an outlier for longevity), etc. This written biography would be great for humans to read through, but machines are not as comfortable with that. So in this case, scraping would be taking all that data about Jane Doe and putting it into your Excel table.
Again, there's no lecture explicitly on this in the class, but you can still learn about it. There are a number of tools out there and you just have to learn how to use them. If you take the class and you do a project (not mandatory but recommended!), you can get exposure to a scraping tool, and I'd be happy to suggest some to you, depending on the nature of the data you're looking to scrape.
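To make the idea above concrete, here is a minimal sketch in Python of turning plain-text athlete biographies into a table with named columns. It is only an illustration under stated assumptions: the biographies, the column names, and the helper functions `bio_to_row` and `bios_to_csv` are all hypothetical, and it uses regular expressions from Python's standard library rather than any particular scraping tool recommended in the course.

```python
import csv
import io
import re

# Hypothetical plain-text biographies; names and numbers are made up for illustration.
BIOS = [
    "Jane Doe is an Olympic volleyball player. She is 29 years old, 185 cm tall, and weighs 72 kg.",
    "John Roe is an Olympic swimmer. He is 24 years old, 193 cm tall, and weighs 88 kg.",
]

# One regular expression per column we want in the final table (assumed sentence shapes).
PATTERNS = {
    "name": re.compile(r"^(\w+ \w+) is an Olympic"),
    "sport": re.compile(r"is an Olympic (\w+)"),
    "age": re.compile(r"(\d+) years old"),
    "height_cm": re.compile(r"(\d+) cm tall"),
    "weight_kg": re.compile(r"weighs (\d+) kg"),
}


def bio_to_row(bio):
    """Extract each field from one biography; a field that is not found becomes ''."""
    row = {}
    for column, pattern in PATTERNS.items():
        match = pattern.search(bio)
        row[column] = match.group(1) if match else ""
    return row


def bios_to_csv(bios):
    """Convert a list of biographies into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(PATTERNS))
    writer.writeheader()
    writer.writerows(bio_to_row(bio) for bio in bios)
    return buffer.getvalue()


if __name__ == "__main__":
    # The printed CSV can be opened directly in Excel or another spreadsheet tool.
    print(bios_to_csv(BIOS))
```

Real biographies are rarely this regular, so patterns like these would need adjusting to the actual text, and a dedicated scraping library may be a better fit depending on the source.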
What are the benefits of completing a project in the course?
I strongly encourage you to do a project during the course. It can be anything you choose and in whatever domain you choose. Presumably, you'll choose a topic that you find personally interesting. Like anything else, in data science, you learn best by doing.
Also, if you do a data science project, you have a piece of work that you can then show to prospective employers. Another recommendation to all aspiring data scientists is whenever you do any kind of project or build any kind of model, put it on your personal GitHub. Make it so people can easily see your full portfolio and your body of work.
Do I need a Ph.D. to be a data scientist? Or what about a Master's or a STEM degree?
It really depends on what kind of data science you want to do. For example, if you want to be doing research on new cutting-edge machine learning algorithms at Google, it really would help to have a Ph.D. But the vast majority of data science jobs are not like that, so for those, no, you don't need a Ph.D.
Do you need a Master's? It could help, but is it absolutely necessary? No. If you're looking on job boards, most say that at least a Bachelor's degree is required, and then they might say a Master's or Ph.D. would be helpful. So you might get extra credit, but suffice it to say that no, you don't need a Ph.D. or a Master's – you can do it with just a Bachelor's (in most cases).
What do you find most exciting about data science?
It's not so much the theoretical research; it's the more practical applications. I didn't always feel this way back when I was a traditional engineer. I liked the theoretical work more than practical work. Now I've changed – maybe it's just because I'm getting older. Maybe I've just become a grumpy old man.
But really, I think part of it goes back to the reason why I think data science is so cool, which is that it gives me the opportunity to use the mathematics and statistics skillset that I developed as an engineer. But instead of using that skill set to design an individual product, I get to use it to help an entire organization make better decisions. I just feel like I have much more leverage or impact now.
And almost by definition, if you want to impact an entire organization, what you're thinking about are concrete applied practical problems rather than abstract questions. The good news for most people out there who don't have a Ph.D. is that the practical side is the side that does not require a Ph.D.
I am unable to attend on Mondays. Are the classes recorded?
Yes. The classes are all recorded, so if you have to miss a session, you can catch up. Also, you'll have access to all class recordings for 6 months after the course ends, so you can review the lectures for a while after it's all said and done.
Learn more about the Intro to Data Science course and enroll here.