Trent Hauck is a Senior Data Scientist at Zymergen and the instructor of our upcoming Live Online Introduction to Data Science part-time professional development course, which will run from January 22nd - March 1st on Monday and Thursday evenings from 6:30 - 9:30 pm PST. View full course details and enroll here.
In his role as a Senior Data Scientist at the biotech startup Zymergen, Trent Hauck builds data products that help biologists and other scientists improve their decision outcomes. In fact, all of his professional pursuits have a similar end-goal: to help clients/students/readers reach desired data science outcomes.
Aside from his full-time role at Zymergen, he's a data science instructor and the author of two books, Instant Data Intensive Apps with pandas How-to and Scikit-Learn Cookbook. He's routinely looking for new ways to communicate and teach others about data science while continually pursuing new information for his own sake in an effort to stay abreast of a quickly evolving field.
Read below to find out who inspires his work, what resources he uses to stay up-to-date, what he's looking forward to most about the upcoming Live Online course, and more.
You've been a data scientist for a while now, working at places like Zymergen, Zulily, and others. When you think back to the start of your career, what foundational elements do you think were most important to your success? What do you think all emerging data scientists need to learn and know in order to get moving in the right direction?
It's a bit of an overgeneralization, but being effective as a data scientist can be boiled down to understanding the context of particular problems (for example, the business needs) and having the know-how to deliver solutions to those problems. With respect to technical skills, there's a continuous cycle of reading new material to learn about techniques, then using them to solidify an understanding of how they can be applied in practice. With that in mind, it's fundamental to find 'right-sized' projects early in your career that let you cycle between understanding a particular problem and finding solutions. A fast cycle, combined with an appropriate amount of dissimilarity between projects, can help a data scientist who is early in their career become effective quickly.
I'm slightly biased toward the software engineering aspects of DS, so with respect to what is essential to learn, I'd focus on learning the APIs of a few integral packages and use those APIs as a scaffold for expanding, depending on where the path leads. For example, scikit-learn's API is very easy to follow and can serve as a way to use and understand linear regression; from there, it's simple to plug in ridge regression and compare the output. This also works for the tools required to produce tools. For example, understanding how to deliver an API with a popular package (e.g. Flask in Python) will help you learn the minimum for turning a model into an application.
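[Editor's illustration, not part of the interview: a minimal sketch of the scikit-learn point above, assuming NumPy and scikit-learn are installed. The synthetic data, the `fit_and_score` helper, and the choice of `alpha=1.0` are made up for the example. It shows that swapping linear regression for ridge regression only means changing the estimator object, because both expose the same `fit` and `score` methods.]

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

# Synthetic data (invented for illustration): y is a linear function
# of three features plus a small amount of Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_coef = np.array([1.5, -2.0, 0.5])
y = X @ true_coef + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def fit_and_score(model):
    """Fit any scikit-learn regressor and return R^2 on the held-out split."""
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Both estimators share the same fit/score interface, so comparing
# them is just a matter of swapping the object.
models = {"linear": LinearRegression(), "ridge": Ridge(alpha=1.0)}
scores = {name: fit_and_score(model) for name, model in models.items()}

for name, model in models.items():
    print(name, "R^2 =", round(scores[name], 3), "coef =", model.coef_.round(3))
```

[Because the helper only relies on the shared estimator interface, other scikit-learn regressors could be added to the `models` dictionary without changing the rest of the code.]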
You're all set to teach the upcoming Live Online Intro to Data Science course. What about the course makes you excited to teach it?
I'm very excited about teaching people with a diverse skill set the tools of a data scientist. Data science is certainly exciting on its own, but the true value in the skills comes from using data science to augment other activities. For example, an accountant with a CPA and the ability to perform statistical analysis on complex datasets is likely more valuable in today's world than someone with purely DS skills or purely accounting skills. This can apply to tons of other professions.
I teach the class to put people in a position to synthesize new ideas by combining their domain expertise with data science knowledge, and it's very rewarding to see that thinking begin to bud during the course.
You've already written two books - Instant Data Intensive Apps with pandas How-to, and Scikit-Learn Cookbook. Any future book plans in the works? Or if not – if you were to choose a topic to dive into and write about today, what would it be and why?
We just released the 2nd edition of the Scikit-Learn Cookbook, so no immediate plans for new work. If I were to work on a topic next, it would probably be the practical applications of embedding/hidden layers learned through training neural networks. There are many exciting applications where these layers can be used to further understanding of the underlying process, while also providing effective features for downstream tasks, such as classification.
If you are at liberty to share, what are some projects you've worked on recently that you're particularly proud of?
I'm particularly proud of releasing the 2nd edition of the Scikit-Learn Cookbook, as mentioned. I enjoy teaching live, but I also really enjoy writing, as it helps me sharpen my ideas and understanding of the topics. With respect to the first question, this is another task I'd recommend for early-career folks – even if there are no plans for sharing it.
Otherwise, I'll have to remain silent :).
Whose work inspires your own?
There are so many that I'll focus on a few areas. From early in my career, when I was learning how to be an effective analyst, Andrew Gelman's books and writing on multivariate data analysis have always stuck with me. Relatedly, I have a special place in my heart for probabilistic generative modeling, so the work by David Blei, Matt Hoffman, and Hanna Wallach is particularly interesting. More recently, the work on implicit modeling (especially in a probabilistic context) has been very interesting, and in that light, Ian Goodfellow and Dustin Tran have produced personally impactful work in those areas.
How do you stay up-to-date in a quickly evolving field?
I go back to the philosophy that reading raises one's ceiling, and doing raises one's floor. Therefore, I spend a lot of time reviewing lectures and reading academic research, books, etc. It's important, then, to actually attempt these techniques in order to gain a deeper understanding of how they work and their trade-offs. The other piece of advice on keeping up-to-date is realizing you can't be strong in everything all the time, so having a steady base and effective strategies to acquire the necessary knowledge at the time it needs to be applied are critical.
Last but not least, in your bio on the Metis site, you mention you can't say no when someone asks you to coffee. What's your go-to coffee order?
My favorite question – I typically like to keep it simple with a pour-over of whatever beans the particular coffee shop has at the time. That said, if there's a specialty drink, I can't say no to that either. For instance, I'm currently enjoying a Cafe con Leche at a shop that specializes in Cuban-style coffee drinks.
Enroll here to learn from Trent. Want to try out the Live Online format pre-enrollment? Register for a free sample class on 1/10. [And please take note – special New Year's pricing expires 1/12. Get $350 off the course price if you enroll by then!]