
Sr. Data Scientist Roundup: Climate Modeling, Deep Learning Cheat Sheet, & NLP Pipeline Management

By Emily Wilson • August 21, 2018

When our Sr. Data Scientists aren't teaching the intensive, 12-week bootcamps, they're working on a variety of other projects. This monthly blog series tracks and discusses some of their recent activities and accomplishments.

Julia Lintern, Metis Sr. Data Scientist, NYC

During her 2018 passion quarter (which Metis Sr. Data Scientists get each year), Julia Lintern has been studying CO2 measurements from ice core data over the long timescale of 120 to 800,000 years ago. This CO2 dataset perhaps extends back further than any other, she writes on her blog. And lucky for us, she's been writing about her process and results along the way. For more, read her two posts so far: Basic Climate Modeling with a Simple Sinusoidal Regression and Basic Climate Modeling with ARIMA & Python.

Brendan Herger, Metis Sr. Data Scientist, Seattle

Brendan Herger is four months into his role as one of our Sr. Data Scientists and he recently taught his first bootcamp cohort. In a new blog post called Learning by Teaching, he discusses teaching as "a humbling, impactful opportunity" and explains how he's growing and learning from his experiences and students. 

In another blog post, Herger provides an Intro to Keras Layers. "Deep Learning is a powerful toolset, but it also involves a steep learning curve and a radical paradigm shift," he explains, which is why he created this "cheat sheet." In it, he walks through some of the basic principles of deep learning by discussing its fundamental building blocks.

Side note: He'll be speaking on this very topic on August 29th, live in Seattle. You can RSVP here to attend in person or here to stream it live!

Zach Miller, Metis Sr. Data Scientist, Chicago

Sr. Data Scientist Zach Miller is an active blogger, writing about ongoing or finished projects, digging into various aspects of data science, and providing tutorials for readers. In his latest post, NLP Pipeline Management - Taking the Pains out of NLP, he tackles "the most frustrating part of Natural Language Processing," which he says is "dealing with all the various 'valid' combinations that can occur."

"As an example," he continues, "I might want to try cleaning the text with a stemmer and a lemmatizer - all while still tying to a vectorizer that works by counting up words. Well, that's two possible combinations of objects that I need to create, manage, train, and save for later. If I then want to try both of those combinations with a vectorizer that scales by word occurrence, that's now four combinations. If I then add in trying different topic reducers like LDA, LSA, and NMF, I'm up to 12 total valid combinations that I need to try. If I then combine that with 6 different models... 72 combinations. It can become infuriating quite quickly." 
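To see how quickly those numbers pile up, here is a minimal sketch that enumerates the combinations from the quote. It is not Miller's pipeline code; the option labels are hypothetical, and the six model names are placeholders because the quote does not list them.

```python
from itertools import product

# Hypothetical labels for the options described in the quote.
cleaners = ["stemmer", "lemmatizer"]
vectorizers = ["count", "occurrence_scaled"]
reducers = ["LDA", "LSA", "NMF"]
# The quote mentions 6 models without naming them, so these are placeholders.
models = [f"model_{i}" for i in range(1, 7)]

# Count the valid combinations as each stage is added.
stages = [cleaners, vectorizers, reducers, models]
running_counts = [
    len(list(product(*stages[:n]))) for n in range(1, len(stages) + 1)
]
print(running_counts)  # [2, 4, 12, 72]

# Every full pipeline is one (cleaner, vectorizer, reducer, model) tuple.
pipelines = list(product(*stages))
print(len(pipelines))  # 72
```

Each new stage multiplies the total, which is why adding one more option anywhere in the chain inflates the number of objects to create, train, and save.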

Read on to learn how he handles the issue. 

_____

What were Metis Sr. Data Scientists up to last month? See here

