
Sr. Data Scientist Roundup: Scoping & Designing Projects, Intro to PyTorch with NLP

By Emily Wilson • May 07, 2019

When our Sr. Data Scientists aren't teaching the intensive, 12-week bootcamps or corporate training courses, they're working on a variety of other projects. This monthly blog series tracks and discusses some of their recent activities and accomplishments.

Cliff Clive, Metis Sr. Data Scientist (Bootcamp)

By launching his new website, Data Science MVP, Metis Sr. Data Scientist Cliff Clive is on a mission to help promote better engineering practices among new data scientists. In his first post, he explains that the site's title stands for Minimum Viable Product, which is "a first draft of a data science project in which we've put together enough of our workflow to read in some data, put it into a workable format for our tools to handle, train a basic model, and calculate some preliminary results. The results can be absolute garbage, and that's okay. An MVP is an engineering effort, meant to provide us with a pipeline to quickly develop new iterations of our work, and to produce a baseline model that we can use to benchmark our more serious findings."
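To make the MVP idea concrete, here is a minimal sketch of such a pipeline in Python. It is not taken from Clive's site: the toy data, the function names, and the choice of a "predict the mean" baseline are all assumptions made for illustration. It covers the four steps in the quote: read in data, put it into a workable format, train a basic model, and calculate a preliminary result.

```python
import csv
import io
from statistics import mean

# Hypothetical toy data standing in for a real source file.
RAW_CSV = """feature,target
1.0,2.1
2.0,3.9
3.0,6.2
4.0,8.1
"""


def obtain(text):
    """Read raw rows from CSV text (all values arrive as strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def scrub(rows):
    """Convert string fields to floats: the 'workable format' step."""
    return [(float(r["feature"]), float(r["target"])) for r in rows]


def train_baseline(pairs):
    """Deliberately crude model: always predict the mean target."""
    avg = mean(y for _, y in pairs)
    return lambda x: avg


def evaluate(model, pairs):
    """Mean absolute error, used here as the preliminary result."""
    return mean(abs(model(x) - y) for x, y in pairs)


def run_pipeline(text):
    """Run every stage end to end and return the baseline score."""
    pairs = scrub(obtain(text))
    model = train_baseline(pairs)
    return evaluate(model, pairs)


baseline_mae = run_pipeline(RAW_CSV)
print(f"baseline MAE: {baseline_mae:.3f}")
```

The score itself matters less than having every stage wired together: a later iteration can swap in a better model inside `train_baseline` and compare its error against this baseline.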

In his most recent post, OSEMN is Awesome, but AOSEMN is Awesomer, he covers the importance of spending time carefully designing a data science project before diving into the data. OSEMN stands for Obtain, Scrub, Explore, Model, Interpret, and it's a widely adopted framework for data science. 

"Workflows are effective because they provide direction that keeps us moving through each stage of a project," writes Clive. "When we adopt them, we minimize the time spent wondering about the next thing we should do." Check out the post in full to learn why design is crucial to any effective workflow, how to write a good abstract, and how to build and iterate.

(You can also find a project template associated with his blog on GitHub.)

Adam Wearne, Metis Sr. Data Scientist (Bootcamp)

In his first post on Medium, Metis Sr. Data Scientist Adam Wearne provides a comprehensive Intro to PyTorch with NLP.

"When it comes to options for deep learning within the Python ecosystem, there are TONS of choices," he writes. "Keras is a great choice for starting out and for quickly developing and iterating on models, pure Tensorflow is amazingly fast, and with the recent advent of Tensorflow 2.0, will only become more awesome. However, over the past few years, there has been a huge surge in popularity for Pytorch...I’d like to introduce some of the main Pytorch concepts, and apply them to a common task in natural language processing: Named Entity Recognition (NER)."

Want more? There's plenty of it. Read the full post here.

Damien Martin, Metis Sr. Data Scientist (Corporate Training)

In a previous blog post, Metis Sr. Data Scientist Damien Martin discussed the benefits of business leaders upskilling their employees to investigate trends within data, which helps surface high-impact projects. When everyone on your team is thinking about business problems at a strategic level, each person can add value based on insight from their specific job function. In turn, having a data-literate and empowered workforce allows your data science team to work on projects rather than ad hoc analyses.

In this follow-up post, Martin breaks down the process of Scoping a Data Scientist Project, which is what happens after someone on your team has identified an opportunity (or a problem) where data science can likely help.

Read Damien's post here.


What were our Sr. Data Scientists up to last month? Find out here.
