
Sr. Data Scientist Roundup: Linear Regression 101, AlphaGo Zero Analysis, Project Pipelines, & Feature Scaling

By Emily Wilson • January 11, 2018

When our Sr. Data Scientists aren't teaching the intensive, 12-week bootcamps, they're working on a variety of other projects. This monthly blog series tracks and discusses some of their recent activities and accomplishments. 

In our November edition of the Roundup, we shared Sr. Data Scientist Roberto Reif's excellent blog post on The Importance of Feature Scaling in Modeling. We're excited to now share his follow-up post, The Importance of Feature Scaling in Modeling Part 2.

"In the previous post, we demonstrated that by normalizing the features used in a model (such as Linear Regression), we can more accurately obtain the optimum coefficients that allow the model to best fit the data," he writes. "In this post, we will go deeper to analyze how a method commonly used to extract the optimum coefficients, known as Gradient Descent (GD), is affected by the normalization of the features."

Reif's writing is incredibly detailed as he eases the reader through the process, step by step. We highly recommend you take the time to read it through and learn a thing or two from a gifted instructor. 
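
To give a rough feel for the effect Reif describes, here is a minimal sketch that is not taken from his post. It fits a two-feature linear regression with plain batch gradient descent, once on raw features and once on standardized ones. The synthetic data, the learning rates, and the helper names (gradient_descent, standardize) are all assumptions chosen for illustration.

```python
from itertools import product
from statistics import pstdev


def gradient_descent(rows, targets, learning_rate, steps):
    """Fit y ~ w0 + w1*x1 + w2*x2 by batch gradient descent.

    Returns the fitted weights and the final mean squared error.
    """
    n = len(rows)
    weights = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), y in zip(rows, targets):
            error = weights[0] + weights[1] * x1 + weights[2] * x2 - y
            grad[0] += error / n
            grad[1] += error * x1 / n
            grad[2] += error * x2 / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad)]
    mse = sum(
        (weights[0] + weights[1] * x1 + weights[2] * x2 - y) ** 2
        for (x1, x2), y in zip(rows, targets)
    ) / n
    return weights, mse


def standardize(column):
    """Rescale a column to zero mean and unit standard deviation."""
    mu = sum(column) / len(column)
    sigma = pstdev(column)
    return [(value - mu) / sigma for value in column]


# Hypothetical, noise-free data: x1 spans 0..1 while x2 spans 0..1000.
x1_values = [0.0, 0.25, 0.5, 0.75, 1.0]
x2_values = [0.0, 250.0, 500.0, 750.0, 1000.0]
rows = list(product(x1_values, x2_values))
targets = [3.0 + 2.0 * x1 + 0.5 * x2 for x1, x2 in rows]

# Raw features: the large scale of x2 forces a tiny learning rate
# (a rate like 0.1 would make the updates blow up on this data).
_, mse_raw = gradient_descent(rows, targets, learning_rate=1e-6, steps=500)

# Standardized features: a much larger learning rate is stable.
scaled_rows = list(zip(
    standardize([x1 for x1, _ in rows]),
    standardize([x2 for _, x2 in rows]),
))
_, mse_scaled = gradient_descent(scaled_rows, targets, learning_rate=0.1, steps=500)

print(f"MSE after 500 steps, raw features:          {mse_raw:.6g}")
print(f"MSE after 500 steps, standardized features: {mse_scaled:.6g}")
```

With the same number of steps, the standardized run should end with a far smaller error on this toy data. On raw features, the step size has to stay small enough for the large-scale feature, which leaves the intercept and the small-scale coefficient barely moving.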

Another of our Sr. Data Scientists, Vinny Senguttuvan, wrote an article that was featured in Analytics Week. In the piece, titled The Data Science Pipeline, he writes about the importance of understanding a typical pipeline from start to finish, which gives you the ability to take on a wide range of responsibilities or, at the very least, to understand the entire process. As an example of a project that spans both the breadth and depth of data science, he uses the work of Senthil Gandhi, Data Scientist at Autodesk, who created the machine learning system Design Graph.

In the post, Senguttuvan writes, "Senthil Gandhi joined Autodesk as Data Scientist in 2012. The big idea floating in the corridors was this. Tens of thousands of designers use Autodesk 3D to design products ranging from gadgets to cars to bridges. Today anyone using a text editor takes for granted tools like auto-complete and auto-correct. Features that help the users create their documents faster and with less errors. Wouldn’t it be fantastic to have such a tool for Autodesk 3D? Increasing the efficiency and effectiveness of the product to that level would be a true game-changer, putting Autodesk, already the industry leader, miles ahead of the competition."

Read more to find out how Gandhi pulled it off (and for more on his work and his approach to data science, read an interview we conducted with him last month). 

Data Science Weekly recently featured a blog post from Sr. Data Scientist Seth Weidman. In the post, titled The 3 Tricks That Made AlphaGo Zero Work, Weidman writes about DeepMind's AlphaGo Zero, a program that he calls a "shocking breakthrough" in Deep Learning and AI within the past year.

"...not only did it beat the prior version of AlphaGo — the program that beat 17-time world champion Lee Sedol just a year and a half earlier — 100–0, it was trained without any data from real human games," he writes. "Xavier Amatriain called it 'more [significant] than anything…in the last 5 years' in Machine Learning."

So, he asks, how did DeepMind do it? His post provides that answer, as he gives an idea of the techniques AlphaGo Zero used, what made them work, and what the implications for future AI research are. 

Sr. Data Scientist David Ziganto created Linear Regression 101, a three-part blog series starting with The Basics, proceeding to The Metrics, and rounding out with Assumptions & Evaluation.

Ziganto describes linear regression as "simple yet surprisingly powerful." In these three instructional posts, he aims to "give you a deep enough fluency to effectively build models, to know when things go wrong, to know what those things are, and what to do about them."

We think he does just that. See for yourself! 

