When our Sr. Data Scientists aren't in the classroom teaching the intensive, 12-week bootcamps, you can find them leading corporate training, developing curriculum, giving talks at conferences and Meetups, writing blog posts about topics of interest, or working on data-focused passion projects, and the list goes on. This new monthly blog series will track and discuss some of their recent activities and accomplishments.
Let's start with Sr. Data Scientist Alice Zhao, a member of our growing Seattle team, who was recently selected as Geekwire's Geek of the Week. This ongoing interview series profiles "the characters of Pacific Northwest tech, science, games, innovation, and more." Zhao's impressive qualifications include having been the first-ever data scientist at Cars.com, co-founding a data science education startup, earning two degrees (B.S. in analytics and M.S. in electrical engineering) from Northwestern, and having a personal blog post on How Text Messages Change from Dating to Marriage go viral.
Below are a couple of snippets from her interview with Geekwire, but we highly recommend that you read it in full here.
“I am a data scientist. I love my job because I get to use data to tell fun and compelling stories. There is more data available now than ever before, and in this field, I get to use my creativity to figure how to play with and mold a massive amount of data into something that’s never been created before. It makes me feel like an artist, in a geeky way.”
“My role models are working moms. I didn’t realize how much my own mom put into both raising me and having a full-time job until I became one myself. I am constantly amazed by people like Sheryl Sandberg, Hillary Clinton, and my manager Debbie Berebichez, who are able to balance rocking their day jobs and pouring their heart and soul into raising their kids as well.”
Two of our Sr. Data Scientists gave talks and workshops at the recent Open Data Science Conference in San Francisco. Seth Weidman (pictured at left), who's on our Chicago team, traveled to the Bay Area to deliver his talk on Deep Learning from Scratch Using Python. As he put it, "Many of us have used libraries like Keras and TensorFlow to train Deep Learning models. But very few of us fully understand what is going on 'under the hood.'"
In his talk, he walked attendees through how to create Deep Neural Networks powerful enough to solve complex image classification tasks from scratch, using Python. He covered everything from coding the layers of the network using classes, to implementing the backpropagation algorithm so the layers work correctly together, to implementing a number of neural net training and optimization techniques such as Dropout, Momentum, and Weight Regularization. In the end, attendees wound up with a Jupyter notebook running a flexible, live deep learning framework that could then be extended to create arbitrarily deep networks, all from scratch.
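To give a flavor of what "from scratch" can look like, here is a minimal sketch using only NumPy. It is not Seth's code: the class names (`Dense`, `Tanh`, `Network`), the toy regression data, and the hyperparameters are all our own assumptions for illustration. It shows layers written as classes, backpropagation chaining them together, and a gradient step with momentum and L2 weight regularization. Dropout is left out to keep it short.

```python
import numpy as np


class Dense:
    """Fully connected layer computing x @ W + b (hypothetical name)."""

    def __init__(self, n_in, n_out, rng):
        self.W = rng.normal(0.0, 0.1, size=(n_in, n_out))
        self.b = np.zeros(n_out)
        # Velocity buffers used by the momentum update
        self.vW = np.zeros_like(self.W)
        self.vb = np.zeros_like(self.b)

    def forward(self, x):
        self.x = x  # cache the input for the backward pass
        return x @ self.W + self.b

    def backward(self, grad_out):
        # Gradients of the loss with respect to this layer's parameters
        self.dW = self.x.T @ grad_out
        self.db = grad_out.sum(axis=0)
        # Gradient with respect to the input, passed to the previous layer
        return grad_out @ self.W.T

    def step(self, lr, momentum, weight_decay):
        # L2 weight regularization adds weight_decay * W to the weight gradient
        self.vW = momentum * self.vW - lr * (self.dW + weight_decay * self.W)
        self.vb = momentum * self.vb - lr * self.db
        self.W += self.vW
        self.b += self.vb


class Tanh:
    """Elementwise tanh activation; it has no parameters to update."""

    def forward(self, x):
        self.out = np.tanh(x)
        return self.out

    def backward(self, grad_out):
        return grad_out * (1.0 - self.out ** 2)

    def step(self, lr, momentum, weight_decay):
        pass


class Network:
    """Chains layers: forward in order, backward in reverse order."""

    def __init__(self, layers):
        self.layers = layers

    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x

    def train_step(self, x, y, lr=0.05, momentum=0.9, weight_decay=1e-4):
        pred = self.forward(x)
        loss = np.mean((pred - y) ** 2)
        grad = 2.0 * (pred - y) / y.size  # derivative of the mean squared error
        for layer in reversed(self.layers):
            grad = layer.backward(grad)
        for layer in self.layers:
            layer.step(lr, momentum, weight_decay)
        return loss


# Toy regression problem, made up for this sketch
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(64, 2))
y = x[:, :1] - 0.5 * x[:, 1:]

net = Network([Dense(2, 8, rng), Tanh(), Dense(8, 1, rng)])
losses = [net.train_step(x, y) for _ in range(500)]
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Because every layer exposes the same `forward`, `backward`, and `step` methods, making the network deeper is just a matter of adding more entries to the list passed to `Network`.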
Andrew Blevins (pictured at right), a member of our San Francisco team, gave a workshop called Beyond Word2Vec: Recent Developments in Document Embedding. During the workshop, he addressed how easy it is to be dazzled by the power of Word2Vec, but noted that in real business cases, you rarely need to understand single words. As he asked, "How do we apply the power of Word2Vec to phrases, sentences, paragraphs, or entire documents?" He took attendees through various techniques for generating useful representations of documents of indeterminate length and looked at ways of comparing those methods. See Andrew's presentation slides here.
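For readers new to the problem, one simple baseline for turning word vectors into a document vector is to average them. The sketch below is our own illustration and not necessarily one of the techniques covered in the workshop. The tiny 3-dimensional `word_vectors` table is hand-made and hypothetical; real Word2Vec vectors would come from a trained model and have far more dimensions.

```python
import numpy as np

# Hypothetical 3-dimensional word vectors, hand-picked for illustration only
word_vectors = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.0]),
    "pet": np.array([0.85, 0.15, 0.1]),
    "stock": np.array([0.0, 0.1, 0.9]),
    "market": np.array([0.1, 0.0, 0.95]),
}


def embed_document(text, vectors, dim=3):
    """Average the vectors of known words; documents of any length map to one vector."""
    tokens = text.lower().split()
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return np.zeros(dim)  # no known words: fall back to the zero vector
    return np.mean(known, axis=0)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, or 0.0 if either is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


doc_a = embed_document("the cat and the dog", word_vectors)
doc_b = embed_document("a pet dog", word_vectors)
doc_c = embed_document("the stock market", word_vectors)

print("pets vs. pets:   ", cosine_similarity(doc_a, doc_b))
print("pets vs. finance:", cosine_similarity(doc_a, doc_c))
```

Averaging ignores word order and treats every word as equally important, which is one reason more recent document embedding methods are worth exploring.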
NOTE: Recordings of both talks will be added to this post once published by ODSC. Coming soon!
Sr. Data Scientist Roberto Reif, based in Seattle, recently published a new personal blog post called The Importance of Feature Scaling in Modeling. He starts it off by writing, "In data science, one of the challenges we try to address consists of fitting models to data." Throughout this instructional post, he demonstrates the challenges that arise when features have different ranges and goes over how to address them by applying a normalization (scaling) technique.
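As a quick illustration of the idea (our own sketch, not code from Roberto's post), here are two common scaling approaches written with NumPy: min-max scaling, which maps each feature to the range 0 to 1, and standardization, which gives each feature zero mean and unit standard deviation. The sample data and function names are made up for this example.

```python
import numpy as np

# Hypothetical data: column 0 is age in years, column 1 is income in dollars
X = np.array(
    [[25, 40_000], [32, 85_000], [47, 120_000], [51, 62_000]],
    dtype=float,
)


def min_max_scale(X):
    """Rescale each column to the range [0, 1]."""
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero for constant columns
    return (X - lo) / span


def standardize(X):
    """Rescale each column to zero mean and unit standard deviation."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std = np.where(std > 0, std, 1.0)  # avoid dividing by zero for constant columns
    return (X - mean) / std


X_minmax = min_max_scale(X)
X_standard = standardize(X)

print("original ranges:", X.max(axis=0) - X.min(axis=0))
print("min-max ranges: ", X_minmax.max(axis=0) - X_minmax.min(axis=0))
```

Before scaling, the income column spans tens of thousands of units while age spans only a few dozen, so income would dominate any distance-based or gradient-based model. After scaling, both features sit on comparable ranges.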
Learn more about our awesome team of Sr. Data Scientists.