Paul Trowbridge is the instructor of our upcoming Live Online Statistical Foundations for Data Science & Machine Learning part-time professional development course, which will run from January 22nd - February 28th on Monday and Wednesday evenings from 6:30 - 9:30 pm EST. View full course details and enroll here.
Paul Trowbridge believes in the importance of knowing how things work. What makes them go? What's going on underneath the exterior? And why should you care to find out?
After receiving advanced training in statistics, demography, and sociology from the University of Washington and Rutgers University, Trowbridge has worked in applied fields such as fMRI, epidemiology and public health, international relations, urban planning, and micro-simulation modeling. He's taught statistics, data science, and data visualization at New York University's School of Professional Studies and is all set to teach our upcoming Statistical Foundations course.
In a recent Q&A, he discussed his career, why he thinks it's so important to have a statistical foundation when approaching data science work, and much more.
The intent of your upcoming Statistical Foundations course is to expose students to common statistical issues and teach them how to avoid statistical fallacies. What about the course makes you most excited to teach it?
Introducing students to the first principles underlying common data science methods. By learning the underlying material on probability, estimation, and hypothesis testing, students gain a deeper knowledge of the principles involved in their work as data scientists and can consequently bring deeper insight to applied data science problems. I am always excited to see students understand and become confident producing their own solutions from first principles, as opposed to simply quoting output they may not fully understand.
Your course will be taught Live Online. What benefits do you think there are to that format?
The live online format allows students to take the course from remote and distributed locations, which is a big convenience. Because the format is live and synchronous, it allows for dynamic interaction with the instructor, immediate feedback, and student questions and answers. The live format also facilitates student collaboration and peer-to-peer engagement.
By the end of the course, students will understand many of the principles underlying machine learning and data science. How important do you think it is for data scientists to understand these inner workings?
It is very important, for two reasons. First, modern computational facilities release researchers and practitioners from having to rely on methods that impose undue restrictions, or assumptions that are too strong or inappropriate for the problem at hand. Second, when investigating a novel problem, there may not be existing solutions, so practitioners will need to develop their own from first principles. In both cases, implementing custom-tailored statistical methodology requires knowledge of the underlying principles, and the course teaches those principles. Moreover, even when practitioners aren't developing novel methodology, a deeper understanding of the inner workings of the methods they apply allows for deeper insight into data science problems and richer conclusions. So it matters both for gaining insight and for being able to custom-tailor methodologies to unique problems.
You've personally worked in fields spanning fMRI, epidemiology and public health, international relations, urban planning, and micro-simulation modeling. If you are at liberty to share, what are some projects you've worked on recently that you're particularly proud of?
When working on the fMRI project, I introduced random effects models to capture subject-specific dependencies in a repeated-measures experimental context. I also introduced multivariate visualization techniques, such as multidimensional scaling and principal components, to visualize relationships in high-dimensional data sets.
Whose work inspires your own?
The faculty at the University of Washington, and particularly the faculty involved with the Center for Statistics and the Social Sciences, definitely established my basic approach to applied data analysis projects and continue to influence my approach to statistical analysis broadly. Additionally, many projects in contemporary data visualization and information design engage problems in data science. Fernanda Viegas, Ben Fry, Jen Lowe, the Onformative studio, and Periscopic all produce thought-provoking visual data engagements that have strong ties to contemporary data science problems.
How do you stay up-to-date in a quickly evolving field?
Read. Definitely read as much as I can. Attend lots of presentations and see what people are currently working on. Just by staying active and connected in the field, one is constantly introduced to new and exciting work. Service opportunities in the field are also excellent ways to see, engage with, and keep up to date on what is happening.
Enroll here to learn from Paul. Want to try out the Live Online format before enrolling?