Demystifying Data Science: Exploring the Intersection between Medicine and Data Science
June 08, 2017
This post was written by Sameh Saleh, Metis Data Science Bootcamp graduate and current Resident Physician in Internal Medicine at UT Southwestern Medical Center.
As I sit in front of my computer screen, it shows a duality of my past experiences and my future in medicine. On one side, I am typing this article and on the other, I am tweaking a custom-built random forest algorithm that personalizes alarm thresholds in the intensive care unit (ICU). The most impactful future strides in patient advocacy and informed clinical care will hinge on our ability to utilize personalized patient information. Through sharing my story, I hope to inspire other clinicians-in-training to seriously consider the value of a data science education in their lifelong medical endeavors.
Entering medical school, I had some programming and quantitative knowledge from my undergraduate education. I had also worked and published with great mentors and teams on research projects that used data science and machine learning in basic and clinical science. In medical school, I kept noticing untapped opportunities to put clinical data science at the center of medicine, both to help physicians make safer, more informed clinical decisions and to allow patients to take control of their health. But I soon realized that despite my experiences, I lacked a firm foundation in the core skills of data science. I wanted to be able to approach any data science problem and be fully confident in my skills, in my creativity in approaching the problem, and in my ability to communicate the results.
I had been taking classes and exams for years, but I knew the true value of learning came from working on real-life projects and data and from understanding the problems that naturally arise with them. I considered interning at cutting-edge startups like Enlitic (uses machine learning to automatically interpret medical imaging), Omada Health (uses digital therapeutics to help prevent and manage diabetes), and Sano (uses biometric sensors to capture and transmit blood chemistry data continuously, specifically glucose), but I was concerned that I would neither be able to build a broad, well-rounded data science foundation nor successfully align the demands of a medical school schedule with those of an ever-changing startup.
Only after scouring the internet did I discover data science bootcamps. I was skeptical at first as I was determined to avoid the rigidity of the classroom and to have something to show for my efforts. I then came across Metis, an accredited 12-week data science training program that requires completion of multiple real-world projects and grounds those projects in the teaching of theory and understanding of data science concepts. It also provides consistent career advice and resources to bolster connections and future work potential. Metis appeared to provide the perfect opportunity to blend education, real projects, research, and networking.
After going through Metis's competitive application process and getting accepted, my next challenge was getting Metis approved for medical school credit. At the University of Virginia School of Medicine, fourth-year students are allotted up to 12 weeks of research, which requires a UVA physician supervisor to sign off on a detailed research plan. Two research mentors at UVA graciously agreed to serve as my supervisors. Initially, I had only 4 weeks approved when I flew from Charlottesville to San Francisco. But after six proposals for three different projects, I was finally awarded 12 weeks of credit and could fully capitalize on the experience. Looking back, Metis would likely have been perfect during the summer between my first and second years if I had come across the opportunity sooner.
At Metis, I gained a strong foundation in the theory and quantitative groundwork from experienced instructors, one of whom had worked extensively in clinical data science. I also greatly expanded my network of healthcare data scientists and created a robust LinkedIn presence. Most importantly, I completed five data science projects, some individually and some in collaboration with colleagues from vastly different career backgrounds.
For one project, I applied advanced machine learning classification techniques to predict mortality in the ICU from a time series database of 40,000 patients and visualized the model performance with d3.js (a JavaScript data visualization library). The model was comparable to or outperformed industry standards (like the SAPS II score) without using the patient's previous health information.
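To make the idea of comparing a model against a standard score concrete, here is a minimal sketch in Python. It assumes the comparison is made with the area under the ROC curve (AUROC), which the post does not specify. The function name `auroc` and the toy numbers are hypothetical and are not drawn from the ICU database or the actual project.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a randomly chosen positive case (label 1) receives a
    higher score than a randomly chosen negative case (label 0).
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data for illustration only (1 = died, 0 = survived).
died = [0, 0, 1, 0, 1, 1]
model_risk = [0.10, 0.30, 0.80, 0.20, 0.60, 0.90]      # hypothetical model output
baseline_score = [0.20, 0.60, 0.70, 0.10, 0.50, 0.90]  # hypothetical rescaled severity score

print("model AUROC:", auroc(died, model_risk))
print("baseline AUROC:", auroc(died, baseline_score))
```

A higher AUROC means the score ranks patients who died above those who survived more often. Real comparisons would also need held-out data and confidence intervals.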
I then approached the same data set and problem from a different angle using customized natural language processing (NLP) tokenization and topic modeling to process notes of patients and build a logistic regression model that predicts mortality.
For my final project, I built a partnership with another major hospital system at UCSF (facilitated through my own research advisor at UVA) that would likely not have materialized otherwise. I developed the aforementioned custom-built random forest algorithm to personalize heart rate alarm thresholds and thereby reduce alarm fatigue in the intensive care unit and improve patient safety.
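The general idea of personalizing alarm thresholds can be illustrated with a much simpler stand-in than the random forest described above. The sketch below derives per-patient limits from percentiles of that patient's own heart rate history. It is not the project's method. The function names, the percentile cutoffs, and the `floor` and `ceiling` bounds are all hypothetical placeholders, not clinical guidance.

```python
from typing import List, Sequence, Tuple


def percentile(sorted_values: List[float], q: float) -> float:
    """Linearly interpolated percentile of pre-sorted values, q in [0, 100]."""
    if not sorted_values:
        raise ValueError("no values")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def personalized_thresholds(
    heart_rates: Sequence[float],
    low_q: float = 1.0,      # hypothetical lower percentile cutoff
    high_q: float = 99.0,    # hypothetical upper percentile cutoff
    floor: float = 40.0,     # placeholder bound: low alarm never set below this
    ceiling: float = 150.0,  # placeholder bound: high alarm never set above this
) -> Tuple[float, float]:
    """Return (low, high) alarm limits based on one patient's own history."""
    ordered = sorted(heart_rates)
    low = max(floor, percentile(ordered, low_q))
    high = min(ceiling, percentile(ordered, high_q))
    return low, high


def count_alarms(readings: Sequence[float], low: float, high: float) -> int:
    """Count readings that fall outside the (low, high) limits."""
    return sum(1 for hr in readings if hr < low or hr > high)


# Made-up history: one reading at each integer rate from 60 to 100 bpm.
history = [float(hr) for hr in range(60, 101)]
low, high = personalized_thresholds(history)
print("limits:", low, high)
print("alarms:", count_alarms([55.0, 70.0, 105.0], low, high))
```

Limits that follow a patient's own baseline can avoid alarms for readings that are normal for that patient, which is the intuition behind reducing alarm fatigue.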
Terms like "big data" and "precision medicine" have been circulating in the healthcare sphere for the past decade. Underlying the buzz is a paradigm shift towards data analytics that, even in its infancy, is transforming medicine as it has revolutionized marketing, finance, and politics. Large amounts of patient data make it possible to improve diagnostic accuracy and efficiency and to tailor evaluation and treatment to individual patients rather than rely on the incomplete “one size fits all” model. Wearables allow for more holistic, longitudinal monitoring of diseases, whether acute or chronic, and can facilitate prevention of disease. Algorithm development can reduce hospital readmissions, preempt decompensation in the hospital, and cut healthcare costs.
We are here as health professionals to deliver high-quality, safe, satisfying care at the lowest possible cost. How then can we achieve that mission without this new paradigm shift into the data era? And how can this paradigm shift happen with clinicians on the outside looking in?
Clinicians will soon have to become intimately familiar with the data, analytics tools, and technology platforms in order to shape the tools they are going to use in everyday practice. Treating the clinician as only the end user of these tools comes with many pitfalls: confusion and inundation with electronic health records (EHRs), notification fatigue, and a lack of awareness of the breadth and application of technology tools, to name a few. Throughout development, clinicians are needed to ask the right questions, understand and relay clinical workflows, and provide insight into the application. Hence, clinicians need at least a basic understanding of statistics, probability, programming, and data analysis tools in order to collaborate and communicate with others in data science.
The experience I gained at Metis has already been invaluable in my medical career. I have continued to work on my final bootcamp project with UCSF. I noted my experience on my Internal Medicine residency applications, and the bootcamp came up as a talking point and a strength in the majority of my interviews. I see my future in medicine involving inextricably connected roles as both a practicing Internal Medicine physician and a clinical data scientist, advocating for my patients at both the individual and systemic levels. That mission shone through in my portfolio.
Starting next month, I will have the privilege of working as an Internal Medicine resident at UT Southwestern in Dallas, and I believe that my experience in clinical data science at Metis was a strong catalyst. The path to a healthcare data science foundation might not yet be paved for physicians from different educational backgrounds, but I hope that my experience provides insight into one robust and effective path.