Michelle Gill is a Senior Data Scientist at Metis in New York City, where she's currently co-instructing the bootcamp.
Take-home coding exercises are a common element of the data science interview process, particularly for entry-level positions. Typically, these exercises are sent to a candidate early in the interview process and involve several hours of work. The candidate is generally expected to complete them within a week.
Many companies feel these exercises help them evaluate a candidate’s mastery of a preferred computational toolkit. Unfortunately, for many candidates seeking to land their first data science position, these exercises can be a source of frustration and stress, particularly if this stage of the interview process is a common sticking point.
One of our goals at Metis is to train individuals for career transitions into data science through completion of our 12-week data science bootcamp, which includes preparation for all stages of the job search process. Based on careful analysis of our students' job search outcomes and discussions with industry partners, we have an understanding of what goes into a successful take-home exercise. This knowledge has been distilled into the tips below, which can help ensure this part of the job hunt is successful and as stress-free as possible.
Read and Plan
The first step is to read the directions – not once, but multiple times. This may seem like an obvious piece of advice, but it can be easy for the busy job seeker to misread or misunderstand a given question.
Assuming the exercise doesn’t have a time limit that starts when accessed, the directions should be read the first time when the exercise is received. This helps with estimating the required amount of work and allows time to brainstorm possible approaches. We recommend candidates then read the directions a second time before beginning the exercise and a third time before submitting. Each pass is a chance to catch a requirement that was missed or misread earlier.
It is also important to start the exercise early and plan multiple work sessions. Do not assume the exercise can be completed in a single session the day before it’s due. The pressures of time and exhaustion can (and do) cause careless errors and oversights.
Finally, do not underestimate the demands of juggling multiple interviews, each of which may have multiple steps. Developing and following a prioritization scheme for submitting applications can help later with planning time to complete coding exercises.
Choose Your Tools
Unless the directions specify one, candidates must choose an appropriate toolkit and/or programming language. Time and skill permitting, it is good practice to choose a tool or language that is used by the employer’s team. The tools and techniques mentioned in the job posting are probably the best source of such information. Some data science teams maintain a blog on the company’s website or have public repos on GitHub, which can be useful. Finally, recent conference talks given by members of the data science team, as well as their personal GitHub repos, can provide hints.
Making an early decision on the toolkit can help with planning work sessions. If the tools being used are less familiar, then additional time should be allotted to complete the take-home exercise.
Keep It Simple
Another common mistake is attempting to use unnecessarily complex algorithms. Start with a simple but appropriate technique for the problem and then work towards more sophisticated methods. For example, if a question involves binary classification, it is good practice to evaluate how logistic regression performs before moving on to methods like XGBoost.
Keeping the analysis basic (at least at the beginning) shows the candidate can think carefully and logically about a problem rather than immediately reaching for the algorithm or method du jour. For some employers, simpler methods are actually more desirable than complex ones, due to their interpretability and ease of use.
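As a rough illustration of the "simple first" workflow, the sketch below scores a majority-class baseline and then a logistic regression on the same held-out data, so that any later, more sophisticated model has concrete numbers to beat. Everything in it is an assumption made for illustration: the toy dataset, the function names, and the hand-written gradient descent, which is used only to keep the example dependency-free. In a real exercise, a library implementation of logistic regression would be the natural choice.

```python
import math
import random


def make_toy_data(n=200, seed=0):
    """Generate a made-up binary classification dataset with two features."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # The label depends on a noisy linear combination of the features.
        label = 1 if x1 + 0.5 * x2 + rng.gauss(0, 0.5) > 0 else 0
        rows.append([x1, x2])
        labels.append(label)
    return rows, labels


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic_regression(X, y, lr=0.1, epochs=200):
    """Fit weights and a bias term with plain batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, X):
    """Predict class 1 when the linear score is positive, else class 0."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0 for xi in X]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


X, y = make_toy_data()
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# Step 1: the simplest possible reference point, always predicting
# the most common class seen in the training data.
majority = 1 if sum(y_train) * 2 >= len(y_train) else 0
baseline_acc = accuracy(y_test, [majority] * len(y_test))

# Step 2: a simple, interpretable model evaluated on the same test set.
w, b = fit_logistic_regression(X_train, y_train)
logreg_acc = accuracy(y_test, predict(w, b, X_test))

print(f"majority-class baseline accuracy: {baseline_acc:.2f}")
print(f"logistic regression accuracy:     {logreg_acc:.2f}")
```

Recording both numbers in the write-up makes it easy to state how much, if anything, a more complex method adds over the simple one.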
Organize and Narrate
Carefully organize code and annotate it so that a colleague could understand it without much effort. Functions should be documented using a language-appropriate style and ample comments should be provided throughout the code.
If a tool like Jupyter Notebook is used, make full use of the markdown formatting features. Headings should make it easy to identify key information and answers to exercise questions. Narrative text should explain not only what is happening, but also what was attempted previously and how the analysis could be further expanded. Finally, demonstrate mastery of the methods used by describing their strengths and weaknesses.
Submissions requiring plain-text code files have more limited formatting options for narrative text. Nevertheless, comment blocks and plain-text headings can be used to fulfill a role similar to markdown.
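One possible layout for a plain-text Python submission is sketched below. It is only an illustration: the question, the data, and the `summarize` function are made up, and the docstring follows the Google style, which is one of several common Python conventions. The point is the structure, with a module docstring that states the approach, comment-block headings that stand in for markdown headings, a documented function, and a clearly marked answer.

```python
"""Take-home exercise, Question 1 (hypothetical): summarize order values.

Approach: compute basic descriptive statistics first. The median is
reported alongside the mean because it is less sensitive to outliers.
Possible extension: break the summary down by customer segment.
"""

import statistics

# ----------------------------------------------------------------------
# SECTION 1: HELPER FUNCTIONS
# ----------------------------------------------------------------------


def summarize(values):
    """Return the count, mean, and median of a list of numbers.

    Args:
        values: A non-empty list of ints or floats.

    Returns:
        A dict with the keys "count", "mean", and "median".

    Raises:
        ValueError: If values is empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# ----------------------------------------------------------------------
# SECTION 2: ANALYSIS
# ----------------------------------------------------------------------

# Made-up data standing in for the file supplied with an exercise.
order_values = [12.5, 8.0, 23.0, 8.0, 15.5]
summary = summarize(order_values)

# ----------------------------------------------------------------------
# SECTION 3: ANSWER TO QUESTION 1
# ----------------------------------------------------------------------

# The mean is higher than the median, which suggests that a few large
# orders pull the average upward.
print(f"orders: {summary['count']}")
print(f"mean order value: {summary['mean']:.2f}")
print(f"median order value: {summary['median']:.2f}")
```

A reviewer skimming this file can jump straight to the section headings and the answer comment, much as they would with headings in a notebook.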
Practice and Get Feedback
Successful employment as a data scientist requires mastery of a basic set of tools and concepts, as well as preparation for interviews. At our 12-week Data Science Bootcamp, Senior Data Scientist instructors, along with our award-winning Careers Team, provide end-to-end training and career support for individuals transitioning into data science and related roles. Preparation for different stages of the interview process is a key element of this training, and we provide practice take-home exercises, follow-up discussion groups, and code reviews to ensure success for our students.
Learn more about our 12-Week Data Science Bootcamp here.