E-Book Overview
This is a collection of 120 real data science interview questions, covering a wide range of questions you might face when interviewing for a data science position. It contains questions contributed by both data scientists and individuals interviewing for data science positions. This purchase also contains links to crowd-sourced answers to 25+ of the most common questions. We’ve compiled these interview questions so you know what to expect, what to study, and how best to prepare for your specific interview.
E-Book Content
120 DATA SCIENCE INTERVIEW QUESTIONS

COMPILED AND CREATED BY: CARL SHAN, MAX SONG, HENRY WANG, AND WILLIAM CHEN

INTRODUCTION

This guide is meant to bridge the gap between the knowledge of a recent graduate and the skillset required to become a data scientist. By reading this guide and learning how to answer these questions, recent graduates will equip themselves with the expected knowledge and skills of a data scientist.

To help readers with these goals, we’ve gathered 120 interview questions in product metrics, programming and databases, probability, experimentation and inference, data analysis, and predictive modeling. These questions are all either real data science interview questions or inspired by real data science interview questions, and should help readers develop the skills needed to succeed in a data science role.

The role of a data scientist is highly malleable and company dependent. However, the general skillset needed is similar. Candidates need:

• Technical skills - data analysis and programming
• Business/product intuition - metrics and identifying opportunities for impact
• Communication ability - clarity in explaining findings and insights

To prepare for your interview, you may want to brush up by reviewing some probability, data analysis, SQL, coding, and experimental design. The questions in this guide should help you do so.

The background of data science applicants varies wildly, so interviews may generally be more holistic and test your intuition and your analytical and communication abilities rather than focusing on specific technical concepts. Prepare to discuss your past work involving analyzing large and complicated datasets, defending your approaches and communicating what you learned during your project. Expect questions involving how to measure “goodness” of a feature on the company’s product, and be sure to approach these problems in a scientific and principled way.
You have a good chance of getting a product metrics or experimentation question based on actual questions the company is tackling at this time. Check the company’s engineering or data blog and see if anything is relevant. Be familiar with A/B testing and with the common metrics used by companies similar to the one you are interviewing with. Brush up on your Python (especially IPython notebook) and/or R abilities to prepare for a potential live data analysis problem.

And finally, of course, follow the general interview advice. Prepare to elaborate on related projects from your resume. Be enthusiastic. Share your thoughts with your interviewer as you’re going through a problem or doing a piece of analysis. And be sure to answer the question!

You have our best wishes!

Carl, Max, Henry, and William

Please feel free to reach out to us with questions, comments and suggestions at www.datasciencehandbook.me

CONTENTS

PREDICTIVE MODELING
PROGRAMMING
PROBABILITY
STATISTICAL INFERENCE
DATA ANALYSIS
PRODUCT METRICS
COMMUNICATION

DATA SCIENCE INTERVIEW QUESTIONS

PREDICTIVE MODELING

1. (Given a Dataset) Analyze this dataset and give me a model that can predict this response variable.
2. What could be