E-Book Overview
Data mining has become the fastest growing topic of interest in business programs in the past decade. The massive growth in data generation, often called big data, spans science (weather, ecology, biosciences, and virtually any other scientific field), social studies (politics, health, and many other fields), and business (real-time retail data from cash registers, supply chain data from vendor to retail, financial data covering banking, investment, and insurance, and less conventional areas such as human resource management). In response, many schools have created (or are creating) master's programs in business analytics. This book is intended to first describe the benefits of data mining in business, then describe the process and typical business applications, explain the workings of basic data mining tools, and demonstrate each with widely available free software. The book is designed for master's students, an audience that overlaps with business professionals because most new master's programs in business analytics are delivered online.
E-Book Content
Data Mining Models, Second Edition
David L. Olson

Copyright © Business Expert Press, LLC, 2018. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other) except for brief quotations, not to exceed 400 words, without the prior permission of the publisher.

First published in 2016 by Business Expert Press, LLC
222 East 46th Street, New York, NY 10017
www.businessexpertpress.com

ISBN-13: 978-1-94858-049-6 (paperback)
ISBN-13: 978-1-94858-050-2 (e-book)

Business Expert Press Big Data and Business Analytics Collection
Collection ISSN: 2333-6749 (print)
Collection ISSN: 2333-6757 (electronic)

Cover and interior design by Exeter Premedia Services Private Ltd., Chennai, India
Second edition: 2018
Printed in the United States of America.

Abstract
Data mining has become the fastest growing topic of interest in business programs in the past decade. This book is intended to first describe the benefits of data mining in business, describe the process and typical business applications, describe the workings of basic data mining models, and demonstrate each with widely available free software. This second edition updates Chapter 1, and adds more details on Rattle data mining tools. The book focuses on demonstrating common business data mining applications. It provides exposure to the data mining process, to include problem identification, data management, and available modeling tools. The book takes the approach of demonstrating typical business data sets with open source software. KNIME is a very easy-to-use tool, and is used as the primary means of demonstration. R is much more powerful and is a commercially viable data mining tool. We will demonstrate use of R through Rattle.
We also demonstrate WEKA, which is highly useful academic software, although its test sets and new cases are difficult to manipulate, which makes it problematic for commercial use. We will demonstrate methods with a small but typical business dataset. We use a larger (but still small) realistic business dataset for Chapter 9.

Keywords
big data, business analytics, clustering, data mining, decision trees, neural network models, regression models

Contents
Acknowledgments
Chapter 1 Data Mining in Business
Chapter 2 Business Data Mining Tools
Chapter 3 Data Mining Processes and Knowledge Discovery
Chapter 4 Overview of Data Mining Techniques
Chapter 5 Data Mining Software
Chapter 6 Regression Algorithms in Data Mining
Chapter 7 Neural Networks in Data Mining
Chapter 8 Decision Tree Algorithms
Chapter 9 Scalability
Notes
References
Index

Acknowledgments
I wish to recognize some of the many colleagues I have worke