E-Book Overview
This text demonstrates how to extract knowledge by finding meaningful connections among data spread throughout the Web. Readers learn methods and algorithms from the fields of information retrieval, machine learning, and data mining which, when combined, provide a solid framework for mining the Web. The authors walk readers through the algorithms with the aid of examples and exercises.
E-Book Content
DATA MINING THE WEB
Uncovering Patterns in Web Content, Structure, and Usage
ZDRAVKO MARKOV AND DANIEL T. LAROSE
Central Connecticut State University
New Britain, CT
WILEY-INTERSCIENCE
A JOHN WILEY & SONS, INC., PUBLICATION

Copyright © 2007 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, 201-748-6011, fax 201-748-6008, or online at http://www.wiley.com/go/permission.

Limit of Lia