E-Book Overview
This book in the Edinburgh Textbooks in Empirical Linguistics series is a comprehensive introduction to the statistics currently used in corpus linguistics. Statistical techniques and corpus applications - whether oriented towards linguistics or language engineering - often go hand in glove, and corpus linguists have used an increasingly wide variety of statistics, drawing on techniques developed in a great many fields. This is the first one-volume introduction to the subject.
E-Book Content
title: Statistics for Corpus Linguistics (Edinburgh Textbooks in Empirical Linguistics)
author: Oakes, Michael P.
publisher: Edinburgh University Press
isbn10 | asin: 0748610324
print isbn13: 9780748610327
ebook isbn13: 9780585137742
language: English
subject: Computational linguistics--Statistical methods.
publication date: 1998
lcc: P98.5.S83O18 1998eb
ddc: 410/.285

EDINBURGH TEXTBOOKS IN EMPIRICAL LINGUISTICS

CORPUS LINGUISTICS by Tony McEnery and Andrew Wilson
LANGUAGE AND COMPUTERS: A PRACTICAL INTRODUCTION TO THE COMPUTER ANALYSIS OF LANGUAGE by Geoff Barnbrook
STATISTICS FOR CORPUS LINGUISTICS by Michael Oakes
COMPUTER CORPUS LEXICOGRAPHY by Vincent B.Y. Ooi

EDITORIAL ADVISORY BOARD
Ed Finegan, University of Southern California, USA
Dieter Mindt, Freie Universität Berlin, Germany
Bengt Altenberg, Lund University, Sweden
Knut Hofland, Norwegian Computing Centre for the Humanities, Bergen, Norway
Jan Aarts, Katholieke Universiteit Nijmegen, The Netherlands
Pam Peters, Macquarie University, Australia

If you would like information on forthcoming titles in this series, please contact Edinburgh University Press, 22 George Square, Edinburgh EH8 9LF

EDINBURGH TEXTBOOKS IN EMPIRICAL LINGUISTICS
Series Editors: Tony McEnery and Andrew Wilson
Statistics for Corpus Linguistics
Michael P. Oakes
EDINBURGH UNIVERSITY PRESS
© Michael P Oakes, 1998
Edinburgh University Press
22 George Square, Edinburgh EH8 9LF
Typeset in 11/13pt Bembo by Koinonia, Manchester, and printed and bound in Great Britain by The University Press, Cambridge
A CIP record for this book is available from the British Library
ISBN 0 7486 1032 4 (cased)
ISBN 0 7486 0817 6 (paperback)
The right of Michael P Oakes to be identified as author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

Contents
Preface
Acknowledgements
Abbreviations
1 Basic statistics
  1 Introduction
  2 Describing data
    2.1 Measures of central tendency
    2.2 Probability theory and the normal distribution
    2.3 Measures of variability
    2.4 The z score
    2.5 Hypothesis testing
    2.6 Sampling
  3 Comparing groups
    3.1 Parametric versus non-parametric procedures
    3.2 Parametric comparison of two groups
      3.2.1 The t test for independent samples
      3.2.2 Use of the t test in corpus linguistics
      3.2.3 The matched pairs t test
      3.2.