Acquisition of ‘Deep’ Knowledge from Shallow Corpus Statistics

Dekang Lin
Google Inc.

Abstract. Many hard problems in natural language processing seem to require knowledge and inference about the real world. For example, consider the referent of the pronoun ‘his’ in the following sentences:

(1) John needed his friends
(2) John needed his support
(3) John offered his support

A human reader would intuitively know that ‘his’ in (1) and (3) is likely to refer to John, whereas it must refer to someone else in (2). Since the three sentences have exactly the same syntactic structure, the difference cannot be explained by syntax alone. The resolution of the pronoun reference in (2) seems to hinge on the fact that one never needs one’s own support (since one already has it). I will present a series of knowledge acquisition methods to show that seemingly deep linguistic or even world knowledge may be acquired with rather shallow corpus statistics. I will also discuss the evaluation of the acquired knowledge by making use of it in applications.

A. Farzindar and V. Keselj (Eds.): Canadian AI 2010, LNAI 6085, p. 1, 2010. © Springer-Verlag Berlin Heidelberg 2010

University and Industry Partnership in NLP, Is It Worth the “Trouble”?

Guy Lapalme
Laboratoire de Recherche Appliquée en Linguistique Informatique (RALI)
Laboratory for Applied Research in Computational Linguistics
Département d’informatique et de recherche opérationnelle
Computer Science and Operational Research Department
Université de Montréal
[email protected]
http://www.iro.umontreal.ca/~lapalme

Abstract. We will present the research and products developed by members of the RALI for more than 15 years in many areas of NLP: translation tools, spelling checkers, summarization, text generation, information extraction and information retrieval. We will focus on projects involving industrial partners and will point out what we feel to be the benefits and the constraints of these types of projects for both parties. We will not describe the contents of each project in detail, but we will report some global lessons that we learned from these experiences.

Corpus-Based Term Relatedness Graphs in Tag Recommendation

Evangelos Milios
Faculty of Computer Science
Dalhousie University, Halifax, Canada
[email protected]
http://www.cs.dal.ca/~eem

Abstract. A key problem in text mining is the extraction of relations between terms. Hand-crafted lexical resources such as WordNet have limitations when it comes to special text corpora. Distributional approaches to the problem of automatic construction of thesauri from large corpora have been proposed, making use of sophisticated Natural Language Processing techniques, which makes them