Fast Text Compression With Neural Networks


E-Book Content

Fast Text Compression with Neural Networks Matthew V. Mahoney Florida Institute of Technology 150 W. University Blvd. Melbourne FL 32901 [email protected] Abstract Neural networks have the potential to extend data compression algorithms beyond the character level n-gram models now in use, but have usually been avoided because they are too slow to be practical. We introduce a model that produces better compression than popular Limpel-Ziv compressors (zip, gzip, compress), and is competitive in time, space, and compression ratio with PPM and BurrowsWheeler algorithms, currently the best known. The compressor, a bit-level predictive arithmetic encoder using a 2 layer, 4 × 106 by 1 network, is fast (about 104 characters/second) because only 4-5 connections are simultaneously active and because it uses a variable learning rate optimized for one-pass training. Keywords: Neural networks, Text compression, Data compression, On-line training/learning, Maximum entropy, Prediction, Efficiency. 1. Introduction One of the motivations for using neural networks for data compression is that they excel in complex pattern recognition. Standard compression algorithms, such as LimpelZiv or PPM (Bell, Witten, and Cleary, 1989) or BurrowsWheeler (Burrows and Wheeler, 1994) are based on simple n-gram models: they exploit the nonuniform distribution of text sequences found in most data. For example, the character trigram the is more common than qzv in English text, so the former would be assigned a shorter code. However, there are other types of learnable redundancies that cannot be modeled using n-gram frequencies. For example, Rosenfeld (1996) combined word trigrams with semantic associations, such as “fire...heat”, where certain pairs of words are likely to occur near each other
You might also like

High Performance Data Mining
Authors: Guo , Grossman. (eds.)    168    0


Computer Science Handbook
Authors: Allen B. Tucker    172    0


Calculs Et Visualisation En Nombres Complexes
Authors: Testard L.    85    0


Introduction To Information Theory And Data Compression
Authors: D.C. Hankerson , Greg A. Harris , Peter D. Johnson Jr.    120    0


Coding Theory - Algorithms, Architectures, And Applications
Authors: Andre Neubauer , Jurgen Freudenberger , Volker Kuhn    107    0


Digital Image Processing: Piks Scientific Inside
Authors: William K. Pratt    123    0


Introduction To Lambda Calculus
Authors: Barendregt H. , Barendsen E.    98    0


Beginning Python
Authors: Peter C. Norton , Alex Samuel , Dave Aitel , Eric Foster-Johnson , Leonard Richardson , Jason Diamond , Aleatha Parker , Michael Roberts    137    0


Synthesis And Optimization Of Dsp Algorithms
Authors: Constantinides , Cheung , Luk.    113    0


Linear Programming And Its Applications
Authors: H.A. Eiselt , C.-L. Sandblom    95    0