Channel-coded Feature Maps For Computer Vision And Machine Learning


E-Book Content

Linköping Studies in Science and Technology
Dissertation No. 1160

Channel-Coded Feature Maps for Computer Vision and Machine Learning

Erik Jonsson

Department of Electrical Engineering
Linköpings universitet, SE-581 83 Linköping, Sweden
Linköping, February 2008

Copyright © 2008 Erik Jonsson
ISBN 978-91-7393-988-1
ISSN 0345-7524
Back cover illustration by Nikolina Orešković
Printed by LiU-Tryck, Linköping 2008

To Helena for patience, love and understanding

Abstract

This thesis is about channel-coded feature maps applied in view-based object recognition, tracking, and machine learning. A channel-coded feature map is a soft histogram of joint spatial pixel positions and image feature values. Typical useful features include local orientation and color. Using these features, each channel measures the co-occurrence of a certain orientation and color at a certain position in an image or image patch. Channel-coded feature maps can be seen as a generalization of the SIFT descriptor with the options of including more features and replacing the linear interpolation between bins by a more general basis function.

The general idea of channel coding originates from a model of how information might be represented in the human brain. For example, different neurons tend to be sensitive to different orientations of local structures in the visual input. The sensitivity profiles tend to be smooth such that one neuron is maximally activated by a certain orientation, with a gradually decaying activity as the input is rotated. This thesis extends previous work on using channel-coding ideas within computer vision and machine learning.
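The soft-histogram idea described above can be illustrated with a minimal sketch. It assumes a cos² basis function with unit channel spacing, which is one common choice in the channel-coding literature and not necessarily the kernel used throughout the thesis. The function names `channel_encode` and `channel_coded_feature_map`, the one-dimensional position, and the single scalar feature are all hypothetical simplifications chosen for illustration.

```python
import numpy as np

def channel_encode(x, centers, width=1.5):
    """Soft-assign scalar value(s) x to channels with a cos^2 kernel.

    Assumption: the kernel is cos^2(pi * d / (2 * width)) for |d| < width
    and zero elsewhere, where d is the distance to a channel center.
    Returns an array of shape (number of values, number of channels).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    centers = np.asarray(centers, dtype=float)
    d = x[:, None] - centers[None, :]        # distance to every channel center
    inside = np.abs(d) < width               # limited kernel support
    return np.where(inside, np.cos(np.pi * d / (2.0 * width)) ** 2, 0.0)

def channel_coded_feature_map(positions, features, pos_centers, feat_centers):
    """Joint soft histogram over (position, feature value), averaged over pixels.

    Each entry [i, j] measures how often position channel i and
    feature channel j are active together.
    """
    cp = channel_encode(positions, pos_centers)    # (n_pixels, n_pos_channels)
    cf = channel_encode(features, feat_centers)    # (n_pixels, n_feat_channels)
    return cp.T @ cf / cp.shape[0]

# One value activates its nearest channel most, and its neighbors less.
c = channel_encode(3.0, np.arange(8.0))

# Two "pixels" at positions 2 and 5 with feature values 1 and 4.
fmap = channel_coded_feature_map([2.0, 5.0], [1.0, 4.0],
                                 np.arange(8.0), np.arange(6.0))
```

With unit spacing and a kernel width of 1.5, exactly three channels respond to any interior value, so a single value contributes smoothly to neighboring bins instead of falling into one hard bin.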
By differentiating the channel-coded feature maps with respect to transformations of the underlying image, a method for image registration and tracking is constructed. By using piecewise polynomial basis functions, the channel coding can be computed more efficiently, and a general encoding method for N-dimensional feature spaces is presented. Furthermore, I argue for using channel-coded feature maps in view-based pose estimation, where a continuous pose parameter is estimated from a query image given a number of training views with known pose. The optimization of position, rotation and scale of the object in the image plane is then included in the optimization problem, leading to a simultaneous tracking and pose estimation algorithm.

Apart from objects and poses, the thesis examines the use of channel coding in connection with Bayesian networks. The goal here is to avoid the hard discretizations usually required when Markov random fields are used on intrinsically continuous signals like depth for stereo vision or color values in image restoration.

Channel coding has previously been used to design machine learning algorithms that are robust to outliers, ambiguities, and discontinuities in the training data. This is obtained by finding a linear mapping between channel-coded input and output values. This thesis extends this method with an incremental version and identifies and analyzes a key feature of the method – that it is able to handle a learning situation where the correspondence structure between the input and output space is not completely known. In contrast to a traditional supervised learning setting, the training examples are groups of unordered input-output points, where the correspondence structure within each group is unknown. This behavior is studied theoretically and the effect of outliers and convergence properties are analyzed.

All presented methods have been evaluated experimentally. The work has been co
You might also like

Network Analysis: Methodological Foundations
Authors: Ulrik Brandes , Thomas Erlebach (auth.) , Ulrik Brandes , Thomas Erlebach (eds.)    178    0


Mri: Basic Principles And Applications
Authors: Mark A. Brown , Richard C. Semelka    145    0


Lectures On Image Processing
Authors: Morse B.S.    150    0


3d Structure From Images — Smile 2000: Second European Workshop On 3d Structure From Multiple Images Of Large-scale Environments Dublin, Ireland, July 1–2, 2000 Revised Papers
Authors: Paul Debevec (auth.) , Marc Pollefeys , Luc Van Gool , Andrew Zisserman , Andrew Fitzgibbon (eds.)    146    0


Digital Image Processing: Piks Scientific Inside
Authors: William K. Pratt    157    0



Introduction To Lambda Calculus
Authors: Barendregt H. , Barendsen E.    139    0


Object-oriented Programming Via Fortran 90-95
Authors: Ed Akin    147    0


Optimization Theory And Methods: Nonlinear Programming
Authors: Wenyu Sun , Ya-Xiang Yuan    186    0