Electrical & Computer Engineering, Department of
Date of this Version
Spring 4-27-2015
Document Type
Article
Abstract
Computer vision attempts to provide camera-equipped machines with visual perception, i.e., the capability to comprehend their surroundings through the analysis and understanding of images. The ability to perceive depth is a vital component of visual perception that enables machines to interpret the three-dimensional structure of their surroundings and allows them to navigate through the environment. In computer vision, depth perception is achieved via stereo matching, a process that identifies correspondences between pixels in images acquired using a pair of horizontally offset cameras. It is possible to calculate depths from correspondences or, more specifically, the positional offsets (disparities) between pixels in correspondence.
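The step from disparity to depth can be made concrete with the standard triangulation relation for a rectified, horizontally offset camera pair, Z = f * B / d. The following is a minimal sketch of that relation only; the function name and the parameters `focal_length_px` and `baseline_m` are illustrative assumptions and do not come from the dissertation.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for one pixel of a rectified stereo pair.

    Assumes the standard pinhole relation Z = f * B / d, where f is the
    focal length in pixels, B the camera baseline in metres, and d the
    disparity in pixels. All names here are hypothetical.
    """
    if disparity_px < 0:
        raise ValueError("disparity must be non-negative for a rectified pair")
    if disparity_px == 0:
        # Zero disparity corresponds to a point infinitely far away.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px
```

Because depth is inversely proportional to disparity, a fixed disparity error translates into a larger depth error for distant points than for nearby ones, which is one reason accurate disparities matter.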
A stereo matching method implemented on massively parallel graphics hardware is presented that allows for recovery of highly accurate disparities in real time. This method combines a pixel dissimilarity metric computed using both the gradients and the census transforms of the input images, a non-iterative local disparity selection scheme based on an efficient approximation of the well-known edge-preserving bilateral image filter, and a refinement technique that iteratively improves the accuracy of disparities. The refinement technique, which also benefits from the use of the bilateral filter, eliminates mismatches by penalizing disparities that disagree with the disparity estimates generated using local disparity values and/or gradients.
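To illustrate the census-transform part of such a dissimilarity metric, the sketch below computes a generic census signature per pixel and compares two signatures by Hamming distance. It is a textbook-style illustration, not the dissertation's GPU implementation: it omits the gradient term, the bilateral-filter aggregation, and the refinement step, and the window radius, border clamping, and function names are all assumptions.

```python
def census_transform(image, radius=1):
    """Return a grid of integer bit signatures, one per pixel.

    `image` is a list of rows of grey values. Each bit records whether a
    neighbour in the (2*radius+1)^2 window is darker than the centre pixel.
    Border handling by coordinate clamping is an assumption of this sketch.
    """
    height, width = len(image), len(image[0])
    signatures = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            bits = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the centre pixel is not compared with itself
                    ny = min(max(y + dy, 0), height - 1)
                    nx = min(max(x + dx, 0), width - 1)
                    darker = 1 if image[ny][nx] < image[y][x] else 0
                    bits = (bits << 1) | darker
            signatures[y][x] = bits
    return signatures


def hamming_distance(a, b):
    """Number of differing bits between two census signatures."""
    return bin(a ^ b).count("1")
```

Since the signature depends only on the ordering of intensities within the window, adding a constant brightness offset to an image leaves its census transform unchanged, which is why census-based costs tend to tolerate exposure differences between the two cameras.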
When evaluated using the Middlebury stereo performance benchmark (version 3), the proposed method ranks first and second to date on the training and test image sets, respectively, in terms of the overall accuracy of stereo matching, measured as the average percentage of pixels with an absolute disparity error greater than 2 pixels at the nominal image resolution. The method also achieves the lowest error rates for 5 out of 15 image pairs in the training set and 3 out of 15 image pairs in the test set. It is further shown to enable robust matching in the presence of radiometric distortions caused by changes in illumination or camera exposure. The high accuracy of matching, which is largely maintained in the presence of radiometric distortions, and the ability to operate in real time make the proposed method well suited for applications such as robotic navigation and structure reconstruction.
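The accuracy measure described above, the percentage of pixels whose absolute disparity error exceeds a threshold, can be sketched for a single image pair as follows. This is an illustrative helper, not the benchmark's evaluation code: the function name, the use of `None` to mark pixels without ground truth, and the list-of-rows layout are assumptions.

```python
def bad_pixel_rate(estimated, ground_truth, threshold=2.0):
    """Percentage of valid pixels with |estimated - ground_truth| > threshold.

    Both arguments are lists of rows of disparities in pixels. A ground-truth
    value of None marks a pixel with no reference disparity (an assumption of
    this sketch); such pixels are skipped.
    """
    valid = 0
    bad = 0
    for est_row, gt_row in zip(estimated, ground_truth):
        for est, gt in zip(est_row, gt_row):
            if gt is None:
                continue
            valid += 1
            if abs(est - gt) > threshold:
                bad += 1
    if valid == 0:
        raise ValueError("no pixels with ground-truth disparity")
    return 100.0 * bad / valid
```

An overall score of the kind quoted above would then be an average of this per-pair rate over the image pairs in a set; how the benchmark weights individual pairs is not stated here and is left out of the sketch.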
Advisor: Lance C. Pérez
Included in
Other Computer Engineering Commons, Other Computer Sciences Commons, Other Electrical and Computer Engineering Commons
Comments
A DISSERTATION Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Doctor of Philosophy, Major: Engineering (Electrical Engineering), Under the Supervision of Professor Lance C. Pérez. Lincoln, Nebraska: April, 2015
Copyright (c) 2015 Jedrzej Kowalczuk