Adaptive segmentation of document images

Donald Robert Sylwester, University of Nebraska - Lincoln

Abstract

A method is presented for the efficient segmentation of text lines from scanned images of technical documents. The method has been implemented in the ARXYC (Adaptive Recursive XY Cut) algorithm, which constructs an XY-tree to represent the geometric layout structure of a page image, with the text lines found as leaf nodes. Geometric layout analysis is a subcomponent of the Document Image Analysis processing sequence. It is typically preceded by scanning a document into a pixel map, preprocessing the pixel map to reduce noise and remove skew, and thresholding to a binary image. It is typically followed by mapping the geometric layout to a function representation and recovering text and graphics from the pixel image. Technical documents are sufficiently varied in structure to be challenging to segmentation algorithms yet sufficiently regular to be amenable to analysis. The vast store of archived technical documentation attests to the importance of the task. ARXYC achieves high generality by depending on only a single primary parameter, the resolution-independent gap-ratio-threshold. ARXYC constructs an initial XY-tree in which the desired text lines are over-segmented into many fragments, then dynamically transforms the XY-tree into the target tree using three elegant operators (cut, glue and flip) while adaptively applying the threshold to the merging of fragments into text lines. ARXYC monitors the dynamic changes in the structure of the XY-tree to avoid the most serious segmentation error: merging two fragments across the gap between columns. Results are shown for three experiments on an image set of 97 document pages from a variety of technical journals. The first selects a single fixed threshold for a set of documents based on a sample from that set. The second selects a single fixed threshold for a specific image based on intrinsic measures of the onset of column bridging. In the third, ARXYC adaptively applies a varying threshold to each image, guided by the dynamic behavior of the XY-tree, and matches, on average, 98.8% of the ground truth text lines.
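
For readers unfamiliar with XY-trees, the following is a minimal sketch of a plain recursive XY cut driven by a single gap-ratio threshold. It is not the ARXYC algorithm: the cut, glue and flip operators and the adaptive threshold are not reproduced. The abstract does not define the gap-ratio-threshold, so the definition used here (gap width divided by the mean length of the ink runs in the same projection profile) is an assumption, as are all names (Node, split, xy_cut, leaves, demo_page).

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


@dataclass
class Node:
    """One XY-tree node; a node without children is a leaf region."""
    box: Box
    children: List["Node"] = field(default_factory=list)


def profile(img, box, axis):
    """Ink count per row (axis 'y') or per column (axis 'x') inside box."""
    top, left, bottom, right = box
    if axis == "y":
        return [sum(img[r][left:right]) for r in range(top, bottom)]
    return [sum(img[r][c] for r in range(top, bottom)) for c in range(left, right)]


def runs(prof):
    """(start, end) pairs of maximal nonzero runs in a projection profile."""
    out, start = [], None
    for i, v in enumerate(prof):
        if v and start is None:
            start = i
        elif not v and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(prof)))
    return out


def split(img, box, axis, gap_ratio):
    """Split box along axis at gaps that are wide relative to the ink runs."""
    top, left, bottom, right = box
    ink = runs(profile(img, box, axis))
    if not ink:
        return []
    # Assumed ratio definition: gap width / mean ink-run length (unitless,
    # so it does not depend on scan resolution).
    mean_run = sum(e - s for s, e in ink) / len(ink)
    groups = [list(ink[0])]
    for s, e in ink[1:]:
        if s - groups[-1][1] >= gap_ratio * mean_run:
            groups.append([s, e])   # gap is wide enough: cut here
        else:
            groups[-1][1] = e       # gap is too narrow: merge the fragments
    if axis == "y":
        return [(top + s, left, top + e, right) for s, e in groups]
    return [(top, left + s, bottom, left + e) for s, e in groups]


def xy_cut(img, box, gap_ratio, axis="y", tried_other=False):
    """Build an XY-tree by alternating horizontal and vertical cuts."""
    parts = split(img, box, axis, gap_ratio)
    other = "x" if axis == "y" else "y"
    if not parts:
        return Node(box)                # empty region
    if len(parts) == 1:
        if tried_other:
            return Node(parts[0])       # no cut in either direction: leaf
        return xy_cut(img, parts[0], gap_ratio, other, tried_other=True)
    return Node(box, [xy_cut(img, p, gap_ratio, other) for p in parts])


def leaves(node):
    """Leaf boxes in tree order."""
    if not node.children:
        return [node.box]
    return [b for c in node.children for b in leaves(c)]


def demo_page():
    """Toy 7x12 binary page: two columns, each holding two solid 'text lines'."""
    img = [[0] * 12 for _ in range(7)]
    for r in (1, 2, 4, 5):
        for c in list(range(1, 5)) + list(range(7, 11)):
            img[r][c] = 1
    return img
```

On the toy page, a small ratio such as 0.4 yields four leaves, one per text line per column. A large ratio such as 2.0 merges everything into a single leaf that spans both columns, which is a small-scale instance of the column-bridging error the abstract describes.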

Subject Area

Computer science

Recommended Citation

Sylwester, Donald Robert, "Adaptive segmentation of document images" (2001). ETD collection for University of Nebraska-Lincoln. AAI3004625.
https://digitalcommons.unl.edu/dissertations/AAI3004625
