Enhancing Document Layout Analysis on Historical Newspapers: Visual Representation, Pseudo-Ground-Truth, and Downscaling

Chulwoo Pack, University of Nebraska - Lincoln

Abstract

One objective of document image segmentation is to decompose a digitized document image into a set of homogeneous regions by distinguishing text zones from non-textual ones. While page segmentation on constrained layouts and clean images has been successfully addressed in the past, segmentation on unconstrained layouts and noisy images, such as those in historical newspapers, remains an open problem for the following reasons. First, heuristic rules built on conventional image processing techniques cannot adequately cover the variations in layout and image quality present in historical newspapers. Second, robust segmentation with a deep convolutional neural network (DCNN) typically requires a vast amount of accurately curated ground-truth, which is costly to produce. Third, DCNNs usually require large historical newspaper images to be downscaled to fit in GPU memory, which usually leads to precision loss. Given these challenges, we aim to improve the accuracy and time efficiency of the numerical process of identifying the set of textual regions in unconstrained, noisy, and large historical newspaper images. First, we investigate whether it is worthwhile to use a conventional geometric feature-based visual representation for segmenting historical newspapers, with and without a DCNN. Second, we investigate whether we can generate effective pseudo-ground-truth using document degradation models to reduce the need for expensive dataset labeling. Third, we investigate whether we can adaptively downscale large images while preserving the visual cues relevant to the layout structure, to mitigate the precision loss. Our research contributes to document image segmentation and analysis for noisy document images in general, and for historical newspapers in particular.
Specifically, our research proposes and evaluates novel methods that use image processing techniques and document image quality assessment. The image processing techniques are (1) region-growing merging based on the Docstrum geometric feature, (2) fusion of a DCNN with Gravity-map, a Voronoi-tessellation-based visual representation, and (3) adaptive image downscaling that combines the strengths of content-independent and content-aware strategies. Together, these methods help generate more accurate spatial information efficiently, requiring fewer computing resources and lowering the cost of ground-truthing.

Subject Area

Computer science|Information Technology

Recommended Citation

Pack, Chulwoo, "Enhancing Document Layout Analysis on Historical Newspapers: Visual Representation, Pseudo-Ground-Truth, and Downscaling" (2023). ETD collection for University of Nebraska-Lincoln. AAI30485052.
https://digitalcommons.unl.edu/dissertations/AAI30485052
