Computer Science and Engineering, Department of


Date of this Version



Liu, Yi, Chulwoo Pack, Leen-Kiat Soh, and Elizabeth Lorang, "Document Images and Machine Learning: A Collaboratory Between the Library of Congress and the Image Analysis for Archival Discovery (Aida) Lab at the University of Nebraska, Lincoln, NE," presented at the Library of Congress, 22 August 2019.


This presentation is part of the project "Digital Libraries, Intelligent Data Analytics, and Augmented Description: A Demonstration Project." The final report for this project, other presentations delivered to the Library of Congress, and work-in-progress reports are also available via the UNL Digital Commons.


This presentation summarized preliminary results from the first weeks of work conducted by the Aida research team in response to Library of Congress funding notice ID 030ADV19Q0274, “The Library of Congress – Pre-processing Pilot.” It includes overviews of projects on historic document segmentation, document classification, document quality assessment, figure and graph extraction from historic documents, text-line extraction from figures, subjective and objective quality assessments, and digitization type differentiation.