Center for Digital Research in the Humanities


Lorang, Elizabeth, and Leen-Kiat Soh. "Interim Report, HD-51897-14" (report to National Endowment for the Humanities, University of Nebraska-Lincoln, 2015).


Copyright (c) 2015 Elizabeth Lorang & Leen-Kiat Soh


In the first six months of work on "Image Analysis for Archival Discovery," the project team has made significant strides toward our goal of analyzing more than 7 million newspaper pages in Chronicling America for poetic content. Although we have made some adjustments to our work plan, we remain on track to perform the major research outlined in our proposal. Activities undertaken from June through November 2014:
• Preparation of training set images (completed)
• Processing of initial data sets to extract/derive features from image data (completed)
• Development of algorithms for describing image characteristics (completed, but future iterations/revisions are likely)
• Training of classifier to recognize poetic content (completed)
• Analysis of preliminary results and revision of algorithms to achieve higher accuracy rates (completed)
• Processing of a subset of Chronicling America images (in progress)
• Preparation of program for dividing full page images into image snippets for processing (completed)
• Presentation on preliminary work and results at the Digital Humanities 2014 conference in Lausanne, Switzerland (completed)
• Presentation on project work at the Digital Library Federation Forum (completed)
• Recruitment of input from humanities specialists and librarians, archivists, and information professionals outside the advisory board (ongoing)
• Identification of internal and external funding possibilities for future project development (in progress)
• Development of project website (completed)