Center, Digital Research in the Humanities

CDRH Grant Reports

White Paper, HD-51897-14, Image Analysis for Archival Discovery (Aida), October 2016

ORCID IDs

Elizabeth M. Lorang

Date of this Version

10-3-2016

Document Type

Article

Citation

Lorang, Elizabeth, and Leen-Kiat Soh. "White Paper, HD-51897-14" (white paper for National Endowment for the Humanities, University of Nebraska-Lincoln, October 2016).

Comments

This document was also submitted as the Final Performance Report for grant HD-51897-14, in accordance with NEH Performance Reporting Requirements as revised January 2013. The only difference between the Final Performance Report and this document is the cover sheet and the first footnote.

Abstract

With its Office of Digital Humanities Start-up Grant, the Image Analysis for Archival Discovery (Aida) team set out to further develop image analysis as a methodology for the identification and retrieval of items of relevance within digitized collections of historic materials. Specifically, we sought to identify poetic content within historic newspapers, using Chronicling America's newspapers (http://chroniclingamerica.loc.gov/) as our test case. The project activities we undertook, both those completed and those in process, support this goal and align well with the activities proposed in our original funding application and as approved by NEH. To achieve our goal of creating an image processing-based system to identify poetic content in historic newspaper collections, however, we also made strategic decisions along the way that shifted some of our efforts from those we initially planned when we drafted our funding proposal three years ago.

During the grant period, the Aida team developed, trained, and tested a machine learning classifier that can identify poetic content in pages of digitized historic newspapers based only on visual signals. We published early results of this work in D-Lib Magazine in summer 2015. We have since undertaken a detailed case study that tests the application of our classifier and methodology to a test set of more than 22,000 newspaper page images from the period 1836-1840. Significantly, we shifted our emphasis from processing all pages from Chronicling America to conducting this thorough, critical analysis and case study. This shift in plans corresponds with our desire to explore image analysis as a methodology for connecting users of digital archives with materials of relevance.

Included in

Computer Sciences Commons, Digital Humanities Commons, Literature in English, North America Commons
