Date of this Version
Lorang, Elizabeth, Leen-Kiat Soh, Yi Liu, and Chulwoo Pack, "Virtual Wrap-Up Presentation: Digital Libraries, Intelligent Data Analytics, and Augmented Description," delivered to the Library of Congress, 6 November 2019.
Includes framing, an overview, and discussion of the explorations undertaken as part of the Digital Libraries, Intelligent Data Analytics, and Augmented Description demonstration project, carried out by members of the Aida digital libraries research team at the University of Nebraska-Lincoln through a research services contract with the Library of Congress. This presentation covered: the Aida research team and background for the demonstration project; the broad outlines of “Digital Libraries, Intelligent Data Analytics, and Augmented Description”; what changed for us as a research team over the collaboration and why; the deliverables of our work; thoughts toward “What next”; and deep dives into the explorations. The machine learning explorations, which focus on historic document materials from the Library of Congress, include image segmentation; visual context extraction from textual materials; text extraction from images; document/corpus quality assessment; differentiation among documents created via different means; differentiation among printed, handwritten, and mixed content; and metadata generation. Preliminary takeaways discussed include an expanded sense of how these projects may be useful, with greater emphasis on internal use within the Library of Congress; consideration of how crowdsourced information can aid machine learning, as well as what may be well suited to the crowd, to the machine, and to domain experts; the need to analyze the materials through a variety of strategies to inform machine learning; and greater awareness of the full range of resources (computational, human, technical, and social) necessary to do this work.