Libraries at University of Nebraska-Lincoln


Date of this Version

Summer 4-13-2021

Document Type



The scope and application of web technology in the library and information sector are changing rapidly in terms of storage, processing, and delivery of services. Libraries, information centers, archives, museums, etc. are being driven to add meaningful and interoperable web-based library services to address the growing information needs of their users. One of the major recent developments is the adoption of semantic web technologies in providing web-based library services. Semantic web technology offers structured web-based data and allows organizations or institutions to describe, communicate, retrieve, and redistribute that data over the web. It enables the library community to include additional information from external resources to provide enriched information services to users. Transformation, integration, and publication of library data as linked open data (LOD) is one such service. A library database typically holds two types of data, viz., bibliographic data and authority data. Authority data covers personal names, corporate names, meeting names, geographic names, chronological terms, topical terms, etc. A personal name authority record provides several attributes of a person who may be an author, a contributor, an editor, a translator, an illustrator, etc., and acts as a preferred term or an access point for the library online catalogue service. The main objective of this paper is to transform MARC 21-based personal name authority data of a Koha database to RDF triple format and publish it as Linked Open Data (LOD), enriched with external LOD personal name authority datasets such as the Library of Congress Name Authority File (LCNAF), the Virtual International Authority File (VIAF), etc. The personal name authority LOD dataset adds a persistent URI to each personal name heading and makes the data easily accessible over the web.
A Workflow Model (Figure-1) has been proposed to visualize the steps, operations, and components for converting personal name authority data to a LOD dataset. OpenRefine (version 3.2), an open-source tool, is used for the cleansing, reconciliation, and transformation of unstructured and messy data. In this research work, OpenRefine has played a crucial role in facilitating a wide range of activities, from data refinement to the insertion of the URI column, link generation, reconciliation against external data sources, and conversion of the source format to different RDF formats such as RDF/XML, N-Triples, Turtle, JSON-LD, etc. The resulting personal name authority LOD dataset may further be used by organizations or institutions for their advanced online catalogue services.
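As a rough illustration of what the final step of such a workflow produces, the sketch below turns one cleaned authority record into N-Triples lines using only the Python standard library. It is not the OpenRefine procedure used in the paper. The base URI, the record keys (`auth_id`, `heading`, `same_as`), the function names, and the choice of `foaf:Person`, `skos:prefLabel`, and `owl:sameAs` are all assumptions made for the example, and the external link is a placeholder rather than a real VIAF or LCNAF identifier.

```python
from urllib.parse import quote

# Hypothetical base URI for minted personal name URIs (assumption, not from the paper).
BASE_URI = "http://example.org/authority/person/"

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def escape_literal(text):
    """Escape backslashes, quotes, and line breaks for an N-Triples string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record):
    """Return N-Triples lines for one personal name authority record.

    `record` is an illustrative dict with keys:
      auth_id  - local authority record number
      heading  - preferred form of the personal name
      same_as  - optional list of URIs found by reconciliation
    """
    subject = f"<{BASE_URI}{quote(str(record['auth_id']))}>"
    lines = [
        f"{subject} <{RDF_TYPE}> <{FOAF_PERSON}> .",
        f'{subject} <{SKOS_PREF_LABEL}> "{escape_literal(record["heading"])}" .',
    ]
    # Each reconciled external URI becomes an owl:sameAs link.
    for uri in record.get("same_as", []):
        lines.append(f"{subject} <{OWL_SAME_AS}> <{uri}> .")
    return lines


if __name__ == "__main__":
    sample = {
        "auth_id": 42,
        "heading": "Doe, Jane, 1900-1980",
        # Placeholder link; a real workflow would store the reconciled URI here.
        "same_as": ["http://example.org/external/placeholder-id"],
    }
    print("\n".join(record_to_ntriples(sample)))
```

In this sketch the persistent URI is built from the local authority record number, so the same record always yields the same subject URI. Any stable identifier could serve the same purpose.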