Libraries, University of Nebraska-Lincoln

Copyright, Fair Use, Scholarly Communication, etc.

Dataset Search in Biodiversity Research: Do Metadata in Data Repositories Reflect Scholarly Information Needs?

ORCID IDs

Löffler https://orcid.org/0000-0001-6423-7427

Wesp https://orcid.org/0000-0002-8601-6032

König-Ries https://orcid.org/0000-0002-2382-9722

Klan https://orcid.org/0000-0002-1856-7334

Document Type

Article

Date of this Version

3-24-2021

Citation

PLoS One (2021) 16(3): e0246099

https://doi.org/10.1371/journal.pone.0246099

Editor: Hussein Suleman, University of Cape Town, South Africa

Received: July 5, 2019 Accepted: January 13, 2021 Published: March 24, 2021

Peer Review History

PLOS recognizes the benefits of transparency in the peer review process; therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. The editorial history of this article is available here: https://doi.org/10.1371/journal.pone.0246099

Data Availability Statement

The code and data are available in the GitHub repository https://github.com/fusion-jena/QuestionsMetadataBiodiv. In addition, the data has been submitted to the iDiv data portal (https://idata.idiv.de/).

Comments

License: Creative Commons Attribution (CC BY)

Abstract

The increasing amount of publicly available research data provides the opportunity to link and integrate data in order to create and prove novel hypotheses, to repeat experiments, or to compare recent data to data collected at a different time or place. However, recent studies have shown that retrieving relevant data for reuse is a time-consuming task in daily research practice. In this study, we explore what hampers dataset retrieval in biodiversity research, a field that produces a large amount of heterogeneous data. In particular, we focus on scholarly search interests and on metadata, the primary source of data in a dataset retrieval system. We show that existing metadata reflect information needs poorly and are therefore the biggest obstacle to retrieving relevant data. Our findings indicate that, for data seekers in the biodiversity domain, the important information categories are environments, materials and chemicals, species, biological and chemical processes, locations, data parameters, and data types. These interests are well covered by the metadata elements of domain-specific standards. However, instead of using these standards, large data repositories tend to use metadata standards with domain-independent fields that cover search interests only to some extent. A second problem is the arbitrary keywords used in descriptive fields such as title, description, or subject. Keywords support scholars in a full-text search only if the provided terms match the query terms syntactically or their semantic relationship to the terms used in a user query is known.
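The closing point of the abstract, that keywords help only when they match syntactically or when their semantic relationship to the query is known, can be illustrated with a minimal Python sketch. It is not taken from the article or its repository: the function names, the sample keywords, and the small `RELATED_TERMS` table (standing in for a domain vocabulary or ontology) are all assumptions made for illustration.

```python
def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(text.lower().split())


# Hypothetical table of related terms, standing in for a domain
# vocabulary or ontology (common name -> scientific genus name).
RELATED_TERMS = {
    "oak": {"quercus"},
    "beech": {"fagus"},
}


def matches(query, keywords, related=None):
    """Return True if a query term equals a keyword token, or, when a
    table of related terms is supplied, is known to be related to one."""
    query_terms = tokenize(query)
    keyword_terms = set()
    for keyword in keywords:
        keyword_terms |= tokenize(keyword)

    # Syntactic match: the same token appears in query and keywords.
    if query_terms & keyword_terms:
        return True

    # Semantic match: only possible if the relationship is known.
    if related:
        for term in query_terms:
            if related.get(term, set()) & keyword_terms:
                return True
    return False


# Illustrative keywords, as they might appear in a dataset's subject field.
dataset_keywords = ["Quercus", "leaf litter"]

syntactic_hit = matches("oak", dataset_keywords)
semantic_hit = matches("oak", dataset_keywords, RELATED_TERMS)
```

In this sketch a search for "oak" misses a dataset tagged "Quercus" under purely syntactic matching and finds it only once the relationship between the two terms is supplied.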

Included in

Intellectual Property Law Commons, Scholarly Communication Commons, Scholarly Publishing Commons
