Computer Science and Engineering, Department of

 

First Advisor

Stephen Donald Scott

Date of this Version

Winter 12-2022

Document Type

Article

Citation

@mastersthesis{bevers, author = "Mitchell DeHaven", title = "BEVERS: A General, Simple, and Performant Framework for Automatic Fact Verification", school = "University of Nebraska-Lincoln", year = 2022, address = "Lincoln, NE", month = nov }

Comments

A THESIS Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Master of Science, Major: Computer Science, Under the Supervision of Stephen Donald Scott. Lincoln, Nebraska: December, 2022

Copyright © 2022 Mitchell DeHaven

Abstract

Fact verification has become an important process, primarily done manually by humans, to verify the authenticity of claims and statements made online. Increasingly, social media companies have used human effort to debunk false claims on their platforms, opting either to tag the content as misleading or false or to remove it entirely to combat misinformation on their sites. In tandem, automatic fact verification has become a subject of focus in the natural language processing (NLP) community, spawning new datasets and research. The most popular dataset is the Fact Extraction and VERification (FEVER) dataset. In this thesis, an end-to-end fact verification system is built and trained on the FEVER dataset. Our system uses traditional document retrieval and pretrained transformer models to form predictions. Our system does not deviate significantly from a standard approach; however, we thoroughly examine design decisions and data representations to further improve the model. We suspect that other results in the literature come from under-optimized approaches, given that we perform much better with a similar system design. Finally, our system is compared against other systems on the FEVER blind test dataset and sets a new state of the art for the task while using relatively simple approaches and smaller models than other systems. Our system attains the highest FEVER score on the task and the second-highest accuracy, and our evidence retrieval system achieves the highest recall among the reported results of other systems.

Adviser: Stephen Donald Scott
