Computer Science and Engineering, Department of


First Advisor

Professor Stephen Scott

Date of this Version

Fall 12-1-2022


A THESIS Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Master of Science, Major: Computer Science, Under the Supervision of Professor Stephen Scott. Lincoln, Nebraska: November, 2022

Copyright © 2022 Mostafa Rafaiejokandan


Deep neural networks (DNNs) perform impressively on many natural language processing (NLP) tasks, but their black-box nature makes them inherently difficult to explain or interpret. Self-explanatory models are a recent approach to overcoming this challenge: in addition to fulfilling a task objective such as answering questions, they generate explanations in human-readable language. This thesis focuses on the explainability of NLP tasks and on how attention methods can help enhance performance. Three attention modules are proposed: SimpleAttention, CrossSelfAttention, and CrossModality. The thesis also introduces a dataset transformation method called Two-Documents, which converts every dataset into the two separate documents required by the proposed attention modules. The proposed ideas are incorporated into a faithful architecture in which a module produces an explanation and prepares the information vector for the subsequent layers. The experiments are run on the CoS-E dataset from the ERASER benchmark. They are restricted to the transformer used in the baseline and to the dataset's training data only, even though the task requires commonsense knowledge to improve accuracy. In the results, the proposed solution produced explanations that outperformed the baseline by about 4% in Token F1, while being about 1% more accurate.

Adviser: Stephen Scott