A New Feature Engineering-Driven Pre-Trained Language Model for Enabling Semantically Enriched Natural Language Processing Tasks

Kimia Ameri, University of Nebraska - Lincoln


Natural language processing (NLP) techniques have improved significantly with the introduction of pre-trained language models (PLMs). Pre-training uses unannotated data for self-supervised training, and the resulting model can be applied to downstream tasks through fine-tuning and few-shot training. Even though PLMs can recognize useful linguistic information in unlabeled text, factual knowledge is generally not well represented. There is substantial evidence that attention-based models built on the Transformer architecture perform better than traditional algorithms on several NLP tasks. Transformers use an encoder-decoder architecture built from stacked layers, each containing multiple attention heads. In the encoder, a Transformer generates abstract representations of tokens based on their relationship to all other tokens in the sequence. The effectiveness of such models can be significantly improved by explicitly feeding them syntactic information, such as part-of-speech (POS) tags, which may benefit even a complex model such as a Transformer. In this dissertation we introduce a new feature engineering approach that makes syntactic information usable inside Transformer-based language models. We then use this feature engineering method to design and develop a new Transformer-based language model, the Grammar-Enriched language Model (GEM), which draws on this syntactic information during pre-training. We show that GEM outperforms baseline models on all downstream tasks from the GLUE benchmark, raising the base average score by 4.6 points, from 79.6 to 84.2. On single-sentence tasks, GEM improved the average score by 4.46 points. On similarity and paraphrasing tasks, the average improvement with GEM was 7.51 points over the BERT base model and 6.84 points over the BERT large model. On natural language inference tasks, GEM showed an average improvement of 2.48 points over the base model.
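The abstract does not say how GEM feeds POS tags into the Transformer. As a rough illustration of the general idea only, one common pattern is to give each POS tag its own embedding vector and add it to the token embedding before the encoder, much as BERT adds segment embeddings. The sketch below assumes that pattern; the vocabularies, `HIDDEN_SIZE`, and the function names `make_embedding_table` and `embed_with_pos` are hypothetical and are not taken from the dissertation.

```python
import random

# Hypothetical toy vocabularies; a real model would use a learned tokenizer
# and the full tag set of a POS tagger.
TOKEN_VOCAB = {"[PAD]": 0, "the": 1, "model": 2, "learns": 3, "grammar": 4}
POS_VOCAB = {"[PAD]": 0, "DET": 1, "NOUN": 2, "VERB": 3}
HIDDEN_SIZE = 8  # assumed embedding width, chosen only for the example


def make_embedding_table(num_rows, dim, rng):
    """Return a randomly initialised embedding table as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


def embed_with_pos(tokens, pos_tags, token_table, pos_table):
    """Sum each token's embedding with the embedding of its POS tag.

    This is an assumed injection scheme (element-wise addition), not
    necessarily the one used by GEM.
    """
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    combined = []
    for token, tag in zip(tokens, pos_tags):
        tok_vec = token_table[TOKEN_VOCAB[token]]
        pos_vec = pos_table[POS_VOCAB[tag]]
        combined.append([t + p for t, p in zip(tok_vec, pos_vec)])
    return combined


rng = random.Random(0)  # fixed seed so the example is reproducible
token_table = make_embedding_table(len(TOKEN_VOCAB), HIDDEN_SIZE, rng)
pos_table = make_embedding_table(len(POS_VOCAB), HIDDEN_SIZE, rng)

tokens = ["the", "model", "learns", "grammar"]
pos_tags = ["DET", "NOUN", "VERB", "NOUN"]

# One HIDDEN_SIZE-dimensional vector per token, ready for an encoder layer.
inputs = embed_with_pos(tokens, pos_tags, token_table, pos_table)
```

In this sketch, "model" and "grammar" receive the same NOUN offset, so the encoder sees a shared syntactic signal on top of their distinct token embeddings. In a trained model both tables would be learned jointly during pre-training rather than drawn at random.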

Subject Area

Computer Engineering

Recommended Citation

Ameri, Kimia, "A New Feature Engineering-Driven Pre-Trained Language Model for Enabling Semantically Enriched Natural Language Processing Tasks" (2022). ETD collection for University of Nebraska - Lincoln. AAI29999709.