School of Computing: Dissertations, Theses, and Student Research

First Advisor

Bonita Sharif

Second Advisor

Jitender Deogun

Committee Members

Jitender S. Deogun

Date of this Version

Fall 12-2-2020

Document Type

Thesis

Comments

A thesis presented to the faculty of the Graduate College at the University of Nebraska in partial fulfillment of requirements for the degree of Master of Science

Major: Computer Science

Under the supervision of Professor Bonita Sharif. Lincoln, Nebraska, November 2020

Copyright © 2020 Sumeet Maan

Abstract

This thesis analyzes an existing eye-tracking dataset collected while software developers worked on bug-fixing tasks in an open-source system. The analysis uses a representational learning approach, namely a multi-layer perceptron (MLP). Its novel aspect is a new feature engineering method based on the eye-tracking data, which is then used to predict developer expertise. The dataset used in this thesis is inherently more complex because it was collected in a highly dynamic environment, the Eclipse IDE, using the eye-tracking plugin iTrace. Previous work in this area used only short code snippets, which do not represent how developers usually program in a realistic setting.

A comparative analysis between representational learning and non-representational learning (Support Vector Machine, Naive Bayes, Decision Tree, and Random Forest) is also presented. An extensive set of experiments, run with an 80/20 training/testing split, shows that representational learning (MLP) works well on our dataset, reporting on average 30% higher accuracy across all tasks. Furthermore, a state-of-the-art method for feature engineering is proposed to extract features from the eye-tracking data. Averaged over all tasks, the accuracy is 93.4%, with a recall of 78.8% and an F1 score of 81.6%. We discuss the implications of these results for the future of automated prediction of developer expertise.
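As a reading aid for the figures above, the following minimal sketch shows how an 80/20 split and the accuracy, recall, and F1 metrics are conventionally computed for a binary label. It is not the thesis's code: the function names `split_80_20` and `binary_metrics`, the toy labels, and the binary (expert vs. non-expert) framing are assumptions made for illustration, and the abstract does not state how the thesis averages metrics across classes or tasks.

```python
import random


def split_80_20(samples, seed=0):
    """Shuffle a copy of the samples and return (train, test) in an 80/20 ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5  # first 80% for training, rest for testing
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical labels: 1 = expert, 0 = non-expert (imbalanced on purpose).
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    print(binary_metrics(y_true, y_pred))

    train, test = split_80_20(range(10))
    print(len(train), len(test))
```

With these toy labels, one missed positive out of three lowers recall far more than it lowers accuracy, which is one common reason a reported accuracy can sit well above the reported recall and F1 score when classes are imbalanced.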

Advisor: Bonita Sharif
