Computer Science and Engineering, Department of


First Advisor

Bonita Sharif

Second Advisor

Jitender Deogun

Date of this Version

Fall 12-2-2020


A THESIS Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Master of Science, Major: Computer Science, Under the Supervision of Professor Bonita Sharif. Lincoln, Nebraska: November, 2020

Copyright © 2020 Sumeet Maan


The thesis analyzes an existing eye-tracking dataset collected while software developers were solving bug fixing tasks in an open-source system. The analysis is performed using a representational learning approach namely, Multi-layer Perceptron (MLP). The novel aspect of the analysis is the introduction of a new feature engineering method based on the eye-tracking data. This is then used to predict developer expertise on the data. The dataset used in this thesis is inherently more complex because it is collected in a very dynamic environment i.e., the Eclipse IDE using an eye-tracking plugin, iTrace. Previous work in this area only worked on short code snippets that do not represent how developers usually program in a realistic setting.

A comparative analysis between representational learning and non-representational learning (Support Vector Machine, Naive Bayes, Decision Tree, and Random Forest) is also presented. The results are obtained from an extensive set of experiments (with an 80/20 training and testing split) which show that representational learning (MLP) works well on our dataset reporting an average higher accuracy of 30% more for all tasks. Furthermore, a state-of-the-art method for feature engineering is proposed to extract features from the eye-tracking data. The average accuracy on all the tasks is 93.4% with a recall of 78.8% and an F1 score of 81.6%. We discuss the implications of these results on the future of automated prediction of developer expertise.

Adviser: Bonita Sharif