Date of this Version
The thesis analyzes an existing eye-tracking dataset collected while software developers were solving bug fixing tasks in an open-source system. The analysis is performed using a representational learning approach namely, Multi-layer Perceptron (MLP). The novel aspect of the analysis is the introduction of a new feature engineering method based on the eye-tracking data. This is then used to predict developer expertise on the data. The dataset used in this thesis is inherently more complex because it is collected in a very dynamic environment i.e., the Eclipse IDE using an eye-tracking plugin, iTrace. Previous work in this area only worked on short code snippets that do not represent how developers usually program in a realistic setting.
A comparative analysis between representational learning and non-representational learning (Support Vector Machine, Naive Bayes, Decision Tree, and Random Forest) is also presented. The results are obtained from an extensive set of experiments (with an 80/20 training and testing split) which show that representational learning (MLP) works well on our dataset reporting an average higher accuracy of 30% more for all tasks. Furthermore, a state-of-the-art method for feature engineering is proposed to extract features from the eye-tracking data. The average accuracy on all the tasks is 93.4% with a recall of 78.8% and an F1 score of 81.6%. We discuss the implications of these results on the future of automated prediction of developer expertise.
Adviser: Bonita Sharif