Computer Science and Engineering, Department of


Date of this Version



University of Nebraska-Lincoln, Computer Science and Engineering
Technical Report # TR-UNL-CSE-2003-0003 05/16/2003


In this paper, we investigate the effectiveness of Citation K-Nearest Neighbors (KNN) learning with noisy training datasets. We devise an authority measure associated with each training instance that changes based on the outcome of Citation KNN classification. The authority is increased when a citer’s classification had been right; and vice versa. We show that by modifying only these authority measures, the classification accuracy of Citation KNN improves significantly in a variety of datasets with different noise levels. We also identify the general characteristics of a dataset that affect the improvement percentages. We conclude that the new algorithm is able to regulate the roles of good and noisy training instances using a very simple authority measure to improve classification.