Machine learning with incomplete information

Yaling Zheng, University of Nebraska - Lincoln


Machine learning algorithms detect patterns, regularities, and rules in training data and adjust program actions accordingly. For example, when a learner (a computer program) sees a set of patient cases (patient records) with corresponding diagnoses, it can predict the presence of a disease for future patients. A somewhat unrealistic assumption in typical machine learning applications is that data is freely available. In my dissertation, I will present our research efforts to relax this assumption in the areas of active machine learning and budgeted machine learning.

In active machine learning, where the labels of instances have to be purchased, it is often assumed that a perfect labeler labels the chosen instances. However, the labeler may not be perfect, or there may be multiple noisy labelers with different known costs and different unknown accuracies, as on Amazon Mechanical Turk. I will present our algorithms and experimental results for active learning from multiple noisy labelers with varied costs; the algorithms are based on ranking the labelers according to their estimated accuracies and costs. The experimental results show that our algorithms outperform existing algorithms in the literature.

In budgeted machine learning, the class label of every instance is known, while the feature values of the instances have to be purchased at a cost, subject to an overall budget. The challenge for the learner is to decide which attributes of which instances to purchase in order to learn the best model. I will present our budgeted learning algorithms for naive Bayes. Most of our algorithms perform well compared to existing algorithms in the literature. I will also present our algorithms for budgeted learning of Bayesian networks, which are a generalization of naive Bayes. Experimental results show that some of our algorithms outperform existing algorithms in the literature.

Subject Area

Computer Science

Recommended Citation

Zheng, Yaling, "Machine learning with incomplete information" (2011). ETD collection for University of Nebraska - Lincoln. AAI3487272.