Statistics, Department of


A DISSERTATION Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Doctor of Philosophy, Major: Statistics, Under the Supervision of Professor Christopher R. Bilder. Lincoln, Nebraska: November, 2012

Copyright (c) 2012 Boan Zhang


Group testing, where groups of individual specimens are composited to test for the presence or absence of a disease (or some other binary characteristic), is a procedure commonly used to reduce the costs of screening a large number of individuals. Statistical research in group testing has traditionally focused on a homogeneous population, where individuals are assumed to have the same probability of having a disease. However, individuals often have different risks of positivity, so recent research has examined regression models that allow for heterogeneity among individuals within the population. This dissertation focuses on two problems involving group testing regression models.
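To make the cost reduction concrete, here is a minimal sketch of the expected number of tests per individual. It is an illustration only, under assumptions the abstract does not state: a two-stage protocol in which every member of a positive group is retested individually, a perfect assay, and a homogeneous population with common prevalence p. The function names are hypothetical.

```python
def expected_tests_per_individual(p, k):
    """Expected tests per individual for two-stage pooling with group size k.

    Assumptions (illustrative): perfect assay, independent individuals,
    common prevalence p, and individual retests for every positive group.
    """
    if k == 1:
        # A "group" of one is just individual testing.
        return 1.0
    # The group tests positive unless all k members are negative.
    prob_group_positive = 1.0 - (1.0 - p) ** k
    # One group test shared by k individuals, plus one retest per
    # individual whenever the group is positive.
    return 1.0 / k + prob_group_positive


def best_group_size(p, max_k=50):
    """Group size in 1..max_k that minimizes expected tests per individual."""
    return min(range(1, max_k + 1),
               key=lambda k: expected_tests_per_individual(p, k))


if __name__ == "__main__":
    for prevalence in (0.01, 0.05, 0.20):
        k = best_group_size(prevalence)
        cost = expected_tests_per_individual(prevalence, k)
        print(f"p={prevalence:.2f}: best k={k}, tests per individual={cost:.3f}")
```

Under these assumptions, a prevalence of 1% needs roughly one fifth as many tests as individual testing, and the advantage shrinks as prevalence rises. Heterogeneous risks, the setting this dissertation addresses, are not captured by this sketch.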

For the first problem, we examine group testing regression models when the positive or negative status of each individual is also identified. Identification requires additional tests, known as “retests,” beyond those performed on the initial groups. We show how regression models can be fit in this setting while also incorporating the extra information from these retests. Through Monte Carlo simulations, we present evidence that incorporating retesting information yields significant gains in efficiency. Furthermore, we demonstrate that some group testing protocols can actually lead to more efficient estimates than individual testing when diagnostic tests are imperfect. Finally, we show that halving and matrix testing protocols are the most efficient to use in practice.
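As background for the regression setting, the sketch below shows how a model for individual risk can be tied to an observed group result. It covers only the initial group tests and does not implement the retesting methods developed in this dissertation. It assumes a logistic link, independent individuals, and an assay with known sensitivity and specificity that do not depend on group size. All names (beta, se, sp, and the functions) are hypothetical.

```python
import math


def individual_prob(x, beta):
    """Logistic-model probability that one individual is truly positive.

    beta[0] is the intercept; beta[1:] pairs with the covariates in x.
    """
    eta = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-eta))


def group_positive_prob(group_x, beta, se=1.0, sp=1.0):
    """Probability that a group TESTS positive under an imperfect assay.

    Assumes the group is truly positive when at least one member is,
    with sensitivity se and specificity sp (illustrative assumptions).
    """
    prob_all_negative = math.prod(
        1.0 - individual_prob(x, beta) for x in group_x
    )
    # se * P(truly positive) + (1 - sp) * P(truly negative), simplified.
    return se - (se + sp - 1.0) * prob_all_negative


def log_likelihood(groups, outcomes, beta, se=1.0, sp=1.0):
    """Bernoulli log-likelihood of observed group results (1 = positive)."""
    total = 0.0
    for group_x, z in zip(groups, outcomes):
        theta = group_positive_prob(group_x, beta, se, sp)
        total += math.log(theta) if z == 1 else math.log(1.0 - theta)
    return total
```

Maximizing this log-likelihood over beta with a general-purpose optimizer would give estimates that use the initial group outcomes alone. Incorporating retest results requires a different likelihood, which this sketch does not attempt.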

For the second problem, we consider situations when individuals are tested in groups for multiple diseases simultaneously. This problem is important because assays frequently screen for more than one disease at a time. When these assays are used in a group testing setting, the individual positive/negative statuses consist of unobserved, correlated random variables. To estimate models in this setting, we develop an expectation-solution based algorithm that provides consistent parameter estimates and natural large-sample inference procedures.

Advisor: Christopher R. Bilder