Statistics, Department of


Date of this Version



Montesinos-Lopez OA, Montesinos-Lopez A, Crossa J, Eskridge K (2012) Sample Size under Inverse Negative Binomial Group Testing for Accuracy in Parameter Estimation. PLoS ONE 7(3): e32250. doi:10.1371/journal.pone.0032250


Copyright 2012 Montesinos-Lo´pez et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License. Used by permission.


Background:The group testing method has been proposed for the detection and estimation of genetically modified plants (adventitious presence of unwanted transgenic plants, AP). For binary response variables (presence or absence), group testing is efficient when the prevalence is low, so that estimation, detection, and sample size methods have been developed under the binomial model. However, when the event is rare (low prevalence

Methodology/Principal Findings: This research proposes three sample size procedures (two computational and one analytic) for estimating prevalence using group testing under inverse (negative) binomial sampling. These methods provide the required number of positive pools (rm), given a pool size (k), for estimating the proportion of AP plants using the Dorfman model and inverse (negative) binomial sampling. We give real and simulated examples to show how to apply these methods and the proposed sample-size formula. The Monte Carlo method was used to study the coverage and level of assurance achieved by the proposed sample sizes. An R program to create other scenarios is given in Appendix S2.

Conclusions: The three methods ensure precision in the estimated proportion of AP because they guarantee that the width (W) of the confidence interval (CI) will be equal to, or narrower than, the desired width (v), with a probability of c. With the Monte Carlo study we found that the computational Wald procedure (method 2) produces the more precise sample size (with coverage and assurance levels very close to nominal values) and that the samples size based on the Clopper-Pearson CI (method 1) is conservative (overestimates the sample size); the analytic Wald sample size method we developed (method 3) sometimes underestimated the optimum number of pools.