US Geological Survey


Statistical Methodology 17 (2014) 67–81


This article is a U.S. government work, and is not subject to copyright in the United States.


Determining appropriate statistical distributions for modeling animal count data is important for accurate estimation of abundance, distribution, and trends. In the case of sea ducks along the U.S. Atlantic coast, managers want to estimate local and regional abundance to detect and track population declines, to define areas of high and low use, and to predict the impact of future habitat change on populations. In this paper, we used a modified marked point process to model survey data that recorded flock sizes of Common eiders, Long-tailed ducks, and Black, Surf, and White-winged scoters. The data come from an experimental aerial survey, conducted by the United States Fish & Wildlife Service (USFWS) Division of Migratory Bird Management, during which east-west transects were flown along the Atlantic Coast from Maine to Florida during the winters of 2009–2011. To model the number of flocks per transect (the points), we compared the fit of four statistical distributions (zero-inflated Poisson, zero-inflated geometric, zero-inflated negative binomial and negative binomial) to data on the number of species-specific sea duck flocks that were recorded for each transect flown. To model the flock sizes (the marks), we compared the fit of seven statistical distributions to the flock size data for each species: positive Poisson, positive negative binomial, positive geometric, logarithmic, discretized lognormal, zeta and Yule–Simon. Akaike’s Information Criterion and Vuong’s closeness tests indicated that, for all species, the negative binomial was the best-fitting distribution for the points and the discretized lognormal was the best-fitting distribution for the marks.
These findings have important implications for estimating sea duck abundances as the discretized lognormal is a more skewed distribution than the Poisson and negative binomial, which are frequently used to model avian counts; the lognormal is also less heavy-tailed than the power law distributions (e.g., zeta and Yule–Simon), which are becoming increasingly popular for group size modeling. Choosing appropriate statistical distributions for modeling flock size data is fundamental to accurately estimating population summaries, determining required survey effort, and assessing and propagating uncertainty through decision-making processes.
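
The comparison of candidate distributions for the marks can be illustrated with a minimal sketch. It is not the authors' code and uses no survey data: flock sizes are simulated, the discretization P(X = k) = P(k - 1 < Y <= k) for lognormal Y is an assumed definition that may differ from the one used in the paper, and a coarse grid search stands in for a proper numerical maximum-likelihood routine. Only two of the seven candidates (positive geometric and discretized lognormal) are compared, using Akaike's Information Criterion.

```python
import math
import random
from collections import Counter


def lognormal_sf(x, mu, sigma):
    """Survival function P(Y > x) of a lognormal(mu, sigma) variable."""
    if x <= 0:
        return 1.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def dlnorm_pmf(k, mu, sigma):
    """Assumed discretization: P(X = k) = P(k - 1 < Y <= k), k = 1, 2, ..."""
    return lognormal_sf(k - 1, mu, sigma) - lognormal_sf(k, mu, sigma)


def loglik_dlnorm(counts, mu, sigma):
    """Log-likelihood from a {flock size: frequency} table."""
    return sum(n * math.log(max(dlnorm_pmf(k, mu, sigma), 1e-300))
               for k, n in counts.items())


def fit_dlnorm_grid(counts):
    """Coarse grid search, an approximation to the maximum-likelihood fit."""
    best = (-math.inf, None, None)
    for i in range(0, 41):            # mu from 0.0 to 4.0
        for j in range(2, 31):        # sigma from 0.2 to 3.0
            mu, sigma = 0.1 * i, 0.1 * j
            ll = loglik_dlnorm(counts, mu, sigma)
            if ll > best[0]:
                best = (ll, mu, sigma)
    return best


def fit_positive_geometric(counts):
    """P(X = k) = p (1 - p)^(k - 1), k >= 1; the MLE is p = 1 / mean.

    Assumes at least one flock is larger than 1, so that p < 1.
    """
    n = sum(counts.values())
    total = sum(k * c for k, c in counts.items())
    p = n / total
    ll = sum(c * (math.log(p) + (k - 1) * math.log1p(-p))
             for k, c in counts.items())
    return ll, p


def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik


# Simulated flock sizes (hypothetical parameters, not estimates from the survey).
rng = random.Random(1)
sizes = [math.ceil(rng.lognormvariate(2.0, 1.5)) for _ in range(1000)]
counts = Counter(sizes)

ll_dln, mu_hat, sigma_hat = fit_dlnorm_grid(counts)
ll_geom, p_hat = fit_positive_geometric(counts)
aic_dln = aic(ll_dln, 2)     # two parameters: mu, sigma
aic_geom = aic(ll_geom, 1)   # one parameter: p

print(f"discretized lognormal AIC: {aic_dln:.1f}")
print(f"positive geometric AIC:    {aic_geom:.1f}")
```

The lower AIC indicates the better-supported candidate. Because the data here are simulated from the discretized lognormal itself, the sketch only demonstrates the mechanics of the comparison and says nothing about which distribution suits real flock size data.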