U.S. Department of Commerce


Published in Marine Ecology 32 (2011) 404–418; doi:10.1111/j.1439-0485.2011.00466.x


Despite more than half a century of criticism, significance testing continues to be used commonly by ecologists. Significance tests are widely misused and misunderstood, and even when properly used, they are not very informative for most ecological data. Problems of misuse and misinterpretation include: (i) invalid logic; (ii) rote use; (iii) equating statistical significance with biological importance; (iv) regarding the P-value as the probability that the null hypothesis is true; (v) regarding the P-value as a measure of effect size; and (vi) regarding the P-value as a measure of evidence. Significance tests are poorly suited for inference because they pose the wrong question. In addition, most null hypotheses in ecology are point hypotheses already known to be false, so whether they are rejected or not provides little additional understanding. Ecological data rarely fit the controlled experimental setting for which significance tests were developed. More satisfactory methods of inference assess the degree of support that data provide for hypotheses, measured in terms of information theory (model-based inference), likelihood ratios (likelihood inference) or probability (Bayesian inference). Modern statistical methods allow multiple data sets to be combined into a single likelihood framework, avoiding the loss of information that can occur when data are analyzed in separate steps. Inference based on significance testing is compared with model-based, likelihood and Bayesian inference using data on an endangered porpoise, Phocoena sinus. All of the alternatives lead to greater understanding and better inference than a P-value and the associated statement of statistical significance provide.