Date of this Version
Lindstrom T, Grear DA, Buhnerkempe M, Webb CT, Miller RS, et al. (2013) A Bayesian Approach for Modeling Cattle Movements in the United States: Scaling up a Partially Observed Network. PLoS ONE 8(1): e53432. doi:10.1371/journal.pone.0053432
Networks are rarely completely observed and prediction of unobserved edges is an important problem, especially in disease spread modeling where networks are used to represent the pattern of contacts. We focus on a partially observed cattle movement network in the U.S. and present a method for scaling up to a full network based on Bayesian inference, with the aim of informing epidemic disease spread models in the United States. The observed network is a 10% state stratified sample of Interstate Certificates of Veterinary Inspection that are required for interstate movement; describing approximately 20,000 movements from 47 of the contiguous states, with origins and destinations aggregated at the county level. We address how to scale up the 10% sample and predict unobserved intrastate movements based on observed movement distances. Edge prediction based on a distance kernel is not straightforward because the probability of movement does not always decline monotonically with distance due to underlying industry infrastructure. Hence, we propose a spatially explicit model where the probability of movement depends on distance, number of premises per county and historical imports of animals. Our model performs well in recapturing overall metrics of the observed network at the node level (U.S. counties), including degree centrality and betweenness; and performs better compared to randomized networks. Kernel generated movement networks also recapture observed global network metrics, including network size, transitivity, reciprocity, and assortativity better than randomized networks. In addition, predicted movements are similar to observed when aggregated at the state level (a broader geographic level relevant for policy) and are concentrated around states where key infrastructures, such as feedlots, are common. We conclude that the method generally performs well in predicting both coarse geographical patterns and network structure and is a promising method to generate full networks that incorporate the uncertainty of sampled and unobserved contacts.