USDA Agricultural Research Service -- Lincoln, Nebraska
Title
Using Comparative Genomics to Reorder the Human Genome Sequence into a Virtual Sheep Genome
Document Type
Article
Date of this Version
2007
Abstract
Background: Is it possible to construct an accurate and detailed subgene-level map of a genome
using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the
sequences of other genomes?
Results: A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were
determined and mapped with high sensitivity and low specificity onto the frameworks of the human,
dog, and cow genomes. To maximize genome coverage, the coordinates of all BAC end sequence
hits to the cow and dog genomes were also converted to the equivalent human genome
coordinates. The 84,624 sheep BACs (about 5.4-fold genome coverage) with paired ends in the
correct orientation (tail-to-tail) and spacing, combined with information from sheep BAC
comparative genome contigs (CGCs) built separately on the dog and cow genomes, were used to
construct 1,172 sheep BAC-CGCs, covering 91.2% of the human genome. Clustered non-tail-to-tail
and outsize BACs located close to the ends of many BAC-CGCs linked BAC-CGCs covering
about 70% of the genome to at least one other BAC-CGC on the same chromosome. Using the
BAC-CGCs, the intrachromosomal and interchromosomal BAC-CGC linkage information,
human/cow and vertebrate synteny, and the sheep marker map, a virtual sheep genome was constructed.
To identify BACs potentially located in gaps between BAC-CGCs, an additional set of 55,668
sheep BACs was positioned on the sheep genome with lower confidence. A coordinate conversion
process allowed us to transfer human genes and other genome features to the virtual sheep
genome to display on a sheep genome browser.
Conclusion: We demonstrate that limited sequencing of BACs combined with positioning on a
well-assembled genome and integrating locations from other, less well-assembled genomes can yield
extensive, detailed subgene-level maps of mammalian genomes for which genomic resources are
currently limited.
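
The paired-end criteria named in the Results (correct tail-to-tail orientation and spacing, versus non-tail-to-tail, outsize, and interchromosomal pairs) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `EndHit` record, the `classify_pair` function, and the `MIN_SPAN`/`MAX_SPAN` thresholds are hypothetical. The sketch also assumes that "tail-to-tail" means the two end reads face each other on the reference, with the lower-coordinate hit on the plus strand and the higher-coordinate hit on the minus strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndHit:
    """One BAC end sequence placed on a reference genome (hypothetical record)."""
    chrom: str
    pos: int     # start coordinate of the hit on the reference
    strand: str  # "+" or "-"


# Hypothetical spacing window in base pairs; the abstract gives no thresholds.
MIN_SPAN = 50_000
MAX_SPAN = 350_000


def classify_pair(a: EndHit, b: EndHit,
                  min_span: int = MIN_SPAN,
                  max_span: int = MAX_SPAN) -> str:
    """Classify the two end hits of a single BAC."""
    if a.chrom != b.chrom:
        return "interchromosomal"
    # Order the hits by coordinate so the result does not depend on argument order.
    left, right = sorted((a, b), key=lambda hit: hit.pos)
    # Assumed tail-to-tail layout: reads point toward each other.
    if not (left.strand == "+" and right.strand == "-"):
        return "non-tail-to-tail"
    span = right.pos - left.pos
    if span < min_span or span > max_span:
        return "outsize"
    return "concordant"
```

Under these assumptions, only pairs classified as `"concordant"` would correspond to the BACs used to build contigs, while the other classes correspond to the pairs the abstract describes as linkage evidence between BAC-CGCs.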

Comments
Published in Genome Biology 2007, 8:R152.