Center for Plant Science Innovation


Genome Biology and Evolution Advance Access published April 25, 2012 doi:10.1093/gbe/evs042


Copyright the Authors 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License


Despite intense investigation for over 25 years, the in vivo structure of plant mitochondrial genomes remains uncertain. Mapping studies and genome sequencing generally produce large circular chromosomes, whereas electrophoretic and microscopic studies typically reveal linear and multi-branched molecules. To more fully assess the structure of plant mitochondrial genomes, the complete sequence of the monkeyflower (Mimulus guttatus DC. line IM62) mtDNA was constructed from a large (35 kb) paired-end shotgun sequencing library to a high depth of coverage (~30x). The complete genome maps as a 525,671 bp circular molecule and exhibits a fairly conventional set of features including 62 genes (encoding 35 proteins, 24 tRNAs, 3 rRNAs), 22 introns, 3 large repeats (2.7, 9.6, 29 kb), and 96 small repeats (40–293 bp). Most paired-end reads (71%) mapped to the consensus sequence at the expected distance and orientation across the entire genome, validating the accuracy of assembly. Another 10% of reads provided clear evidence of alternative genomic conformations due to apparent rearrangements across large repeats. Quantitative assessment of these repeat-spanning read pairs revealed that all large repeat arrangements are present at appreciable frequencies in vivo, although not always in equimolar amounts. The observed stoichiometric differences for some arrangements are inconsistent with a predominant master circular structure for the mitochondrial genome of M. guttatus IM62. Finally, because IM62 contains a cryptic cytoplasmic male-sterility (CMS) system, an in silico search for potential CMS genes was undertaken. The three chimeric ORFs identified in this study, in addition to the previously identified ORFs upstream of the nad6 gene, are the most likely CMS candidate genes in this line.
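The paired-end validation described above can be illustrated with a minimal sketch. It assumes each read pair has already been mapped to the circular consensus and is reduced to two coordinates and two strands. The genome length (525,671 bp) and nominal insert size (35 kb) come from the abstract; the tolerance, the function names, and the example pairs are hypothetical and are not taken from the study.

```python
GENOME_LEN = 525_671      # circular map length reported in the abstract (bp)
EXPECTED_INSERT = 35_000  # nominal library insert size from the abstract (bp)
TOLERANCE = 5_000         # hypothetical tolerance; not stated in the source


def circular_span(start, end, genome_len=GENOME_LEN):
    """Distance covered walking forward from start to end on a circular map."""
    return (end - start) % genome_len


def is_concordant(fwd_pos, rev_pos, fwd_strand, rev_strand,
                  genome_len=GENOME_LEN, expected=EXPECTED_INSERT,
                  tol=TOLERANCE):
    """True if the mates face each other and lie about `expected` bp apart."""
    if (fwd_strand, rev_strand) != ("+", "-"):
        return False  # unexpected orientation
    span = circular_span(fwd_pos, rev_pos, genome_len)
    return abs(span - expected) <= tol


def concordant_fraction(pairs):
    """Fraction of read pairs mapping at the expected distance and orientation."""
    if not pairs:
        return 0.0
    return sum(is_concordant(*pair) for pair in pairs) / len(pairs)


# Made-up pairs for illustration only.
example_pairs = [
    (1_000, 36_000, "+", "-"),    # expected distance and orientation
    (520_000, 29_329, "+", "-"),  # wraps around the origin of the circular map
    (1_000, 200_000, "+", "-"),   # mates too far apart
    (1_000, 36_000, "+", "+"),    # mates in the same orientation
]
print(concordant_fraction(example_pairs))  # 0.5
```

In this sketch, pairs that fail the check are only candidates for alternative conformations. Attributing them to rearrangement across a specific large repeat would additionally require the repeat coordinates, which the abstract does not give.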

Includes Supplementary Information.