In the genetic code, the UGA codon has a dual function: it encodes selenocysteine (Sec) and also serves as a stop signal. However, gene annotation programs recognize only its translation terminator function, which results in misannotation of selenoprotein genes. Here, we applied two independent bioinformatics approaches to characterize the set of selenoproteins in prokaryotic genomes. The first approach searched for selenoprotein genes by identifying selenocysteine insertion sequence (SECIS) elements, which are RNA stem–loop structures; the second identified Sec/Cys pairs in homologous sequences. These analyses identified all or almost all selenoproteins in completely sequenced bacterial and archaeal genomes and provided a view of the distribution and composition of prokaryotic selenoproteomes. In addition, lineage-specific and core selenoproteins were detected, which provided insights into the mechanisms of selenoprotein evolution. Characterization of selenoproteomes allows the remaining UGA codons in completed prokaryotic genomes to be interpreted as terminators, thereby addressing the UGA dual-function problem.
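As a rough illustration of the second approach (Sec/Cys pairs in homologous sequences), the following minimal Python sketch translates a coding sequence while reading UGA (TGA in DNA) as Sec, written `U`, and then reports the alignment columns where Sec in one sequence faces Cys in the other. It is a toy example and not the pipeline used in the study: the function names `translate_readthrough` and `sec_cys_pairs`, the example sequences, and the assumption that the two protein sequences are already aligned (gaps written as `-`) are all hypothetical.

```python
# Standard genetic code in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_readthrough(dna: str) -> str:
    """Translate a DNA coding sequence, reading TGA as Sec ('U').

    Translation stops at the first TAA or TAG codon. Treating every
    in-frame TGA as Sec is an assumption made for this sketch only.
    """
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        codon = dna[pos:pos + 3]
        if codon == "TGA":
            protein.append("U")  # candidate selenocysteine
            continue
        residue = CODON_TABLE.get(codon, "X")  # 'X' for ambiguous codons
        if residue == "*":
            break  # TAA or TAG: genuine terminator
        protein.append(residue)
    return "".join(protein)


def sec_cys_pairs(aligned_a: str, aligned_b: str) -> list[int]:
    """Return alignment columns where Sec ('U') is paired with Cys ('C').

    Both inputs must be rows of the same pairwise alignment, so they
    must have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return [
        col
        for col, pair in enumerate(zip(aligned_a, aligned_b))
        if set(pair) == {"U", "C"}
    ]


# Hypothetical example: an ORF with an in-frame TGA and a Cys homolog.
candidate = translate_readthrough("ATGGCTTGAGGTTAA")
homolog = "MACG"
pairs = sec_cys_pairs(candidate, homolog)
```

A column reported by `sec_cys_pairs` is only a hint that the UGA may encode Sec. A real analysis would also need a proper alignment step and supporting evidence, such as a downstream SECIS element.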