U.S. Department of Defense


Date of this Version



LNCS 4848, pp. 146–151, 2008


© Springer-Verlag Berlin Heidelberg 2008

This document is a U.S. government work and is not subject to copyright in the United States.


DNA codes consisting of DNA sequences are necessary for DNA computing. The minimum distance parameter of such codes is a measure of how dissimilar the codewords are, and thus is indirectly a measure of the likelihood of undetectable or uncorrectable errors occurring during hybridization. To compute distance, an abstract metric, for example one based on the longest common subsequence, must be used to model the actual bonding energies of DNA strands. In this paper we continue the development [1,2,3] of similarity functions for q-ary n-sequences that can be used (for q = 4) to model a thermodynamic similarity on DNA sequences. The theoretical lower bound on the maximal possible size of codes, built on the space endowed with this metric, is obtained. We introduce the concept of a stem similarity function and discuss DNA codes [2] based on the stem similarity. We suggest an optimal construction [2] and obtain random coding bounds on the maximum size and rate for such codes.
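
As an illustration of the minimum distance parameter mentioned above, the following is a minimal sketch that uses the longest common subsequence as the similarity between two codewords. It does not implement the stem similarity introduced in the paper. The function names `lcs_length` and `min_distance` are hypothetical, and the convention that distance equals n minus the similarity is an assumption made here for illustration.

```python
from itertools import combinations


def lcs_length(x: str, y: str) -> int:
    """Length of the longest common subsequence of x and y (row-by-row DP)."""
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                # Extend the common subsequence ending before both symbols.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop the last symbol of x or of y, whichever keeps more.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def min_distance(code: list[str]) -> int:
    """Minimum over codeword pairs of n - LCS (assumed distance convention)."""
    n = len(code[0])
    if any(len(word) != n for word in code):
        raise ValueError("all codewords must have the same length n")
    return min(n - lcs_length(x, y) for x, y in combinations(code, 2))


# Example with q = 4 (alphabet A, C, G, T) and n = 4.
example_code = ["ACGT", "AGCT", "TTTT"]
print(min_distance(example_code))
```

In this sketch a larger minimum distance means that every pair of codewords shares only short common subsequences, which is the sense in which the abstract describes the codewords as dissimilar.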