A distance measure for DNA sequences

Mark Timothy Bauer, University of Nebraska - Lincoln


In this dissertation I investigate how the Average Mutual Information (AMI) profile can be used to provide a useful distance measure between two DNA sequences. This distance measure has the advantage over other measures of being only minimally affected by errors in the genomic sequence data. It is shown to be an effective method for classifying sequences into groups and for identifying an unknown sequence as belonging to a particular group. This is demonstrated using several examples of chromosomes from different species as well as chromosomes from the same species. The distance measure is then used to generate two-dimensional and three-dimensional mappings that visualize relationships between sequences, and to generate UPGMA trees showing possible genetic relations between groups of sequences.
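
To make the idea concrete, the following is a minimal Python sketch of an Average Mutual Information profile and a distance between two such profiles. It is an illustration under stated assumptions, not the dissertation's implementation: the abstract does not specify the lag range, the probability estimates, or the form of the distance, so the function names `ami_profile` and `profile_distance`, the default `max_lag=16`, the use of marginals taken from the lagged pair counts, and the choice of Euclidean distance are all hypothetical.

```python
import math
from collections import Counter


def ami_profile(seq, max_lag=16):
    """Return [I(1), ..., I(max_lag)], the mutual information in bits
    between symbols that are k positions apart in seq.

    Assumption: marginal probabilities are estimated from the same
    lagged pairs as the joint probabilities. Every distinct character
    (including ambiguity codes such as N) is treated as its own symbol.
    """
    seq = seq.upper()
    profile = []
    for k in range(1, max_lag + 1):
        # Count co-occurrences of (symbol at i, symbol at i + k).
        pairs = Counter(zip(seq, seq[k:]))
        total = sum(pairs.values())
        if total == 0:
            # Sequence is too short for this lag.
            profile.append(0.0)
            continue
        left, right = Counter(), Counter()
        for (x, y), count in pairs.items():
            left[x] += count
            right[y] += count
        # I(k) = sum over (x, y) of p(x, y) * log2(p(x, y) / (p(x) p(y))).
        mi = 0.0
        for (x, y), count in pairs.items():
            mi += (count / total) * math.log2(
                count * total / (left[x] * right[y])
            )
        profile.append(mi)
    return profile


def profile_distance(profile_a, profile_b):
    """Euclidean distance between two equal-length profiles.

    Assumption: the dissertation's actual distance may be defined
    differently; Euclidean distance is used here only as a stand-in.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


if __name__ == "__main__":
    periodic = ami_profile("ACGT" * 50, max_lag=8)
    uniform = ami_profile("A" * 200, max_lag=8)
    print(profile_distance(periodic, uniform))
```

A matrix of such pairwise distances is the kind of input that multidimensional scaling (for two- or three-dimensional mappings) and UPGMA clustering take.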

Subject Area

Biology, Genetics

