Computer Science and Engineering, Department of



A THESIS Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Master of Science, Major: Computer Science, Under the Supervision of Professors Peter Z. Revesz and Jean-Jack M. Riethoven. Lincoln, Nebraska: April, 2011
Copyright 2011 Dipty Singh


DNA molecules, packaged in structures called chromosomes within the cells of living organisms, encode hereditary information that is passed on to offspring. Through transcription and translation, the genes within these DNA molecules direct protein synthesis. Chromosomal DNA thus serves as a blueprint for the chemical processes of life.

To analyze a DNA sequence with currently available technology, we have to cut it into small fragments, e.g. by using restriction enzymes. Applying different restriction enzymes to multiple copies of the same DNA sequence generates many overlapping fragments. To reconstruct the original DNA, these fragments need to be sequenced and assembled. The problem of finding the original order of the fragments is called the genome map assembly problem.
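The way different enzymes yield overlapping fragments of the same sequence can be illustrated with a minimal Python sketch. It is a toy model and not the method of this thesis: the sequence `dna` is made up, the molecule is treated as linear, and the hypothetical helper `digest` cuts immediately before each recognition site, whereas real enzymes cut at a fixed position within or near the site.

```python
def cut_positions(sequence, site):
    """Return every index at which `site` starts in `sequence`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def digest(sequence, site):
    """Toy digestion (assumption): cut a linear sequence just before each site."""
    cuts = [p for p in cut_positions(sequence, site) if p > 0]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# Made-up 24-base sequence containing one EcoRI site (GAATTC)
# and one BamHI site (GGATCC).
dna = "ATGAATTCCGTAGGATCCTTGAAT"

fragments_a = digest(dna, "GAATTC")  # copy 1, cut at the EcoRI site
fragments_b = digest(dna, "GGATCC")  # copy 2, cut at the BamHI site

print(fragments_a)
print(fragments_b)
```

Each digest covers the whole sequence, but the cut points differ, so a fragment from one digest overlaps fragments from the other. These overlaps are the information an assembly method uses to recover the original order.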

This research proposes a constraint automaton solution to the genome map assembly problem for both error-prone and error-free data. The plasmid vectors pUC57, pKLAC1-malE and pTXB1 and the phage vector Adenovirus 2, with sizes of 2710, 6706, 10153 and 35937 base pairs respectively, were used to show that the computational time for solving the genome map assembly problem with the constraint automaton solution is linear for both precise and approximate data.