Computing, School of

First Advisor

Stephen D. Scott

Second Advisor

Jitender S. Deogun

Date of this Version

12-2016

Document Type

Dissertation

Comments

A dissertation presented to the faculty of the Graduate College at the University of Nebraska in partial fulfillment of requirements for the degree of Doctor of Philosophy

Major: Computer Science

Under the supervision of Professors Stephen D. Scott and Jitender S. Deogun. Lincoln, Nebraska, December 2016

Abstract

We address the problem of de novo motif identification: given a set of DNA sequences, we try to identify motifs in the dataset without any prior knowledge of whether motifs exist in it. We propose a method based on Probabilistic Suffix Trees (PSTs) to identify fixed-length motifs in a given set of DNA sequences. Our experiments show that our approach successfully discovers true motifs. We compared our method with the popular MEME algorithm and observed that our method detects a larger number of correct and statistically significant motifs than MEME. Our method is highly efficient compared with MEME when finding motifs in datasets of 1000 or more sequences. We applied our method to sequences of mutant strains of Exophiala dermatitidis and successfully identified motifs that revealed several transcription factor binding sites. This information is important to biologists performing experiments to understand the role of these binding sites in the different regulatory pathways affected by cdc42. We also show that our PST approach to de novo motif discovery can be used successfully to identify motifs in ChIP-Seq datasets. These motifs in turn identify binding sites for proteins in the sequences.

Included in

Computer Engineering Commons

School of Computing: Dissertations, Theses, and Student Research

Finding DNA Motifs: A Probabilistic Suffix Tree Approach