Modeling biological structures via abstract grammars to solve common problems in computational biology

David James Russell, University of Nebraska - Lincoln


Grammars are generally understood to be the set of rules that define the relationships between elements of a language. However, grammars can also be used to elucidate structural relationships within sequences constructed from any finite alphabet. In this work, abstract grammars are used to model the primary and secondary structures present in biological data. These grammar models are inferred and applied to efficiently solve various sequence analysis problems in computational biology, including multiple sequence alignment, fragment assembly, database redundancy removal, and structural prediction.

The primary structures, or sequential ordering of symbols, of biological data are first modeled with Lempel-Ziv (LZ) grammars. The results are used to construct a grammar-based sequence distance metric, which compares biological sequences by comparing their inferred grammars. This concept is applied to several problems in biological sequence analysis, including multiple sequence alignment and phylogenetic clustering. The higher-level secondary structures of biological sequences are then modeled via two novel grammar inference methods. The resulting context-free grammars are used to estimate structural pieces within biological sequences, which can in turn be used as supplemental information to help guide various sequence analysis algorithms. The use of this approach to develop algorithms for various sequence analysis tasks demonstrates the viability and versatility of using abstract grammars to model biological data.
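As an illustration of what a grammar-based distance over LZ parses could look like, the following is a minimal sketch and not the dissertation's own method. It assumes an LZ78-style dictionary parse and one possible normalization of phrase counts. The abstract does not give the actual metric, so the function names `lz78_phrases` and `lz_distance` and the formula itself are hypothetical choices made for this example.

```python
def lz78_phrases(seq):
    """Split seq into LZ78-style phrases.

    Each phrase is the shortest prefix of the remaining input that has
    not been seen as a phrase before. A trailing, already-seen remainder
    is emitted as a final phrase so the phrases always rebuild seq.
    """
    seen = set()
    phrases = []
    current = ""
    for symbol in seq:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        phrases.append(current)
    return phrases


def lz_complexity(seq):
    """Number of phrases in the LZ78-style parse of seq."""
    return len(lz78_phrases(seq))


def lz_distance(s, q):
    """Assumed normalized distance based on phrase counts.

    Idea: if q shares structure with s, parsing the concatenation s + q
    needs few phrases beyond those already needed for s alone.
    This formula is an illustrative assumption, not the thesis metric.
    """
    c_s, c_q = lz_complexity(s), lz_complexity(q)
    denom = max(c_s, c_q)
    if denom == 0:
        return 0.0  # both sequences are empty
    extra_sq = lz_complexity(s + q) - c_s
    extra_qs = lz_complexity(q + s) - c_q
    return max(extra_sq, extra_qs) / denom


if __name__ == "__main__":
    a = "ACGTACGTACGTACGT"
    b = "ACGTACGAACGTACGT"
    c = "TTGACCATGGCATCAG"
    print(lz78_phrases(a))
    print(lz_distance(a, b), lz_distance(a, c))
```

A pairwise matrix of such distances could then be passed to a standard clustering routine, which is the general shape of the phylogenetic clustering application mentioned in the abstract. Note that this particular formula does not return zero for identical sequences, so it is a dissimilarity score and not a metric in the strict sense.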

Subject Area

Biology, Bioinformatics; Computer Science

Recommended Citation

Russell, David James, "Modeling biological structures via abstract grammars to solve common problems in computational biology" (2010). ETD collection for University of Nebraska - Lincoln. AAI3428272.