Off-campus UNL users: To download campus access dissertations, please use the following link to log into our proxy server with your NU ID and password. When you are done browsing please remember to return to this page and log out.

Non-UNL users: Please talk to your librarian about requesting this dissertation through interlibrary loan.

Sequence comparison and stochastic model based on multiorder Markov models

Xiang Fang, University of Nebraska - Lincoln

Abstract

This dissertation presents two statistical methodologies developed on multi-order Markov models. First, we introduce an alignment-free sequence comparison method, which represents a sequence using a multi-order transition matrix (MTM). The MTM contains information of multi-order dependencies and provides a comprehensive representation of the heterogeneous composition within a sequence. Based on the MTM, a distance measure is developed for pair-wise comparison of sequences. The new method is compared with the traditional maximum likelihood (ML) method, the complete composition vector (CCV) method and the improved version of the complete composition vector (ICCV) method using simulated sequences. We further illustrate the application of the MTM method using two real data sets, influenza A virus hemagglutinin gene sequence and complete mitochondrial genome sequences. We then present a stochastic model named Multi-Order Markov Model under Hidden States (MMMHS) for representing heterogeneous sequences. MMMHS is similar to the conventional Hidden Markov Model (HMM) and Double Chain Markov Model (DCMM) in terms of using hidden states to describe the non-homogeneity of a sequence, but it provides a more flexible dependency structure by changing the order of Markov dependency under different hidden states. We extend the forward-backward procedure to MMMHS and provide the complete model estimation procedure based on Expectation-Maximization (EM) algorithm. The method is then illustrated with applications on several real data sets, and the results are compared with that of traditional methods.

Subject Area

Statistics

Recommended Citation

Fang, Xiang, "Sequence comparison and stochastic model based on multiorder Markov models" (2009). ETD collection for University of Nebraska-Lincoln. AAI3386580.
https://digitalcommons.unl.edu/dissertations/AAI3386580

Share

COinS