Computing, School of

First Advisor

Stephen E. Reichenbach

Date of this Version

8-2011

Document Type

Dissertation

Comments

A dissertation presented to the faculty of the Graduate College at the University of Nebraska in partial fulfillment of requirements for the degree of Doctor of Philosophy

Major: Computer Science

Under the supervision of Professor Stephen E. Reichenbach. Lincoln, Nebraska, August 2011

Abstract

Mass spectra contain characteristic information regarding the molecular structure and properties of compounds. The mass spectra of compounds from the same chemically related group are similar. Classification is one of the fundamental methodologies for analyzing mass spectral data. The primary goals of classification are to automatically group compounds based on their mass spectra, to find correlations between the properties of compounds and their mass spectra, and to provide positive identification of unknown compounds.

This dissertation presents a new algorithm for the classification of mass spectra, the most similar neighbor with a probability-based spectrum similarity measure (MSN-PSSM). Experimental results demonstrate the effectiveness and robustness of the new MSN-PSSM algorithm. In leave-one-out cross-validation, it outperforms popular techniques for classification of mass spectra, such as principal component analysis with discriminant function analysis, soft independent modeling of class analogy, and decision tree learning.
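The general shape of a most-similar-neighbor classifier evaluated by leave-one-out cross-validation can be sketched as follows. This is only an illustrative sketch, not the MSN-PSSM algorithm itself: the abstract does not define the probability-based spectrum similarity measure, so cosine similarity is used here as a stand-in, and the function names and the toy intensity vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Stand-in similarity between two intensity vectors.

    Assumption: this is NOT the dissertation's probability-based measure,
    only a placeholder so the sketch can run.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_neighbor(query, library, similarity=cosine_similarity):
    """Return the class label of the library spectrum most similar to query."""
    best_label, best_score = None, float("-inf")
    for spectrum, label in library:
        score = similarity(query, spectrum)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def leave_one_out_accuracy(dataset, similarity=cosine_similarity):
    """Classify each spectrum against all the others; return the hit rate."""
    correct = 0
    for i, (spectrum, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]  # hold out spectrum i
        if most_similar_neighbor(spectrum, rest, similarity) == label:
            correct += 1
    return correct / len(dataset)


# Hypothetical toy data: (intensity vector over four m/z bins, class label).
toy_spectra = [
    ([100, 5, 0, 20], "A"),
    ([90, 10, 0, 25], "A"),
    ([0, 80, 60, 5], "B"),
    ([5, 70, 65, 0], "B"),
]

if __name__ == "__main__":
    print(most_similar_neighbor([95, 8, 0, 22], toy_spectra))
    print(leave_one_out_accuracy(toy_spectra))
```

Passing the similarity function as a parameter keeps the neighbor search separate from the measure, so a different spectrum similarity could be substituted without changing the evaluation loop.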

Comprehensive two-dimensional chromatography yields highly informative separation patterns because applying two different separation principles gives it great practical peak capacity and sensitivity. However, the resulting data are complex and require comprehensive analysis to interpret them and to extract the information useful for characterizing sample composition.

This dissertation presents a new non-targeted cross-sample classification method for analyzing comprehensive two-dimensional chromatograms. Experimental results validate the effectiveness of the method, which is successfully applied to a set of comprehensive two-dimensional chromatograms of breast cancer tumor samples. The feature vectors it generates are useful for discriminating between breast cancer tumor samples of different grades and for providing information to identify potential biomarkers for closer examination.

Included in

Computer Engineering Commons, Computer Sciences Commons

School of Computing: Dissertations, Theses, and Student Research

Classification for Mass Spectra and Comprehensive Two-dimensional Chromatograms
