Information-theoretic mass spectral library search for comprehensive two-dimensional gas chromatography with mass spectrometry
Abstract
Comprehensive Two-Dimensional Gas Chromatography with Mass Spectrometry (GCxGC-MS) combines two techniques providing increased separation capacity and enhanced capability for chemical identification. One of the most important methods for chemical identification is library search, which searches for an unknown mass spectrum in a library of known mass spectra to produce a list of potential matches ordered by match quality. Applications of compound identification include environmental monitoring, forensics, security, food and medicine. This dissertation presents a new information-theoretic mass spectral library search technique for compound identification in GCxGC-MS and other MS applications. The method is based on a similarity measure between an unknown spectrum and a library spectrum involving the probability distribution functions of the intensities in the library and the noise in the data. The new method characterizes the library with an array of probability distribution functions of intensities as a function of mass-to-charge ratio. Each probability in the distribution function characterizes the fraction of spectra in the library having that intensity value at the given mass-to-charge ratio. The instrument noise is modelled with parameters estimated by statistically analyzing within individual GCxGC-MS peaks the intensity variations at each mass-to-charge ratio. Experimental results demonstrate the effectiveness and robustness of the new information-theoretic mass spectral library search technique. In simulation experiments, random spectra from the NIST/EPA/NIH Mass Spectral Library were corrupted with synthetic noise to generate random test spectra. Then, the corrupted spectra were submitted as unknowns for the library search using different search techniques. 
Experiments evaluated search performance with additive signal-independent noise, signal-dependent noise, (Johnson) colored noise, and spectral noise (from another spectrum selected randomly from the library). Other experiments evaluated search performance for real GCxGC-MS data. Search techniques were evaluated for many trials under each experimental condition by the Average Rank of the correct match in the ordered list of potential matches returned by the respective search techniques. The new information-theoretic mass spectral library search technique performs better than NIST MS Search and Probability Based Matching (PBM) for all noise models; that is, the new search technique ranked the correct spectrum higher in the ordered list of potential matches than NIST MS Search and PBM for all noise models. In experiments with real data from GCxGC-Time-of-Flight-MS instruments and GCxGC-Quadrupole-MS instruments, the noise parameters were estimated by statistical analysis of mass spectral variations in multiple spectra of GCxGC-MS peaks and the weighted mean spectra of the peaks (added to the library as the correct match). In the experiments with real data, the information-theoretic mass spectral library search technique worked better than NIST MS Search and PBM in most cases.
Keywords
information theory, library search, compound identification, mass spectrum, noise model, similarity measure
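As a rough illustration of two ideas in the abstract above, the following sketch builds the per-m/z intensity distributions (the fraction of library spectra having each intensity level at each mass-to-charge ratio) and computes the Average Rank of the correct match under additive signal-independent noise. It is not the dissertation's implementation. The toy library, the quantization to four intensity levels, the function names, and the negative squared distance used as a stand-in similarity are all assumptions made for this example. The information-theoretic similarity measure itself is not specified in the abstract and is not reproduced here.

```python
import random
from collections import Counter

# Toy library (assumption): 4 spectra, 4 m/z channels, intensities quantized to levels 0..3.
library = [
    [0, 3, 1, 0],
    [0, 3, 0, 2],
    [1, 0, 0, 2],
    [0, 1, 0, 2],
]


def intensity_pmfs(spectra, n_levels):
    """Return pmfs[mz][level]: fraction of spectra with that intensity level at that m/z."""
    n_spectra = len(spectra)
    pmfs = []
    for mz in range(len(spectra[0])):
        counts = Counter(spectrum[mz] for spectrum in spectra)
        pmfs.append([counts.get(level, 0) / n_spectra for level in range(n_levels)])
    return pmfs


def neg_sq_distance(a, b):
    """Stand-in similarity (assumption): higher is better, 0 means identical."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def add_noise(spectrum, sigma, rng, n_levels):
    """Add signal-independent Gaussian noise, then round and clip to valid levels."""
    noisy = []
    for value in spectrum:
        level = round(value + rng.gauss(0.0, sigma))
        noisy.append(min(max(level, 0), n_levels - 1))
    return noisy


def rank_of_correct(unknown, spectra, correct_index, similarity):
    """Rank of the correct spectrum: 1 plus the number of spectra scoring strictly higher."""
    scores = [similarity(unknown, s) for s in spectra]
    target = scores[correct_index]
    return 1 + sum(1 for score in scores if score > target)


def average_rank(spectra, similarity, sigma, trials, seed, n_levels=4):
    """Mean rank of the correct match over randomly chosen, noise-corrupted spectra."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        index = rng.randrange(len(spectra))
        unknown = add_noise(spectra[index], sigma, rng, n_levels)
        total += rank_of_correct(unknown, spectra, index, similarity)
    return total / trials


pmfs = intensity_pmfs(library, n_levels=4)
print(pmfs[0])  # distribution of intensity levels at the first m/z channel
print(average_rank(library, neg_sq_distance, sigma=1.0, trials=100, seed=0))
```

An Average Rank of 1.0 means the correct spectrum was always listed first; larger values mean it was pushed down the list by noise. Ties are counted in favor of the correct spectrum here, which is one of several reasonable conventions.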
Subject Area
Organic chemistry; Computer science
Recommended Citation
Visvanathan, Arvind, "Information-theoretic mass spectral library search for comprehensive two-dimensional gas chromatography with mass spectrometry" (2008). ETD collection for University of Nebraska-Lincoln. AAI3315058.
https://digitalcommons.unl.edu/dissertations/AAI3315058