Chemistry, Department of

 

Date of this Version

10-15-2015

Citation

Worley, Bradley. Chemometric and Bioinformatic Analyses of Cellular Biochemistry. PhD dissertation, University of Nebraska, October 2015.

Comments

A Dissertation Presented to the Faculty of The Graduate College at the University of Nebraska In Partial Fulfillment of Requirements For the Degree of Doctor of Philosophy, Major: Chemistry, Under the Supervision of Professor Robert Powers. Lincoln, Nebraska, October 2015

Copyright (c) 2015 Bradley Worley

Abstract

The amount of information collected and analyzed in biochemical and bioanalytical research has exploded over the last few decades, due in large part to the increasing availability of analytical instrumentation that yields information-rich spectra. Datasets from nuclear magnetic resonance (NMR), mass spectrometry (MS), infrared (IR), or Raman spectroscopy may easily carry tens to hundreds of thousands of potentially correlated variables observed from only a few samples, making the application of classical statistical methods inappropriate, if not impossible. Drawing useful biochemical conclusions from these unique sources of data requires the use of specialized multivariate data handling techniques.

Unfortunately, proper implementation of many new multivariate algorithms requires domain knowledge in mathematics, statistics, digital signal processing, and software engineering in addition to analytical chemical and biochemical expertise. As a consequence, analysts using multivariate statistical methods were routinely required to chain together multiple commercial software packages and fashion small ad hoc software solutions to interpret a single dataset. This has been especially true in the field of NMR metabolomics, where no single software package, free or otherwise, was capable of completing all operations required to transform raw instrumental data into a set of validated, informative multivariate models. Therefore, while many powerful methods exist in published literature to statistically treat and model multivariate spectral data, few are readily available for immediate use by the community as a whole.

This dissertation describes the development of an end-to-end software solution for the handling and multivariate statistical modeling of spectroscopic data, called MVAPACK, and a set of novel spectral data acquisition, processing and treatment algorithms whose creation was expedited by MVAPACK. A final foray into the potential existence of n→π* interactions in proteins is also presented.

Advisor: Robert Powers
