Computer Science and Engineering, Department of



Analytical and Bioanalytical Chemistry.


Used by permission.


Comprehensive two-dimensional gas chromatography with time-of-flight mass spectrometry (GC×GC-TOFMS) is one of the most powerful analytical platforms for chemical investigations of complex biological samples. It produces large datasets that are rich in information but highly complex, and their consistency may be affected by random systemic fluctuations and/or changes in the experimental parameters. This study details the optimization of a data processing strategy that compensates for severe 2D pattern misalignments and detector response fluctuations for saliva samples analyzed across 2 years. The strategy was trained on two batches: one with samples from healthy subjects who had undergone dietary intervention with high/low-Maillard reaction products (dataset A), and the second from healthy/unhealthy obese individuals (dataset B). The combined untargeted and targeted pattern recognition algorithm (i.e., UT fingerprinting) was tuned for its key process parameters, namely the signal-to-noise ratio (S/N) and MS spectrum similarity thresholds, and then tested for the best transform function (global or local, affine or low-degree polynomial) for pattern realignment in the temporal domain. Peak detection was most reliable, measured as the percentage of false negative/positive matches, with an S/N threshold of 50 and a spectral similarity direct match factor (DMF) of 700. Cross-alignment of bi-dimensional (2D) peaks in the temporal domain was fully effective with a supervised operation including multiple centroids (reference peaks) and a match-and-transform strategy using affine functions. Regarding the performance-derived response fluctuations, the most promising strategy for cross-comparative analysis and data fusion included the mass spectral total useful signal (MSTUS) approach followed by Z-score normalization on the resulting matrix.
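The two numerical steps named above, affine realignment of 2D retention times from reference peaks and MSTUS scaling followed by Z-score normalization, can be illustrated with a minimal sketch. It is not the processing software used in the study. All function names are hypothetical, the affine map is fitted by ordinary least squares on matched reference peaks, and the "useful signal" is assumed to be the summed response of features detected (non-zero) in every sample.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares affine map taking reference-peak retention times
    src (n x 2: first and second dimension) onto dst (n x 2).
    Needs at least three non-collinear reference peaks."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    design = np.hstack([src, np.ones((len(src), 1))])  # [1tR, 2tR, 1]
    coef, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return coef  # shape (3, 2)


def apply_affine(coef, peaks):
    """Realign peak retention times (m x 2) with a fitted affine map."""
    peaks = np.asarray(peaks, dtype=float)
    return np.hstack([peaks, np.ones((len(peaks), 1))]) @ coef


def mstus_normalize(responses):
    """Scale each sample (row) by its total useful signal.
    Assumption: useful signal = sum over features present in all samples."""
    x = np.asarray(responses, dtype=float)
    common = np.all(x > 0, axis=0)
    if not common.any():
        raise ValueError("no feature is detected in every sample")
    return x / x[:, common].sum(axis=1, keepdims=True)


def zscore_features(matrix):
    """Column-wise Z-score; constant features are left at zero."""
    x = np.asarray(matrix, dtype=float)
    std = x.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (x - x.mean(axis=0)) / std


# Hypothetical usage: rows are samples, columns are aligned 2D peak features.
responses = np.array([[10.0, 20.0, 0.0],
                      [20.0, 40.0, 5.0],
                      [15.0, 45.0, 3.0]])
normalized = zscore_features(mstus_normalize(responses))
```

A local or low-degree polynomial transform, as also tested in the study, would replace the design matrix in `fit_affine` with one containing higher-order terms; the normalization functions would be unchanged.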