Loughborough University
Data processing methods for untargeted volatile atmospheric pressure chemical ionisation-mass spectrometry analysis

thesis
posted on 2024-07-03, 10:53 authored by Kerry Rosenthal

Untargeted direct mass spectrometry analysis of volatile organic compounds (VOCs) has many potential applications across fields such as healthcare and food safety. Volatile atmospheric pressure chemical ionisation coupled to a compact mass spectrometer (vAPCI-MS) is a recently developed, versatile, and transportable analytical technique that has the potential to be used in clinical and industry settings to rapidly and non-invasively detect a wide range of relevant compounds in gas-phase samples, such as breath and cell culture headspace. However, robust data processing protocols must be employed to ensure that new research is replicable and that the practical applications can be realised.

Chapter 1 reviews methodology for analysing VOCs, including data collection, processing, and analysis. Mass spectrometry is the most commonly used method for untargeted analysis of VOCs; however, a large range of data processing methods is available. Chapter 2 systematically reviews data processing and analytic workflows currently in use for direct mass spectrometry analysis of VOCs and examines whether methodological reporting is sufficient to enable replication. Of 459 studies identified from the databases, 110 met the inclusion criteria. Very few papers provided enough detail to allow all aspects of the methods used to be replicated accurately, with only three papers meeting previous guidelines for reporting experimental methods. A wide range of data processing methods was used, with only eight papers (7.3%) employing a largely similar workflow where direct comparability was achievable. Standardised workflows and reporting systems need to be developed to ensure research in this area is replicable, comparable, and held to a high standard, thus allowing the wide-ranging potential applications to be realised.

Chapter 3 describes the mass spectrometry instrumentation used in this thesis, how the data were binned to 1 Da, and the main data processing methodology. A random resampling procedure for reducing the impact of instrumental noise was developed, with 200 random resamples deemed optimal for accurately predicting sample type. Chapter 4 further tests this methodology on a common sample type, bacterial culture headspace. Identifying the characteristics of bacterial species can improve treatment outcomes, and mass spectrometry methods have been shown to be capable of identifying biomarkers of bacterial species. This study was the first to use vAPCI-MS to directly and non-invasively analyse the headspace of E. coli and S. aureus bacterial cultures, enabling major biological classification at species level (Gram-negative and Gram-positive, respectively). Four different protocols were used to collect data: three utilising discrete 5-minute samples taken between 2 and 96 hours after inoculation, and one employing 24-hour continuous sampling. Characteristic marker ions were found for both E. coli and S. aureus. A model to distinguish between sample types was able to correctly identify the bacteria samples after sufficient growth (24–48 hours), with similar results obtained across different sampling methods. This demonstrates that vAPCI-MS is a robust method to analyse and classify bacterial cultures accurately and within a relevant time frame, offering a promising technique for both clinical and research applications.
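As an illustration of what binning a spectrum to 1 Da can look like, the following is a minimal sketch only. The abstract does not state where the bin edges fall or how intensities within a bin are combined, so the function name bin_to_unit_mass, the choice of bins centred on integer m/z, and the use of summation are all assumptions made for this example rather than the procedure used in the thesis.

```python
import math
from collections import defaultdict


def bin_to_unit_mass(mz_values, intensities):
    """Combine a spectrum into 1 Da bins.

    Assumptions (not taken from the thesis): each bin is centred on an
    integer m/z, so bin n covers [n - 0.5, n + 0.5), and intensities that
    fall in the same bin are summed.
    """
    if len(mz_values) != len(intensities):
        raise ValueError("mz_values and intensities must have equal length")

    bins = defaultdict(float)
    for mz, intensity in zip(mz_values, intensities):
        # floor(mz + 0.5) maps every m/z in [n - 0.5, n + 0.5) to bin n
        bins[math.floor(mz + 0.5)] += intensity

    # Return the bins ordered by nominal mass
    return dict(sorted(bins.items()))


if __name__ == "__main__":
    # Hypothetical raw peaks around m/z 59 and 60
    print(bin_to_unit_mass([58.9, 59.2, 59.6, 60.4], [1.0, 2.0, 3.0, 4.0]))
```

Other conventions, such as bins starting at integer m/z or taking the maximum intensity per bin, would be equally plausible and change only the bin-assignment line or the accumulation step.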

Chapters 5 and 6 both aimed to optimise data processing protocols for four different methods of breath sampling, another commonly analysed sample type in metabolomics. Three breath sample datasets were examined in Chapter 5: a peppermint washout study using continuous breath sampling with a purified air source; an exercise study, with samples taken before and after an incremental maximal oxygen utilisation test, using continuous breath sampling with an ambient air source; and a single breath study, using Haldane tubes to collect samples. Each dataset was processed using different breath selection methods, which were compared and benchmarked according to predictive performance on a validation set and the quantitative reliability of the intensity measurements for each m/z window. For both continuous methods, the breath selection methods were used in combination with the random resampling procedure. The best breath selection method improved the predictive model compared to no preselection, as measured by the 95% confidence interval (CI) range for Youden’s index, from 0.68–0.86 to 0.86–0.97 for the exercise study and from 0.69–0.82 to 1.00–1.00 for the peppermint study. The reliability of intensity measurements for both continuous datasets, as measured by median relative standard deviation (RSD), was improved slightly by the best selection method compared to no preselection, from 18% to 14% for the exercise study and from 7% to 5% for the peppermint study. For the single breath samples collected using Haldane tubes, the median RSD of the proposed method was 38% for samples from the same participant collected during the same sampling session.
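The two benchmarking metrics named above have standard definitions: Youden’s index is sensitivity plus specificity minus one, and RSD is the standard deviation expressed as a percentage of the mean. The sketch below computes both. The function names are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the abstract does not say which estimator was used or how the confidence intervals were obtained.

```python
from statistics import mean, stdev


def youden_index(tp, fn, tn, fp):
    """Youden's index J = sensitivity + specificity - 1, from confusion counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity + specificity - 1.0


def relative_standard_deviation(values):
    """RSD as a percentage: 100 * standard deviation / mean.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    return 100.0 * stdev(values) / mean(values)


if __name__ == "__main__":
    # Hypothetical confusion counts for a two-class validation set
    print(youden_index(tp=9, fn=1, tn=8, fp=2))
    # Hypothetical repeated intensity measurements for one m/z window
    print(relative_standard_deviation([90, 100, 110]))
```

A Youden’s index of 1 corresponds to perfect separation of the two classes and 0 to chance-level performance, while a lower RSD indicates more repeatable intensity measurements.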

Chapter 6 tested the use of background subtraction and a quality control correction using an acetone-D6 permeation source on single breath samples collected using Bio-VOC tubes, alongside different breath selection methods. Participants provided resting samples on four separate days and performed an incremental maximal oxygen utilisation test on one of the days. The repeatability of the breath acetone measurements across days in this study was much worse than in previous studies, with median RSDs ranging from 39% to 73% depending on the data processing methods. Generally, acetone-D6 correction resulted in lower RSDs than no quality control correction, and omitting background subtraction resulted in lower RSDs than applying it. A comparison of pre- versus post-exercise samples replicated the finding from Chapter 5 that high intensity exercise does not have a consistent effect on breath acetone across participants.
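The abstract does not say how the acetone-D6 quality control correction or the background subtraction was computed. Purely as an illustration, the sketch below assumes one common form of each: background subtraction as the sample intensity minus a blank intensity, floored at zero, and internal-standard correction as the ratio of the analyte intensity to the standard intensity measured in the same sample. The function names and the example values are hypothetical.

```python
def subtract_background(sample_intensity, background_intensity):
    """Assumed form: sample minus background, floored at zero."""
    return max(sample_intensity - background_intensity, 0.0)


def correct_by_internal_standard(analyte_intensities, standard_intensities):
    """Assumed form: divide each analyte intensity by the internal-standard
    intensity recorded in the same sample, so that drift affecting both
    signals equally cancels out."""
    if len(analyte_intensities) != len(standard_intensities):
        raise ValueError("inputs must have equal length")
    corrected = []
    for analyte, standard in zip(analyte_intensities, standard_intensities):
        if standard <= 0:
            raise ValueError("internal-standard intensity must be positive")
        corrected.append(analyte / standard)
    return corrected


if __name__ == "__main__":
    # Hypothetical acetone and acetone-D6 intensities from two sampling days;
    # the instrument response is assumed to have doubled on the second day.
    acetone = [100.0, 220.0]
    acetone_d6 = [1.0, 2.0]
    print(correct_by_internal_standard(acetone, acetone_d6))
```

Under these assumptions, a change in instrument sensitivity between days scales both signals, so the corrected values are more comparable across days than the raw intensities.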

The application of appropriate data processing methods can improve the quality of the data and results obtained from vAPCI-MS. This includes binning the data, the random resampling procedure, breath selection methods, and quality control correction, which can affect the reliability of the measurement and the ability to distinguish between breath samples taken under different conditions. Further research is required to fully develop a data processing workflow for vAPCI-MS, but the methods developed in this thesis allow for accurate and replicable analysis of VOCs.

History

School

  • Sport, Exercise and Health Sciences

Publisher

Loughborough University

Rights holder

© Kerry Rosenthal

Publication date

2024

Notes

Submitted in partial fulfilment of the requirements for the award of Doctor of Philosophy of Loughborough University

Language

  • en

Supervisor(s)

Martin Lindley ; Eugenie Hunsicker ; Matthew Turner ; Elizabeth Ratcliffe

Qualification name

  • PhD

This submission includes a signed certificate in addition to the thesis file(s)

  • I have submitted a signed certificate
