Building our Breath Biopsy VOC Atlas®: The First 148 On-Breath VOCs

Our study has identified a list of 148 VOCs using purified chemical standards in a heterogeneous population.

Publication information: Wisenave Arulvasan, Hsuan Chou, Julia Greenwood, Madeleine L. Ball, Owen Birch, Simon Coplowe, Patrick Gordon, Andreea Ratiu, Elizabeth Lam, Ace Hatch, Monika Szkatulska, Steven Levett, Ella Mead, Chloe Charlton-Peel, Louise Nicholson-Scott, Shane Swan, Frederik-Jan van Schooten, Billy Boyle, Max Allsworth. High-quality identification of volatile organic compounds (VOCs) originating from breath. Metabolomics. (2024) 20:120. DOI: 10.1007/s11306-024-02163-6.

Aim: To build a reliable breath collection and analysis method that can produce a comprehensive list of known volatile organic compounds (VOCs) in the breath of a heterogeneous human population.

Summary: 

  • A list of 148 VOCs has been produced and identified using purified chemical standards in a heterogeneous population.
  • Providing confirmed VOC identities that are genuinely breath-borne will facilitate future biomarker discovery and subsequent biomarker validation in clinical studies.

Introduction

Typically, metabolomic studies focus on analyzing substances such as blood, urine, and feces. Exhaled breath is a rich and diverse matrix containing thousands of different volatile organic compounds (VOCs) (1). The non-invasive nature of breath sampling makes it particularly attractive for clinical applications, such as early diagnosis and ongoing longitudinal monitoring.

The validation of clinically useful breath biomarkers remains limited. To advance the field of breath analysis, there is an urgent need to develop a robust platform that can accurately identify the VOCs that genuinely originate from the breath (these comprise endogenous VOCs derived from metabolic processes and exogenous VOCs, such as microbiome-derived or dietary compounds). It is important to distinguish these VOCs from background VOCs, which are unrelated to underlying physiology.

A general lack of standardization in sampling, analysis, and identification of VOCs means that it is difficult to quickly assign confidence to any single observation without reviewing the underlying literature. In this study, our team presents the development of a novel methodology that combines robust breath and background collection, analytical differentiation of breath VOCs from background contamination, and VOC identification against chemical standards. We demonstrate the capability of this method by presenting a list of high-confidence breath VOCs identified from a heterogeneous human population.

Methods

Study design and subjects

This observational study recruited 99 adult participants who were free of active respiratory infection and fasted for at least two hours before breath sampling. Some volunteers with various chronic diseases such as type 2 diabetes, high blood pressure, and irritable bowel syndrome were included to account for potential normal variation in the population and ensure the breadth of VOC detection.

Breath sampling and analysis

Breath samples were collected using Owlstone Medical’s ReCIVA® Breath Sampler. During breath sampling, background contamination was minimized using the CASPER® Portable Air Supply, which supplies filtered ambient air to the ReCIVA. Breath and background samples were analyzed by thermal desorption gas chromatography mass spectrometry (TD-GC-MS) using the Breath Biopsy OMNI workflow. ReCIVA and CASPER breath collection, paired background sampling, TD-GC-MS analysis, and the feature extraction method are collectively known as the ‘OMNI’ method. A total of nine samples were excluded due to either saliva contamination (determined by observation of saliva/bubbles within the tube) (n=8) or incomplete collection volume (n=1).

Feature extraction and data normalization

All features were normalized using the measured peak area intensity response of spiked internal standard (IS) compounds to reduce analytical variability associated with TD-GC-MS. Three metrics were used to compare VOCs in breath samples and paired system background samples: the standard deviation (SD) metric, the paired t-test metric, and the receiver operating characteristic area under the curve (ROC-AUC) metric.
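As an illustration of the normalization and the three comparison metrics described above, the following is a minimal sketch rather than the OMNI implementation. The function names, the single-IS normalization, the `k=3.0` multiplier in the SD metric, and the example values are all assumptions made for illustration; the decision thresholds used in the study are not stated here.

```python
import math
import statistics


def normalize_to_internal_standard(peak_area, is_peak_area):
    """Divide a feature's peak area by the spiked internal standard's peak area."""
    if is_peak_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return peak_area / is_peak_area


def sd_metric(breath, background, k=3.0):
    """Fraction of breath samples above background mean + k * background SD.

    The multiplier k is an assumed value for illustration only.
    """
    threshold = statistics.mean(background) + k * statistics.stdev(background)
    return sum(value > threshold for value in breath) / len(breath)


def paired_t_statistic(breath, background):
    """Paired t statistic for breath minus paired background (one pair per subject)."""
    if len(breath) != len(background):
        raise ValueError("paired samples must have equal length")
    diffs = [b - g for b, g in zip(breath, background)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


def roc_auc(breath, background):
    """Probability that a breath value exceeds a background value (ties count 0.5)."""
    wins = 0.0
    for b in breath:
        for g in background:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(breath) * len(background))


if __name__ == "__main__":
    # Hypothetical IS-normalized intensities for one feature in four subjects.
    breath = [2.0, 2.5, 3.0, 2.2]
    background = [1.0, 1.2, 0.9, 1.1]
    print("SD metric:", sd_metric(breath, background))
    print("paired t:", paired_t_statistic(breath, background))
    print("ROC-AUC:", roc_auc(breath, background))
```

A feature would then be called on-breath under a given metric when its value passes a cut-off chosen for that metric; those cut-offs are deliberately left out of the sketch.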

VOC identification using chemical standards

The candidate identities of on-breath VOCs were determined by matching the breath data against the NIST library and cross-checking against the Human Metabolome Database (HMDB) (2). All NIST matches with a similarity index (SI) Match Factor of 500 or higher were considered for confirmation using purified standards.
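The candidate selection step can be sketched as a simple filter on the match factor. The `LibraryMatch` record and its field names are hypothetical, and the example matches are invented; only the threshold of 500 comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryMatch:
    """Hypothetical record for one library match of an on-breath feature."""
    feature_id: str
    compound: str
    match_factor: int


def candidates_for_confirmation(matches, min_match_factor=500):
    """Keep matches whose match factor is at or above the threshold."""
    return [m for m in matches if m.match_factor >= min_match_factor]


if __name__ == "__main__":
    # Invented example matches; not data from the study.
    matches = [
        LibraryMatch("F001", "acetone", 910),
        LibraryMatch("F002", "compound A", 500),
        LibraryMatch("F003", "compound B", 480),
    ]
    for match in candidates_for_confirmation(matches):
        print(match.feature_id, match.compound, match.match_factor)
```

Passing the filter only makes a match a candidate; in the study, identity was confirmed by running a purified reference standard.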

A certified reference standard (minimum 95% purity) was sourced for each candidate compound and analyzed to generate spectra for matching against on-breath VOCs. Spectra for individual reference standards were identified by deconvolution (using the Thermo Scientific GC Deconvolution plugin) followed by cross-referencing with the NIST library. The final spectrum of each reference standard confirmed the on-breath VOC identities.

Results

Distinguishing on-breath VOCs from the background

Three analytical metrics were chosen to provide complementary insights into the potential composition of on-breath VOCs. Following the analysis of the 90 adult breath samples and paired system backgrounds using the OMNI method, 1471 unique features were present in ≥ 80% of breath samples. A total of 585 VOCs were identified as on-breath using any metric, and of these, the majority were on-breath by all metrics.
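The prevalence filter and the "any metric" versus "all metrics" counts can be illustrated with set operations. This is a sketch under assumptions: the data structures, function names, and toy feature IDs are hypothetical, and only the 80% prevalence threshold is taken from the text.

```python
def prevalent_features(detections, n_samples, min_fraction=0.80):
    """Return features detected in at least min_fraction of breath samples.

    detections maps a feature ID to the set of sample IDs in which it was seen.
    """
    return {
        feature
        for feature, samples in detections.items()
        if len(samples) / n_samples >= min_fraction
    }


def combine_metrics(sd_hits, ttest_hits, auc_hits):
    """Return (on-breath by any metric, on-breath by all metrics)."""
    any_metric = sd_hits | ttest_hits | auc_hits
    all_metrics = sd_hits & ttest_hits & auc_hits
    return any_metric, all_metrics


if __name__ == "__main__":
    # Toy example with five samples; not data from the study.
    detections = {
        "F1": {"s1", "s2", "s3", "s4", "s5"},
        "F2": {"s1", "s2", "s3", "s4"},
        "F3": {"s1", "s2", "s3"},
    }
    kept = prevalent_features(detections, n_samples=5)
    any_metric, all_metrics = combine_metrics({"F1", "F2"}, {"F1"}, {"F1", "F2"})
    print(sorted(kept), sorted(any_metric & kept), sorted(all_metrics & kept))
```

Reporting both sets makes it easy to see which features depend on a single metric and which are supported by all three.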

Identified VOCs: chemical characteristics

A total of 148 VOCs were assigned identities based on comparison with reference standards analyzed using the same analytical method in this dataset. A total of 825 purified chemical standards were run to achieve this; for 37% of NIST matches with SI scores over 800, the match was confirmed as the identity of the on-breath VOC.

While the three metrics have substantial overlap in the VOCs they determine to be on-breath, each metric contributed unique entries to the final pool of identified on-breath VOCs and may be appropriate for different VOCs or study designs.


Discussion

This study presents a list of 148 breath-associated (on-breath) chemically identified VOCs. The integrity of this data relies on stringent criteria for two key aspects: distinguishing VOCs from background contaminants and confirming their chemical identity. The on-breath VOCs presented in this study have been confidently chemically identified using MSI standards and were distinguishable from background contaminants through a robust methodology in a heterogeneous human population.

On-breath VOCs span 45 chemical classes, indicating that they comprise a diverse pool of chemical entities, and 62% have been previously reported in the literature in different biological matrices such as blood, urine, and fecal matter.

Acetone, isoprene, and indole can be used to assess the consistency of the results of this study with breath compositions reported elsewhere, as these are some of the most abundant and commonly identified breath VOCs (3). All three of these compounds were frequently found to be on-breath within this population, supporting the replicability of this study’s results. Isoprene has previously been associated with a broad range of disease states, as well as with skeletal muscle metabolic activity (4).

Most breath isoprene is produced through the IDI2 protein, which is only present within skeletal-myocellular peroxisomes; this supports the observation that breath isoprene increases after exercise (5). This understanding could help establish what useful information could be gained using breath isoprene as a biomarker.

Acetic acid and propionic acid were both found to be on-breath in this study. Both compounds are well-characterized short-chain fatty acids (SCFAs) associated with the gut microbiome. SCFAs are considered exogenous VOCs because they are produced by microbial fermentation of dietary fiber and are thought to diffuse into blood vessels in the gastrointestinal tract, travel via the blood, and enter the breath through alveolar exchange.

The abundance of SCFAs has been implicated in multiple health contexts, including cancer, neurodegenerative disease, and inflammatory bowel disease (IBD) (6–8). SCFAs could therefore serve as breath biomarkers of disease in the future. Their characterization on-breath and the development of reference ranges in a healthy population are essential for this development, for which this study provides a useful starting point.

This list of chemically confirmed on-breath VOCs distinguished from background contaminants lays the foundation for the development of our Breath Biopsy VOC Atlas®, an ongoing project to develop a database of chemically identified breath VOCs complete with on-breath status, and quantified reference ranges across different cohorts, including different disease states.

Biological interpretation of VOCs in the breath will help considerably in confidently assigning on-breath VOC status, so adding a mechanistic understanding of breath VOCs to the literature is important. We are soon to release a preview of our VOC Atlas; sign up to our waitlist to ensure you are notified as soon as it is live.

References

  1. Haworth JJ, Pitcher CK, Ferrandino G, Hobson AR, Pappan KL, Lawson JLD. Breathing new life into clinical testing and diagnostics: perspectives on volatile biomarkers from breath. Crit Rev Clin Lab Sci. 2022 Aug;59(5):353–72. doi: 10.1080/10408363.2022.2038075
  2. Westhoff M, Friedrich M, Baumbach JI. Simultaneous measurement of inhaled air and exhaled breath by double multicapillary column ion-mobility spectrometry, a new method for breath analysis: results of a feasibility study. ERJ Open Res. 2022 Jan;8(1):00493–2021. doi: 10.1183/23120541.00493-2021
  3. Drabińska N, Flynn C, Ratcliffe N, Belluomo I, Myridakis A, Gould O, et al. A literature survey of all volatiles from healthy human breath and bodily fluids: the human volatilome. J Breath Res. 2021 Apr 21;15(3). doi: 10.1088/1752-7163/abf1d0
  4. Mochalski P, King J, Unterkofler K, Mayhew CA. Unravelling the origin of isoprene in the human body-a forty year Odyssey. J Breath Res. 2024 May 7;18(3). doi: 10.1088/1752-7163/ad4388
  5. Chou H, Arthur K, Shaw E, Schaber C, Boyle B, Allsworth M, et al. Metabolic insights at the finish line: deciphering physiological changes in ultramarathon runners through breath VOC analysis. J Breath Res. 2024 Feb 12;18(2). doi: 10.1088/1752-7163/ad23f5
  6. Duizer C, de Zoete MR. The Role of Microbiota-Derived Metabolites in Colorectal Cancer. Int J Mol Sci. 2023 Apr 28;24(9):8024. doi: 10.3390/ijms24098024
  7. Majumdar A, Siva Venkatesh IP, Basu A. Short-Chain Fatty Acids in the Microbiota-Gut-Brain Axis: Role in Neurodegenerative Disorders and Viral Infections. ACS Chem Neurosci. 2023 Mar 15;14(6):1045–62. doi: 10.1021/acschemneuro.2c00803
  8. Parada Venegas D, De la Fuente MK, Landskron G, González MJ, Quera R, Dijkstra G, et al. Short Chain Fatty Acids (SCFAs)-Mediated Gut Epithelial and Immune Regulation and Its Relevance for Inflammatory Bowel Diseases. Front Immunol. 2019;10:277. doi: 10.3389/fimmu.2019.00277
