It’s all relative: analyzing microbiome data as compositions

Purpose: The ability to properly analyze and interpret large microbiome datasets has lagged behind our ability to acquire such datasets from environmental or clinical samples. Sequencing instruments impose a structure on these data: the natural sample space of a 16S rRNA gene sequencing dataset is a simplex, which is a part of real space that is restricted to non-negative values with a constant sum. Such data are compositional, and should be analyzed using compositionally appropriate tools and approaches. However, the majority of the tools for 16S rRNA gene sequencing analysis assume these data are unrestricted. Methods: We show that existing tools for compositional data (CoDa) analysis can be readily adapted to analyze high-throughput sequencing datasets. Results: The Human Microbiome Project tongue vs. buccal mucosa dataset shows how the CoDa approach can address the major elements of microbiome analysis. Reanalysis of a publicly available autism microbiome dataset shows that the CoDa approach, in concert with multiple hypothesis test corrections, prevents false positive identifications. Conclusions: The CoDa approach is readily scalable to microbiome-sized analyses. We provide example code, and make recommendations to improve the analysis and reporting of microbiome datasets.

Work in G.B. Gloor’s lab has been supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada. J.R. Wu was supported by a CIHR grant to Dr. J. Allard and G.B. Gloor. Drs. J.J. Egozcue and V. Pawlowsky-Glahn have been supported by the Spanish Ministry of Economy and Competitiveness under the project METRICS (Ref. MTM2012-33236), and by the Agència de Gestió d’Ajuts Universitaris i de Recerca of the Generalitat de Catalunya under the project COSDA (Ref. 2014SGR551).

Elsevier

Manager: Ministerio de Ciencia e Innovación (Espanya)
Generalitat de Catalunya. Agència de Gestió d’Ajuts Universitaris i de Recerca
Author: Gloor, G.B.
Wu, J.R.
Pawlowsky-Glahn, Vera
Egozcue, Juan José
Date: 2016 March 23
Format: application/pdf
Document access: http://hdl.handle.net/10256/12650
Language: eng
Publisher: Elsevier
Collection: info:eu-repo/semantics/altIdentifier/doi/10.1016/j.annepidem.2016.03.003
info:eu-repo/semantics/altIdentifier/issn/1047-2797
info:eu-repo/grantAgreement/MINECO//MTM2012-33236/ES/METODOS ESTADISTICOS EN ESPACIOS RESTRINGIDOS/
AGAUR/2014-2016/2014 SGR-551
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International
Rights URI: http://creativecommons.org/licenses/by-nc-nd/4.0
Subject: Multivariate analysis
Microbiology -- Statistical methods
Title: It’s all relative: analyzing microbiome data as compositions
Type: info:eu-repo/semantics/article
Repository: DUGiDocs