A bootstrap estimation scheme for chemical compositional data with nondetects

The bootstrap method is commonly used to estimate the distribution of estimators and their associated uncertainty when explicit analytic expressions are not available or are difficult to obtain. It has been widely applied in environmental and geochemical studies, where the data generated often represent parts of a whole, typically chemical concentrations. This kind of constrained data is generically called compositional data, and it requires specialised statistical methods to properly account for its particular covariance structure. In practice, such data often contain labels denoting nondetects, that is, concentrations falling below detection limits. Nondetects impede the implementation of the bootstrap and represent an additional source of uncertainty that must be taken into account. In this work, a bootstrap scheme is devised that handles nondetects by adding an imputation step within the resampling process and conveniently propagates their associated uncertainty. In doing so, it considers the constrained relationships between chemical concentrations that arise from their compositional nature. Bootstrap estimates using a range of imputation methods, including new stochastic proposals, are compared across scenarios of increasing difficulty. They are formulated to meet compositional principles following the log-ratio approach, and an adjustment is introduced in the multivariate case to deal with nonclosed samples. Results suggest that nondetect bootstrap based on model-based imputation is generally preferable. A robust approach based on isometric log-ratio transformations appears to be particularly suited in this context. Computer routines in the R statistical programming language are provided.
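The idea of placing the imputation step inside the resampling loop can be illustrated with a minimal sketch. It is not the authors' R routines: the function names, the toy concentrations, the detection limits, and the uniform random draw below the detection limit are all assumptions made for illustration, and the uniform draw is only a stand-in for the model-based and stochastic imputation methods compared in the article. The estimator shown is the compositional centre (closed geometric mean), and the interval is taken on a log-ratio of two parts.

```python
import math
import random

# Hypothetical toy data: 3 chemical parts per sample; None marks a nondetect.
toy_limits = [0.5, 0.5, 1.0]  # assumed detection limit for each part
toy_data = [
    [12.0, 30.1, 2.3],
    [10.5, 28.7, None],
    [14.2, 33.0, 1.8],
    [11.1, 29.5, None],
    [13.3, 31.2, 2.9],
    [9.8, 27.4, 1.2],
]

def close(parts):
    """Rescale positive parts so that they sum to 1 (closure)."""
    total = sum(parts)
    return [p / total for p in parts]

def impute_nondetects(row, limits, rng):
    """Replace each nondetect by a random value below its detection limit.

    Assumption: a uniform draw on (0.1 * DL, DL) stands in for the
    imputation methods studied in the article.
    """
    return [rng.uniform(0.1 * dl, dl) if x is None else x
            for x, dl in zip(row, limits)]

def compositional_center(rows):
    """Closed vector of column-wise geometric means."""
    n, d = len(rows), len(rows[0])
    gmeans = [math.exp(sum(math.log(r[j]) for r in rows) / n) for j in range(d)]
    return close(gmeans)

def nondetect_bootstrap(data, limits, n_boot=500, seed=0):
    """Bootstrap the compositional centre, imputing inside every replicate."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Resample whole samples with replacement, nondetect labels included.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        # Impute afresh in each replicate so imputation uncertainty propagates.
        imputed = [impute_nondetects(row, limits, rng) for row in sample]
        estimates.append(compositional_center(imputed))
    return estimates

def percentile_interval(values, level=0.95):
    """Simple percentile interval from a list of bootstrap values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    lower = ordered[math.floor((1 - level) / 2 * last)]
    upper = ordered[math.ceil((1 + level) / 2 * last)]
    return lower, upper

estimates = nondetect_bootstrap(toy_data, toy_limits)
# Interval for the log-ratio of part 1 to part 3 (the part with nondetects).
log_ratios = [math.log(e[0] / e[2]) for e in estimates]
interval = percentile_interval(log_ratios)
```

Because the imputed values are redrawn in every replicate, the spread of `log_ratios` reflects both sampling variability and the uncertainty about the unobserved concentrations; imputing once before resampling would capture only the former.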

This research has been supported by the Scottish Government's Rural and Environment Science and Analytical Services Division (RESAS), the Spanish Ministry of Economy and Competitiveness under the project 'METRICS' Ref. MTM2012-33236 and the Agencia de Gestio d'Ajuts Universitaris i de Recerca of the Generalitat de Catalunya under the project Ref. 2009SGR424.

© Journal of Chemometrics, 2014, vol. 28, no. 7, p. 585-599

Wiley

Author: Palarea Albaladejo, Javier
Martín Fernández, Josep Antoni
Olea, Ricardo A.
Date: July 2014
Format: application/pdf
ISSN: 0886-9383 (print version)
1099-128X (electronic version)
Document access: http://hdl.handle.net/10256/10928
Language: eng
Publisher: Wiley
Collection: MINECO/PN 2013-2015/MTM2012-33236
AGAUR/2009-2014/2009 SGR-424
Digital reproduction of the document published at: http://dx.doi.org/10.1002/cem.2621
Published articles (D-IMA)
Is part of: © Journal of Chemometrics, 2014, vol. 28, no. 7, p. 585-599
Rights: All rights reserved
Subject: Multivariate analysis
Distribution (Probability theory)
Estimation theory
Mathematical statistics
Title: A bootstrap estimation scheme for chemical compositional data with nondetects
Type: info:eu-repo/semantics/article
Repository: DUGiDocs
