Non-detect Bootstrap Method for Estimating Distributional Parameters of Compositional Samples Revisited: a Multivariate Approach

Bootstrap resampling is an attractive, computationally intensive approach for estimating population parameters and their associated uncertainties. Values below the detection limit, also referred to as non-detects, arise frequently, particularly in multivariate geochemical concentration data, and make the estimation of distributional parameters (mean, median, percentiles) challenging. The bootstrap method can be applied repeatedly to resampled versions of the original data set. In this way it is possible to estimate univariate distributional parameters while also capturing the additional uncertainty due to missing information. Within this approach, a method must be chosen to substitute non-detects with appropriate values given the compositional nature of the data. This idea was first introduced by Olea (2008) at the previous meeting, CoDaWork’08. Making use of the isometric log-ratio transformation and analyzing one variable at a time, he proposed a univariate bootstrap procedure in which the distributional parameters of geochemical components were modeled from bootstrap resamples under different criteria for imputing non-detects. After a sensitivity analysis on both the proportion of non-detects and the sample size, the study concluded that bootstrap estimates were more accurate when each non-detect was replaced by a value drawn at random from the extrapolated tail, below the detection limit, of the distribution best fitting the complete data (usually the log-normal distribution for geochemical data) than when simple imputation methods were used. Rather than analyzing each variable separately, here we go a step further and exploit the covariance structure of the data set, extending the univariate approach for replacing non-detects to a multivariate setting.
As a test bench, a number of data sets containing non-detects are artificially generated from real geochemical data and used to evaluate the performance of different replacement methods within the bootstrap process. Preliminary results show an improvement when non-detects are replaced by random values drawn from a conditional truncated additive logistic model.
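
The univariate idea summarized in the abstract (resample, impute each non-detect from the fitted tail below the detection limit, then estimate) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it skips the isometric log-ratio step and the multivariate conditional model, it fits the log-normal naively to the detected values only, and the function names `impute_from_lognormal_tail` and `bootstrap_means`, the detection limit, and the simulated data are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def impute_from_lognormal_tail(values, detection_limit, rng):
    """Replace each non-detect (None) with a random draw from the
    fitted log-normal tail below the detection limit."""
    logs = [math.log(v) for v in values if v is not None]
    # Assumption: a naive fit to the detected values only. It ignores the
    # censoring, so the fitted mean is biased upward; a censored-data fit
    # would be preferable in practice. Needs >= 2 distinct detected values.
    fitted = NormalDist(fmean(logs), stdev(logs))
    # Probability mass of the fitted distribution below the detection limit.
    p_limit = fitted.cdf(math.log(detection_limit))
    completed = []
    for v in values:
        if v is None:
            # Inverse-CDF sampling restricted to the interval (0, p_limit].
            u = p_limit * (1.0 - rng.random())
            completed.append(math.exp(fitted.inv_cdf(u)))
        else:
            completed.append(v)
    return completed


def bootstrap_means(values, detection_limit, n_resamples=500, seed=0):
    """Bootstrap the mean, re-imputing non-detects in every resample so
    that the imputation uncertainty is carried into the estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(values, k=len(values))
        completed = impute_from_lognormal_tail(resample, detection_limit, rng)
        estimates.append(fmean(completed))
    return estimates


if __name__ == "__main__":
    rng = random.Random(42)
    detection_limit = 0.4  # hypothetical detection limit
    raw = [rng.lognormvariate(0.0, 1.0) for _ in range(60)]
    # Values below the detection limit are recorded as None (non-detects).
    data = [x if x >= detection_limit else None for x in raw]
    means = sorted(bootstrap_means(data, detection_limit))
    print("bootstrap estimate of the mean:", fmean(means))
    print("approx. 95% percentile interval:",
          means[int(0.025 * len(means))], means[int(0.975 * len(means))])
```

Because the imputation is repeated inside every resample, the spread of the returned estimates reflects both sampling variability and the uncertainty about the censored values. The same loop could report a median or a percentile instead of the mean.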

Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada

Other contributions: Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada
Authors: Palarea Albaladejo, Javier
Martín Fernández, Josep Antoni
Olea, Ricardo A.
Date: 12 May 2011
Format: application/pdf
Document access: http://hdl.handle.net/10256/13638
Language: eng
Publisher: Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada
Collection: CoDaWork 2011. The 4th International Workshop on Compositional Data Analysis
Rights: All rights reserved
Subject: Mathematical statistics -- Congresses
Multivariate analysis -- Congresses
Parameter estimation -- Congresses
Type: info:eu-repo/semantics/conferenceObject
Repository: DUGiDocs
