Markov chain montecarlo method applied to rounding zeros of compositional data: first approach

As stated in Aitchison (1986), a proper study of relative variation in a compositional data set should be based on logratios, and dealing with logratios excludes dealing with zeros. Nevertheless, zero observations may be present in real data sets, either because the corresponding part is completely absent (essential zeros) or because it is below the detection limit (rounded zeros). Because the second kind of zero is usually understood as “a trace too small to measure”, it seems reasonable to replace it by a suitable small value, and this has been the traditional approach. As stated by, e.g., Tauber (1999) and Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2000), the principal problem in compositional data analysis is related to rounded zeros. One should be careful to use a replacement strategy that does not seriously distort the general structure of the data. In particular, the covariance structure of the involved parts (and thus the metric properties) should be preserved, as otherwise further analysis on subpopulations could be misleading. Following this point of view, a non-parametric imputation method is introduced in Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2000). This method is analyzed in depth by Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2003), where it is shown that the theoretical drawbacks of the additive zero replacement method proposed in Aitchison (1986) can be overcome using a new multiplicative approach on the non-zero parts of a composition. The new approach has reasonable properties from a compositional point of view. In particular, it is “natural” in the sense that it recovers the “true” composition if replacement values are identical to the missing values, and it is coherent with the basic operations on the simplex. This coherence implies that the covariance structure of subcompositions with no zeros is preserved.
As a generalization of the multiplicative replacement, a substitution method for missing values in compositional data sets is introduced in the same paper.
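The multiplicative replacement mentioned above can be illustrated with a minimal sketch. It is not taken from the paper: the function name `multiplicative_replacement`, the single common replacement value `delta`, and the example composition are assumptions chosen for illustration (the published method allows a different small value for each part). Each zero is replaced by `delta`, and the non-zero parts are shrunk by a common factor so that the total is unchanged.

```python
def multiplicative_replacement(x, delta):
    """Replace zeros in a composition by `delta` and rescale the
    non-zero parts multiplicatively so the total is preserved.

    Illustrative sketch only: a single common `delta` is assumed.
    """
    total = sum(x)
    n_zeros = sum(1 for v in x if v == 0)
    # Common shrink factor applied to every non-zero part.
    shrink = 1.0 - n_zeros * delta / total
    return [delta if v == 0 else v * shrink for v in x]


# Hypothetical 4-part composition with one rounded zero.
x = [0.6, 0.3, 0.1, 0.0]
r = multiplicative_replacement(x, delta=0.005)

# The total is unchanged, and ratios between non-zero parts are preserved.
total_after = sum(r)
ratio_before = x[0] / x[1]
ratio_after = r[0] / r[1]
```

Because every non-zero part is multiplied by the same factor, the ratios between non-zero parts (and hence their logratios) are unchanged, which is the sense in which the structure of zero-free subcompositions is preserved.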

Geologische Vereinigung; Universitat de Barcelona, Equip de Recerca Arqueomètrica; Institut d’Estadística de Catalunya; International Association for Mathematical Geology; Patronat de l’Escola Politècnica Superior de la Universitat de Girona; Fundació privada: Girona, Universitat i Futur.

Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada

Director: Martín Fernández, Josep Antoni
Thió i Fernández de Henestrosa, Santiago
Other contributions: Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada
Author: Martín Fernández, Josep Antoni
Palarea Albaladejo, Javier
Gómez García, Juan
Date: 15 October 2003
Format: application/pdf
Citation: Martín Fernández, J.A.; Palarea Albaladejo, J.; Gómez García, J. ’Markov chain montecarlo method applied to rounding zeros of compositional data: first approach’ in CODAWORK’03. Girona: La Universitat, 2003 [accessed: 5 May 2008]. Requires Adobe Acrobat. Available online at: http://hdl.handle.net/10256/663
ISBN: 84-8458-111-X
Document access: http://hdl.handle.net/10256/663
Language: eng
Publisher: Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada
Rights: All rights reserved
Subject: Numerical analysis
Monte Carlo method
Markov processes
Title: Markov chain montecarlo method applied to rounding zeros of compositional data: first approach
Type: info:eu-repo/semantics/conferenceObject
Repository: DUGiDocs
