Item


News from compositions, the R package

The R-package “compositions”is a tool for advanced compositional analysis. Its basic functionality has seen some conceptual improvement, containing now some facilities to work with and represent ilr bases built from balances, and an elaborated subsys- tem for dealing with several kinds of irregular data: (rounded or structural) zeroes, incomplete observations and outliers. The general approach to these irregularities is based on subcompositions: for an irregular datum, one can distinguish a “regular” sub- composition (where all parts are actually observed and the datum behaves typically) and a “problematic” subcomposition (with those unobserved, zero or rounded parts, or else where the datum shows an erratic or atypical behaviour). Systematic classification schemes are proposed for both outliers and missing values (including zeros) focusing on the nature of irregularities in the datum subcomposition(s). To compute statistics with values missing at random and structural zeros, a projection approach is implemented: a given datum contributes to the estimation of the desired parameters only on the subcompositon where it was observed. For data sets with values below the detection limit, two different approaches are provided: the well-known imputation technique, and also the projection approach. To compute statistics in the presence of outliers, robust statistics are adapted to the characteristics of compositional data, based on the minimum covariance determinant approach. The outlier classification is based on four different models of outlier occur- rence and Monte-Carlo-based tests for their characterization. Furthermore the package provides special plots helping to understand the nature of outliers in the dataset. Keywords: coda-dendrogram, lost values, MAR, missing data, MCD estimator, robustness, rounded zeros

Geologische Vereinigung; Institut d’Estadística de Catalunya; International Association for Mathematical Geology; Càtedra Lluís Santaló d’Aplicacions de la Matemàtica; Generalitat de Catalunya, Departament d’Innovació, Universitats i Recerca; Ministerio de Educación y Ciencia; Ingenio 2010.

Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada

Manager: Daunis i Estadella, Josep
Martín Fernández, Josep Antoni
Other contributions: Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada
Author: Bren, Matevž
Tolosana Delgado, Raimon
Boogaart, K. Gerald van den
Date: 2008 May 27
Abstract: The R-package “compositions”is a tool for advanced compositional analysis. Its basic functionality has seen some conceptual improvement, containing now some facilities to work with and represent ilr bases built from balances, and an elaborated subsys- tem for dealing with several kinds of irregular data: (rounded or structural) zeroes, incomplete observations and outliers. The general approach to these irregularities is based on subcompositions: for an irregular datum, one can distinguish a “regular” sub- composition (where all parts are actually observed and the datum behaves typically) and a “problematic” subcomposition (with those unobserved, zero or rounded parts, or else where the datum shows an erratic or atypical behaviour). Systematic classification schemes are proposed for both outliers and missing values (including zeros) focusing on the nature of irregularities in the datum subcomposition(s). To compute statistics with values missing at random and structural zeros, a projection approach is implemented: a given datum contributes to the estimation of the desired parameters only on the subcompositon where it was observed. For data sets with values below the detection limit, two different approaches are provided: the well-known imputation technique, and also the projection approach. To compute statistics in the presence of outliers, robust statistics are adapted to the characteristics of compositional data, based on the minimum covariance determinant approach. The outlier classification is based on four different models of outlier occur- rence and Monte-Carlo-based tests for their characterization. Furthermore the package provides special plots helping to understand the nature of outliers in the dataset. Keywords: coda-dendrogram, lost values, MAR, missing data, MCD estimator, robustness, rounded zeros
Geologische Vereinigung; Institut d’Estadística de Catalunya; International Association for Mathematical Geology; Càtedra Lluís Santaló d’Aplicacions de la Matemàtica; Generalitat de Catalunya, Departament d’Innovació, Universitats i Recerca; Ministerio de Educación y Ciencia; Ingenio 2010.
Format: application/pdf
Citation: Bren, M.; Tolosana Delgado, R.; Boogaart, K.G. ’News from compositions, the R package’ a CODAWORK’08. Girona: La Universitat, 2008 [consulta: 12 maig 2008]. Necessita Adobe Acrobat. Disponible a Internet a: http://hdl.handle.net/10256/716
Document access: http://hdl.handle.net/10256/716
Language: eng
Publisher: Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada
Rights: Tots els drets reservats
Subject: Estadística matemàtica -- Informàtica
Title: News from compositions, the R package
Type: info:eu-repo/semantics/conferenceObject
Repository: DUGiDocs

Subjects

Authors