Use of Survey Weights for the Analysis of Compositional Data: Some Simulation Results

The compositional space can be seen as a vector space, where vector addition corresponds to perturbation and multiplication by a scalar corresponds to powering (Aitchison, 1986; Pawlowsky-Glahn and Egozcue, 2001). Whereas perturbation is widely used in applications of compositional analysis, powering is somewhat neglected. Survey data analysis, on the other hand, is a domain of applied statistics where the use of weights is predominant. There are three reasons for introducing weights in survey data analysis: (1) the use of complex survey designs with unequal inclusion probabilities, (2) the correction of non-response, and (3) calibration procedures. We shall briefly introduce the rationale for weights in survey analysis and then discuss the connection between survey weights and the powering operation. Several examples will be given.

Surveys are essentially built to optimize the estimation of totals in population subgroups for a number of variables. In practice, a key variable is chosen and the design is optimized for this variable, the trade-off being between cost and precision. Totals are estimated by weighted sums of the sampled values. The weights are extrapolation factors that depend on the survey design. Informing the user about the measurement error of the published figures is an important aspect of data quality. Survey design and estimation are described e.g. in Särndal, Swensson and Wretman (1992).

In a survey context, interest lies in totals or means across cases, but in a compositional context, totals have no meaning. So if we want to average cases, we have to go back to the original measurement scale and then apply the closure operation. For the geometric mean composition, on the contrary, the result is the same whether the amounts are averaged first and an average composition is then computed, or whether the geometric mean of the compositions is computed directly and then closed.

The design-based approach does not make any assumptions about the distribution of compositions. This opens the way to parametrization by general partitions (Aitchison, 1986, section 2.7) without the drawback of ad hoc assumptions of multivariate normality (Aitchison, 1986, definition 6.7). In household expenditure surveys, for instance, commodities form a hierarchy in which broad categories are subdivided into more detailed goods. A general partition can follow this organization and may be a more convenient way to convey the information on the surveyed units. The joint probability distribution of transforms of this general partition is derived from the distribution of the sample inclusion indicator.

After a brief review of survey methodology, we apply the design-based principles to the estimation of compositions, of compositional transforms and of their covariance matrix on a small population. The properties of the estimators will be investigated by simulation. The talk will end with a discussion.
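
The abstract's remark on averaging can be checked numerically. The following is a minimal sketch, not the author's method: it assumes that survey weights enter the geometric mean as normalized exponents (one possible reading of the link between weights and powering), and the sample amounts, the weights, and the helper names `closure` and `weighted_geometric_mean_composition` are all made up for illustration.

```python
import numpy as np

def closure(x):
    """Rescale each row of positive parts so that it sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def weighted_geometric_mean_composition(parts, weights):
    """Closed weighted geometric mean of the rows of `parts`.

    Assumption of this sketch: each row is powered by its normalized
    weight, the powered rows are perturbed together (component-wise
    product), and the result is closed.
    """
    parts = np.asarray(parts, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalized weights used as exponents
    log_mean = w @ np.log(parts)         # weighted mean of the log parts
    return closure(np.exp(log_mean))

# Hypothetical sample: 4 units, 3 parts, amounts on the original scale.
amounts = np.array([[120.0, 30.0,  50.0],
                    [ 80.0, 60.0,  60.0],
                    [200.0, 20.0, 180.0],
                    [ 40.0, 40.0,  20.0]])
weights = np.array([10.0, 25.0, 5.0, 60.0])  # made-up extrapolation weights

# Geometric route: same answer from amounts or from closed compositions.
geo_from_amounts = weighted_geometric_mean_composition(amounts, weights)
geo_from_compositions = weighted_geometric_mean_composition(closure(amounts), weights)

# Arithmetic route: the weighted total of amounts, then closure, differs
# from the weighted arithmetic mean of the closed compositions.
arith_from_amounts = closure(weights @ amounts)
arith_from_compositions = closure(weights @ closure(amounts))

print(np.allclose(geo_from_amounts, geo_from_compositions))
print(np.allclose(arith_from_amounts, arith_from_compositions))
```

In this toy sample the unit totals differ, which is why the arithmetic route depends on whether one starts from amounts or from compositions, while the geometric route does not: the per-unit scale factor is common to all parts and cancels in the closure.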

Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada

Other contributions: Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada
Author: Graf, Monique
Document access: http://hdl.handle.net/2072/273438
Language: eng
Publisher: Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada
Rights: All rights reserved
Subject: Anàlisi multivariable -- Congressos
Multivariate analysis -- Congresses
Estadística matemàtica -- Congressos
Mathematical statistics -- Congresses
Estimació, Teoria de l’ -- Congressos
Estimation theory -- Congresses
Title: Use of Survey Weights for the Analysis of Compositional Data: Some Simulation Results
Type: info:eu-repo/semantics/conferenceObject
Repository: Recercat
