Item
Daunis-i-Estadella, Pepus
Martín Fernández, Josep Antoni |
|
Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada | |
Bruno, Francesca
Greco, Fedele |
|
30 May 2008 | |
This paper studies statistical methods for clustering compositional data when the
observations are trajectories of compositional data, that is, sequences of composition
measurements along a domain. Observed trajectories are known as “functional data”, and
several methods have been proposed for their analysis.
In particular, methods for clustering functional data, known as Functional Cluster
Analysis (FCA), have been applied by practitioners and scientists in many fields. To our
knowledge, FCA techniques have not yet been extended to the problem of clustering
compositional data trajectories. Such an extension requires adapting the FCA clustering
techniques by means of a suitable compositional algebra.
The present work centres on the following question: given a sample of compositional
data trajectories, how can we formulate a segmentation procedure giving homogeneous
classes? To address this problem we follow the steps described below.
First, we adapt the well-known spline smoothing techniques to the smoothing of
compositional data trajectories. An observed curve can be thought of as the sum of a
smooth part plus some noise due to measurement errors. Spline smoothing techniques are
used to isolate the smooth part of the trajectory; clustering algorithms are then
applied to these smooth curves.
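As a rough illustration of the smoothing step, the sketch below maps each composition to centred log-ratio (clr) coordinates, smooths every coordinate with a discrete roughness penalty on second differences, and maps the result back to the simplex. This penalised smoother is only a stand-in for a spline smoother and is not the authors' procedure; the function names, the use of clr coordinates, the equally spaced grid, the strictly positive parts and the value of `lam` are all assumptions made for the example.

```python
import math

def clr(x):
    """Centred log-ratio transform of one composition (parts must be > 0)."""
    logs = [math.log(v) for v in x]
    mean = sum(logs) / len(logs)
    return [l - mean for l in logs]

def clr_inv(z):
    """Inverse clr: exponentiate, then close the parts to unit sum."""
    e = [math.exp(v) for v in z]
    s = sum(e)
    return [v / s for v in e]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def smooth_series(y, lam):
    """Minimise sum (y - z)^2 + lam * sum (second differences of z)^2."""
    n = len(y)
    # Build I + lam * D'D, where D is the second-difference operator.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n - 2):
        d = {i: 1.0, i + 1: -2.0, i + 2: 1.0}
        for r, vr in d.items():
            for c, vc in d.items():
                a[r][c] += lam * vr * vc
    return solve(a, y)

def smooth_trajectory(traj, lam=10.0):
    """Smooth a trajectory (list of compositions) in clr coordinates."""
    coords = [clr(x) for x in traj]
    parts = len(traj[0])
    smoothed = [smooth_series([c[j] for c in coords], lam) for j in range(parts)]
    return [clr_inv([smoothed[j][t] for j in range(parts)])
            for t in range(len(traj))]
```

Larger values of `lam` give smoother curves. Because the smoother is linear and applied identically to every part, the smoothed clr coordinates still sum to zero at each point, so the back-transformed values are valid compositions.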
The second step consists in building suitable metrics for measuring the dissimilarity
between trajectories: we propose one metric that accounts for differences in both shape
and level, and another that accounts for differences in shape only.
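To make the distinction between the two kinds of metric concrete, here is a minimal sketch of one possible pair of dissimilarities. It is not the definition used by the authors, which the abstract does not give. It assumes both trajectories are observed on the same grid with strictly positive parts, measures level plus shape as the root mean squared Aitchison distance along the grid, and measures shape only by first removing each trajectory's mean clr level. The function names are hypothetical.

```python
import math

def clr(x):
    """Centred log-ratio transform of one composition (parts must be > 0)."""
    logs = [math.log(v) for v in x]
    mean = sum(logs) / len(logs)
    return [l - mean for l in logs]

def level_shape_distance(traj_a, traj_b):
    """Root mean squared Aitchison distance along a common grid."""
    total = 0.0
    for xa, xb in zip(traj_a, traj_b):
        ca, cb = clr(xa), clr(xb)
        total += sum((u - v) ** 2 for u, v in zip(ca, cb))
    return math.sqrt(total / len(traj_a))

def centre(traj):
    """clr coordinates with the trajectory's mean level removed per part."""
    coords = [clr(x) for x in traj]
    parts = len(coords[0])
    means = [sum(c[j] for c in coords) / len(coords) for j in range(parts)]
    return [[c[j] - means[j] for j in range(parts)] for c in coords]

def shape_distance(traj_a, traj_b):
    """Same distance after centring, so a constant shift in level is ignored."""
    ca, cb = centre(traj_a), centre(traj_b)
    total = sum((u - v) ** 2
                for ra, rb in zip(ca, cb)
                for u, v in zip(ra, rb))
    return math.sqrt(total / len(ca))
```

Under these assumptions, two trajectories that differ only by a constant perturbation (every composition multiplied part-wise by the same positive vector and re-closed) have zero shape distance but a positive level-and-shape distance.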
A simulation study is performed to evaluate the proposed methodologies, using both
hierarchical and partitional clustering algorithms. The quality of the obtained results
is assessed by means of several indices.
Geologische Vereinigung; Institut d’Estadística de Catalunya; International Association for Mathematical Geology; Càtedra Lluís Santaló d’Aplicacions de la Matemàtica; Generalitat de Catalunya, Departament d’Innovació, Universitats i Recerca; Ministerio de Educación y Ciencia; Ingenio 2010. |
|
application/pdf | |
Bruno, F.; Greco, F. ’Clustering compositional data trajectories’ in CODAWORK’08. Girona: La Universitat, 2008 [accessed: 15 May 2008]. Requires Adobe Acrobat. Available online at: http://hdl.handle.net/10256/745 | |
http://hdl.handle.net/10256/745 | |
eng | |
Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada | |
All rights reserved | |
Mathematical analysis
Simulation methods
Cluster analysis |
|
Clustering compositional data trajectories | |
info:eu-repo/semantics/conferenceObject | |
DUGiDocs |