Application of Compositional Models for Glycan HILIC Data

Glycoconjugates constitute a major class of biomolecules that includes glycoproteins, glycosphingolipids and proteoglycans. The enzymatic process in which glycans (sugar chains) are linked to proteins or lipids is called glycosylation. Glycosylation is involved in many biological processes, both physiological and pathological, including host-pathogen interactions, tumour invasion, cell trafficking and signalling. Changes in glycan structure are thought to be at least partly responsible for the development of inflammation, infection, arteriosclerosis, immune defects and autoimmunity. Such changes have been observed in human diseases such as diabetes mellitus, rheumatoid arthritis and Alzheimer’s disease. Aberrant patterns of glycosylation are also a universal feature of cancer cells. The field of glycobiology thus shows great potential for the discovery of glycan biomarkers for disease diagnosis and prognosis. Here we focus specifically on N-glycans, that is, glycans attached to protein molecules via a nitrogen atom. This class of glycans is the best characterized. High-throughput hydrophilic interaction liquid chromatography (HILIC) analysis is a well-established technique for the separation and quantification of N-linked glycans released from glycoproteins. HILIC analysis quantifies the N-glycan structures in serum via a chromatogram, which is subsequently standardized and integrated. The data generated for each sample are a set of relative HILIC peak areas and are therefore compositional. To date, most statistical analyses of these glycan data fail to account for their compositional nature. We compare and contrast three compositional data models for the glycan HILIC data: the Dirichlet, Nested Dirichlet and Logistic Normal models, with the intention of providing tools for the statistical analysis of compositional data in the glycobiology field. We use these three models for classification of disease/control cases in ovarian and lung cancer diagnosis applications. We discuss and compare these models in terms of their classification performance and goodness-of-fit.

Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada

Other contributions: Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada
Author: Galligan, Marie
Campbell, Matthew P.
Saldova, Radka
Rudd, Pauline M.
Murphy, Thomas Brendan
Date: 2011 May 13
Format: application/pdf
Document access: http://hdl.handle.net/10256/13661
Language: eng
Publisher: Universitat de Girona. Departament d’Informàtica i Matemàtica Aplicada
Collection: CoDaWork 2011. The 4th International Workshop on Compositional Data Analysis
Rights: All rights reserved
Subject: Estadística matemàtica -- Congressos
Mathematical statistics -- Congresses
Anàlisi multivariable -- Congressos
Multivariate analysis -- Congresses
Glicoconjugats -- Mètodes estadístics -- Congressos
Glycoconjugates -- Statistical methods -- Congresses
Biomolècules -- Mètodes estadístics -- Congressos
Biomolecules -- Statistical methods -- Congresses
Title: Application of Compositional Models for Glycan HILIC Data
Type: info:eu-repo/semantics/conferenceObject
Repository: DUGiDocs
