Statistics for Data Science (using R)

These course notes provide an applied introduction to multivariate data analysis methods and statistical models using the R system for statistical computing. Currently, they are primarily aimed at students of the “Statistics for Data Science” course of the MSc in Data Science of the University of Girona, and they serve as a basis for more specialised courses taught later on. They build on previous materials developed by the author while delivering training courses for scientists at Biomathematics and Statistics Scotland (BioSS) and lecturing on the Multivariate Data Analysis course at the University of Edinburgh. Basic statistical knowledge and some experience working with and managing data in the R environment are assumed. The course avoids mathematical/statistical theory as much as possible and concentrates on the underlying concepts, emphasising how to put them into practice using R as the computing tool.

They are divided into two blocks:

Chapters 1-6: overview of some multivariate methods aimed at data dimension reduction, classification, and identification of similarities, associations, and patterns in data sets, with a focus on data exploration and graphical representation.

Chapters 7-12: overview of some of the families of linear, non-linear, generalised linear and additive regression models commonly used in statistical modelling, including questions related to model validation, variable selection and dealing with high dimensions.

Universitat de Girona. Departament d’Informàtica, Matemàtica aplicada i Estadística

Author: Palarea Albaladejo, Javier
Date: September 2023
Format: application/pdf
Document access: http://hdl.handle.net/10256/24695
Language: eng
Publisher: Universitat de Girona. Departament d’Informàtica, Matemàtica aplicada i Estadística
Rights: All rights reserved
Subject: Statistics -- Data processing
Multivariate analysis -- Data processing
R (Computer program language)
R (Computer program language) -- Problems, exercises, etc.
Title: Statistics for Data Science (using R)
Type: lecture notes
Repository: DUGiDocs
