Instructors: Fabio Crestani
Workload: 6 ECTS
This is an applied statistics course focusing on data analysis. It begins with an overview of how to organise, perform, and write up data analyses. It then covers some of the most widely used statistical methods and concepts, such as linear regression, principal components analysis, cross-validation, and p-values. Instead of focusing on mathematical details, the lectures are designed to help you apply these techniques to real data using the R statistical programming language, interpret and visualise the results, and diagnose potential problems in your analysis. The last part of the course also covers different database options and emerging technologies such as MapReduce and Hadoop.