Info
Schedule:
- Tuesday 10.15-11.45 - Lecture, Narva mnt 18 - 1007 (or online, see Moodle for updates)
- Wednesday 10.15-11.45 - Computer Lab, Narva mnt 18 - 2004 (or online, see Moodle for updates)
Moodle page of the course: https://moodle.ut.ee/course/view.php?id=5843
Number of credits: 6 ECTS (EAP)
Course code: MTMS.01.011
Lecturer: Meelis Käärik (associate professor, Institute of Mathematics and Statistics, University of Tartu)
Target group: master's students of actuarial and financial engineering / mathematics and statistics programmes
Recommended prerequisites:
- MTMS.01.001 Mathematical Statistics I (6 ECTS)
- MTMS.01.008 Matrix Calculus for Statistics (3 ECTS)
- MTMS.01.071 Linear Models (6 ECTS)
- MTMS.01.035 Mathematical Statistics II (6 ECTS)
- MTMS.01.007 Data Analysis II (6 ECTS)
Brief description: A generalized linear model (GLM) is a flexible generalization of classical linear regression in which the distribution of the response variable is not restricted to the normal distribution. By introducing the exponential family of distributions, GLMs gather a wide range of models under one maximum likelihood estimation framework.
GLMs can be considered the class of "most advanced simple models": they go beyond the complexity of ordinary linear models but retain interpretability. Thus, they serve as a benchmark for most machine learning problems, where non-linear relations are modeled and interpretation is no longer clear or simple.
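For orientation, the framework described above can be written compactly. The block below is a sketch in one common textbook notation; the symbols (theta, phi, b, c, g) are assumptions and may differ from the notation used in the lectures.

```latex
% Exponential (dispersion) family density of the response Y
f(y;\theta,\phi) = \exp\!\left\{ \frac{y\,\theta - b(\theta)}{\phi} + c(y,\phi) \right\}

% Mean and variance follow from the function b
\mathrm{E}(Y) = \mu = b'(\theta), \qquad \mathrm{Var}(Y) = \phi\, b''(\theta)

% The link function g connects the mean to the linear predictor
g(\mu) = \eta = x^{\top}\beta, \qquad \mu = h(\eta) = g^{-1}(\eta)
```

Here g is the link function and h, its inverse, is the response function. Classical linear regression is the special case with a normal response and the identity link.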
The following topics will be covered:
- Exponential family of distributions, maximum likelihood estimation, link function, Fisher scoring
- Models for continuous responses (normal, exponential, gamma and inverse Gaussian distributions)
- Models for binomial responses and for count data (including zero-modified models)
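As an illustration of how the first and last topics fit together, here is a minimal sketch of Fisher scoring for a Poisson model with log link on simulated count data. It is not taken from the course materials: the function name `fisher_scoring_poisson`, the simulated data and the tolerance values are all assumptions made for this example.

```python
import numpy as np


def fisher_scoring_poisson(X, y, n_iter=25, tol=1e-10):
    """Fit a Poisson GLM with log link by Fisher scoring (illustrative sketch)."""
    beta = np.zeros(X.shape[1])              # start from the zero vector
    for _ in range(n_iter):
        mu = np.exp(X @ beta)                # response function: inverse of the log link
        score = X.T @ (y - mu)               # gradient of the log-likelihood
        info = X.T @ (X * mu[:, None])       # Fisher information X' diag(mu) X
        step = np.linalg.solve(info, score)  # scoring step: info^{-1} score
        beta = beta + step
        if np.max(np.abs(step)) < tol:       # stop once the update is negligible
            break
    return beta


# Simulated data (hypothetical): intercept 0.5, slope 1.0
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=500)
X = np.column_stack([np.ones_like(x), x])
true_beta = np.array([0.5, 1.0])
y = rng.poisson(np.exp(X @ true_beta))

beta_hat = fisher_scoring_poisson(X, y)
print(beta_hat)
```

Because the log link is the canonical link for the Poisson distribution, the expected and observed information coincide, so Fisher scoring is the same as Newton-Raphson in this case.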
Objectives of the course: The objective of the course is to explain the theoretical foundations of GLMs (exponential family and the distributions belonging to the family, link and response function, maximum likelihood estimation, Fisher scoring) and to give hands-on experience through solving related practical problems.
The practical part of the course covers the two main aspects of statistical modelling.
- The technical aspect: participants of the course learn how to apply a (generalized linear) model, how to choose between models, how to check the model diagnostics and what the mathematics behind the models is.
- The interpretation/business aspect: the statistical modelling process does not end with choosing the best model. One needs to be able to critically assess, interpret and communicate the results, and to draw conclusions based on the model.
Learning outcomes: A participant who passes this course
- knows the exponential family of distributions and is able to use these distributions in modelling
- is able to select models with appropriate link functions and response functions
- is able to fit models with overdispersion
- is able to find a suitable class of models to solve a given problem, estimate the parameters of the models, critically assess and interpret the results and analyze the practical implications
Final assessment: non-differentiated (pass, fail)
Requirements to be met for final assessment:
- Both tests passed. To pass a test, at least 51% of the maximum score is required.
- At least 60% needs to be obtained in the final test to pass the course. The more points one has obtained in the midterm tests, the fewer questions one needs to answer in the final test.
Recommended study materials:
- G. Tutz (2012). Regression for Categorical Data. Cambridge University Press.
- P. McCullagh, J.A. Nelder (1989). Generalized Linear Models. Chapman & Hall, London.
- A.F. Zuur, E.N. Ieno, N. Walker, A.A. Saveliev, G.M. Smith (2009). Mixed Effects Models and Extensions in Ecology with R. Springer.
- P. De Jong, G.Z. Heller (2008). Generalized Linear Models for Insurance Data. Cambridge University Press, NY.
Additional information: Meelis Käärik (meelis.kaarik@ut.ee)