1E Introduction to Multilevel Models with Applications

Paul Lambert, University of Stirling
8 - 19 July (two week course / 35 hrs) 

Course Content

Social science data often feature the ‘clustering’ or ‘hierarchical nesting’ of individual cases within larger units of analysis - for example in household surveys, where there may be several individual responses clustered within the same household. Multilevel models (also known as hierarchical random effects models) are statistical models which provide analytical tools for dealing with data of this nature. They provide a convenient means to undertake regression analysis which takes account of, and can help to summarise, patterns of clustering. As such, multilevel models are an important tool in social statistics, and are of potential relevance to almost any study using complex social data.
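
As a minimal sketch of what such a model looks like (the notation here is illustrative and is not taken from the course materials), a two-level random intercept model for individual i in household j, with a single explanatory variable x, can be written as:

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij},
\qquad u_j \sim N(0, \sigma_u^2),
\qquad e_{ij} \sim N(0, \sigma_e^2)
```

Here u_j is a household-level random effect shared by all individuals in household j, and e_{ij} is the individual-level residual; the two are assumed to be independent. Setting \sigma_u^2 = 0 recovers a conventional single-level regression.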

This course provides an applied introduction to multilevel modelling for social science datasets. It will introduce the statistical features of multilevel models, deal with approaches to handling data which has clustered or hierarchical elements, and provide training in specifying multilevel models for linear and categorical outcome variables in a variety of survey data scenarios. The course has an emphasis on the practical application of multilevel models, and will seek to convey both the attractions and limitations of a multilevel modelling approach.

Course Objectives

The course seeks to provide participants with a solid grounding in the application of multilevel models. This involves combining a strong understanding of how multilevel models are formulated in statistical terms (and their relationship to other types of statistical model), with a fluency in handling data with clustered and hierarchical features and an ability to specify multilevel models in popular statistical analysis packages.

The course will feature lab sessions with command files which illustrate handling data and specifying multilevel models in several software packages. Most often, lab examples use the Stata package, since that software features a wide range of options both for handling complex data and for specifying multilevel models. Other examples are given, however, which use SPSS (which has many facilities for handling data but a more limited range of options for multilevel modelling); R (which has a particularly wide range of analysis options, but is a relatively challenging package to use for most social science data projects); and MLwiN (a specialist package, designed explicitly for estimating multilevel models, which also provides an attractive interface for understanding the specification and properties of statistical models). Worked examples will be available in these packages, using several different, often large-scale, social survey datasets. This is an ambitious objective which seeks to provide participants with important operational skills which are not widely taught.

There are a number of benefits to studying the practical application of multilevel models. Firstly, multilevel models are important devices for exploring the character of clustered or hierarchical structures within a dataset (for example, to compare the scale of pupil-level and class-level influences in an educational study which features pupils clustered within classes). Secondly, they are often used simply to control for hierarchical structural features within data (that is, when a pattern of clustering is not substantively important, but does need to be controlled for). Finally, a thorough introduction and review of the practical implementation of multilevel models also serves as an effective means of refreshing understanding of the implementation and interpretation of statistical models in the social sciences more generally.
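
The comparison of pupil-level and class-level influences mentioned above is often summarised by the share of outcome variance that lies between classes (the intraclass correlation). The following Python sketch is an illustration only and is not course material: it simulates hypothetical pupils clustered within classes and estimates the two variance components with a balanced one-way ANOVA (method of moments) estimator, rather than by fitting a multilevel model. All function names and parameter values are assumptions made for the example.

```python
import random
from statistics import mean


def simulate_classes(n_classes, pupils_per_class, sd_class, sd_pupil, seed=1):
    """Simulate scores for pupils nested in classes (hypothetical data).

    Each class gets its own random effect; each pupil adds individual noise.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_classes):
        class_effect = rng.gauss(0.0, sd_class)
        scores = [50.0 + class_effect + rng.gauss(0.0, sd_pupil)
                  for _ in range(pupils_per_class)]
        data.append(scores)
    return data


def variance_components(data):
    """Estimate class-level and pupil-level variance.

    Uses the one-way ANOVA estimator and assumes a balanced design
    (every class has the same number of pupils, at least two).
    """
    k = len(data)        # number of classes
    n = len(data[0])     # pupils per class
    grand_mean = mean(y for group in data for y in group)
    group_means = [mean(group) for group in data]
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum((y - m) ** 2
                    for group, m in zip(data, group_means)
                    for y in group) / (k * (n - 1))
    var_pupil = ms_within
    # Truncate at zero: the moment estimator can otherwise go negative.
    var_class = max((ms_between - ms_within) / n, 0.0)
    return var_class, var_pupil


def intraclass_correlation(var_class, var_pupil):
    """Share of total variance attributable to differences between classes."""
    return var_class / (var_class + var_pupil)


if __name__ == "__main__":
    # Assumed true values: class SD 3, pupil SD 6, so the true share is 9/45 = 0.2.
    data = simulate_classes(200, 25, sd_class=3.0, sd_pupil=6.0)
    var_class, var_pupil = variance_components(data)
    print("class-level variance:", round(var_class, 2))
    print("pupil-level variance:", round(var_pupil, 2))
    print("intraclass correlation:",
          round(intraclass_correlation(var_class, var_pupil), 3))
```

A multilevel model fitted in one of the packages used on the course would estimate the same two variance components, typically by maximum likelihood, and would also allow explanatory variables and unbalanced class sizes, which this simple estimator does not handle.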

Course Prerequisites

This is an introductory course, but participants will benefit from having moderate levels of previous statistical training, and relevant experience in statistical software applications. In addition, prior to attending the course, all participants would benefit from reading a research article which uses a multilevel model, and an introductory article or chapter on the methodology (suggestions given below).

The course is suitable for participants who have received statistical training at least to the level of understanding the application of conventional regression modelling approaches (e.g. multiple regression and logistic regression), and who are fluent in popular descriptive analytical techniques and the statistical tests behind them (e.g. chi-square tests and correlation calculations). Most participants are likely to benefit from preparatory study or revision of materials which cover generating and interpreting the outputs from conventional regression analyses, such as coefficient effects and indicators of model fit (e.g. Allison 1999; Tarling 2009). The course will take conventional regression models as its starting point, and build onwards to multilevel models and other related extension topics in statistical modelling.

The course is best suited to participants with a basic level of fluency in using statistical software packages for social science data analysis. The course features lab material spanning several packages (Stata, SPSS, R and MLwiN, with Stata used most often) and using several different social science datasets. Previous exposure to the syntax languages of Stata or SPSS will be an advantage, since the practical materials involve programming in such languages. The course should be accessible to people who have little previous experience in this area, since background materials on the software packages will be made available, but students without some background in programming software using syntax should expect that extra effort will probably be required during the opening days of the course in order to follow the lab exercises. In addition, participants will be expected to read the extended software guide prepared for the course, which discusses and illustrates modes of operation of the packages involved, and which will be circulated in advance of the course.

Software used (and suggested introductory online information):
Stata (http://www.longitudinal.stir.ac.uk/Stata_support.html)
SPSS (http://www.spsstools.net/spss.htm)
MLwiN (http://www.cmm.bristol.ac.uk/MLwiN/download/manuals.shtml)
R (http://www.ats.ucla.edu/stat/r/)

We stress that not every example is available in every package. Stata will be used much more than the other packages, and coverage of R and of SPSS is quite brief by comparison.

Reading

The following text will be included in the coursepack and used throughout the course:

Hox, J.J. 2010. Multilevel Analysis: Techniques and Applications, 2nd Edition. London: Routledge.

Background Reading

Rabe-Hesketh, S. and Skrondal, A. 2008/2012. Multilevel and Longitudinal Modeling Using Stata, 2nd/3rd Edition. College Station, TX: Stata Press.
