Maximum Likelihood Models
Randy Stevenson, Rice University
4–15 August (two-week course / 35 hrs)
Randy Stevenson is a Professor of Political Science at Rice University in Houston, Texas, where he has taught the graduate statistics sequence for the past 16 years. He has published articles on coalition politics and voting behaviour in coalitional systems in the APSR, AJPS, JOP, BJPS, and many other journals. His book with Raymond Duch, The Economic Vote, won the Luebbert Award for the best book in comparative politics in 2007-2008. He is currently working on a book that explores how voters in complex coalitional systems use simple heuristics to master the complexity of those systems and cast “coalition-directed” votes.
- In this course, students will learn how to build a wide variety of statistical models by properly specifying a likelihood function appropriate to their theory and data. Next, they will learn how to estimate the unknown parameters of these models using maximum likelihood estimation, as well as how to produce measures of uncertainty (standard errors) around these estimates. They will then learn how to use the estimated parameters of a model to interpret its substantive implications - mainly by calculating substantive effects of the form “my estimates suggest that a 10% increase in an individual’s income increases their chances of turning out to vote by 12%.” Finally, students will learn how to use simulation techniques to put confidence intervals around these substantive effects (i.e., of the form “my estimates suggest that a 10% increase in an individual’s income increases their chances of turning out to vote by 12%, plus or minus 4%”). Throughout the course there will be an emphasis on how best to describe and explain the models students build and how best to communicate substantive implications to a broad academic audience.
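The workflow described above - specify a likelihood, maximize it, then simulate to get confidence intervals around a substantive effect - can be sketched in code. This is only an illustration using simulated turnout-style data and NumPy/SciPy (the course does not specify software); the variable names and data are hypothetical.

```python
import numpy as np
from scipy import optimize, special

rng = np.random.default_rng(0)

# Hypothetical simulated data: one covariate x (say, income),
# binary outcome y (turned out to vote or not)
n = 2000
x = rng.normal(0, 1, n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([-0.5, 1.0])
y = rng.binomial(1, special.expit(X @ true_beta))

def negll(beta):
    """Negative log-likelihood of the Bernoulli-logistic (logit) model."""
    eta = X @ beta
    # log-likelihood: sum over i of  y_i * eta_i - log(1 + exp(eta_i))
    return -(y * eta - np.logaddexp(0, eta)).sum()

# Maximum likelihood estimation
res = optimize.minimize(negll, np.zeros(2), method="BFGS")
beta_hat = res.x
# BFGS's approximate inverse Hessian stands in for the variance
# estimate here; a careful analysis would use the observed information
cov = res.hess_inv

# Substantive effect: change in Pr(vote) as x moves from 0 to 1
effect = special.expit(beta_hat @ [1, 1]) - special.expit(beta_hat @ [1, 0])

# Simulation-based 95% confidence interval for that effect:
# draw parameter vectors from their estimated sampling distribution
draws = rng.multivariate_normal(beta_hat, cov, 10_000)
sim = special.expit(draws @ [1, 1]) - special.expit(draws @ [1, 0])
lo, hi = np.percentile(sim, [2.5, 97.5])
print(f"effect = {effect:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

The final step mirrors the “12%, plus or minus 4%” style of statement: the interval comes from re-computing the substantive effect under many plausible parameter values rather than from a formula.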
- The foundation of building a statistical model is proper development of a likelihood function and that requires an understanding of probability distributions. Thus, we will start with a brief introduction to probability theory at a level appropriate for students with no background in probability theory.
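To make the link between a probability distribution and a likelihood concrete: for i.i.d. data, the log-likelihood is just the sum of the log of the assumed distribution evaluated at each observation. A minimal sketch with hypothetical Bernoulli data:

```python
import math

# Hypothetical data: 7 successes in 10 Bernoulli trials
data = [1, 1, 0, 1, 1, 1, 0, 1, 0, 1]

def bernoulli_loglik(p, data):
    """Log-likelihood of i.i.d. Bernoulli data at parameter p."""
    return sum(math.log(p) if yi == 1 else math.log(1 - p) for yi in data)

# The likelihood is maximized at the sample mean (here 0.7),
# so p = 0.7 fits these data better than p = 0.5
print(bernoulli_loglik(0.7, data) > bernoulli_loglik(0.5, data))  # True
```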
- The specific models we will cover are the Bernoulli-logistic model (logit), the normal-linear model (regression), ordered logit, duration models (e.g., exponential model, Weibull model), and event count models (e.g., Poisson, negative binomial). All these models are similar in one important respect: they are appropriate when the rows of one’s data can be treated as independent observations. If there is time after we have covered these models, we may discuss a number of models in which rows of data cannot be considered independent observations (e.g., conditional and multinomial logit models, compositional models, error-components or random effects models).
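Because all of the listed models treat rows as independent, each likelihood is a product over observations (a sum of logs). A small sketch with a hypothetical Poisson event-count model, where the MLE happens to have a closed form:

```python
import math

# Hypothetical event-count data (e.g., number of events per unit)
counts = [2, 0, 3, 1, 4, 2, 1, 0, 2, 3]

def poisson_loglik(lam, counts):
    """Log-likelihood of i.i.d. Poisson counts at rate lam."""
    return sum(yi * math.log(lam) - lam - math.lgamma(yi + 1)
               for yi in counts)

# For the Poisson model the MLE is simply the sample mean
lam_hat = sum(counts) / len(counts)
print(lam_hat)  # 1.8

# The log-likelihood at the MLE beats nearby candidate values
assert all(poisson_loglik(lam_hat, counts) >= poisson_loglik(l, counts)
           for l in (1.0, 1.5, 2.0, 2.5))
```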
- After finishing this course, students should be able to use a wide variety of statistical models in their own work, understand the underlying assumptions of these models, explain the ways in which the models are or are not appropriate for the theory and data at hand, and develop and interpret the substantive implications of the statistical estimates these models produce. These skills are the essential tools of the quantitative social scientist, and indeed of most government and business analysts.
- This is intended to be a first course in statistical modelling. It is specifically designed as such and requires no prior training (not even a regression course). Knowledge of basic calculus will be useful, though not strictly essential. No matrix algebra will be required. That said, statistical models are mathematical models, and so we will use a lot of basic algebra and mathematical notation in order to formalize our theoretical intuitions into mathematical (statistical) models. Students should be ready to consume and produce models presented in this way.