3E Designing Your Own Statistical Models Using Maximum Likelihood Estimation
Jonathan Kropko, University of Virginia
8 - 19 August (two-week course / 35 hrs)
Jonathan Kropko is an assistant professor of political science at the University of Virginia. Previously, he held a position as a postdoctoral researcher at Columbia University, and he obtained his doctorate from the University of North Carolina at Chapel Hill in 2011.
His work has been published in Political Analysis and Biometrika, and he has a textbook on mathematics for the social sciences forthcoming with Sage Publishing.
His areas of study are political methodology and voting behavior in American and comparative contexts.
- This course will address any analytical problem that involves building a model to explain the variation of a limited (non-normal or non-continuous) dependent variable. The first topic will be a brief review of the topics in mathematics and statistics that are necessary for understanding a likelihood function: probability distributions, joint probability, logarithms, differentiation, optimization, and OLS regression.
- The second topic will be the construction of a generalized linear model, likelihood and log-likelihood functions, and maximization.
- The remaining topics will be applications of GLM and MLE to the analysis of limited dependent variables. We will emphasize interpreting model results beyond the sign and significance of coefficients (e.g. marginal effects and other intuitive quantities of interest). First we will cover logit, probit, and other models for binary dependent variables. Then we will cover models for ordinal variables (ordered logit, ordered probit), nominal variables (multinomial logit, conditional logit, multinomial probit), and count variables (Poisson, negative binomial, zero-inflated models). We will wrap up the course with survival models (Weibull, Cox proportional hazards).
- The course has four objectives. By the end of the course, students will be able to:
- 1. Craft a generalized linear model that is appropriate for a binary, ordinal, nominal, count, or duration dependent variable and estimate this model in Stata.
- 2. Interpret the results of standard GLMs in ways that express the magnitude of an effect, in addition to the direction and significance.
- 3. Explain the principles that underlie maximum likelihood estimation and how they relate to specific data applications.
- 4. Understand that the standard models are only examples of the use of GLM and MLE, and that these tools can be used to craft models that are custom built for a particular theoretical application.
- This course will give students a solid foundation to conduct thorough analyses using classic models, to explore more complicated GLMs on their own, and to build novel models.
- The course should be taken after a course on linear regression using OLS. For an applied researcher, this course will be an introduction to research applications that use advanced probability models.
- Before taking this course, students should be familiar with topics in probability, including density functions, hypothesis testing, and moments. They should also be comfortable with linear regression and should have no problem interpreting coefficients. A strong background in mathematics, especially calculus, will be very helpful.
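To give a flavor of the second topic in the outline above (writing down a log-likelihood and maximizing it), here is a minimal sketch for a logit model with one predictor. It is an illustration only and not course material: the course itself uses Stata, while this sketch uses plain Python, and the function names (`log_likelihood`, `fit_logit`, `average_marginal_effect`) and the ten data points are invented for the example. The maximization uses Newton-Raphson, one common choice among several.

```python
import math

# Toy data, made up for this illustration: one predictor x, binary outcome y.
XS = [-2, -1, -1, 0, 0, 0, 1, 1, 2, 2]
YS = [0, 0, 1, 0, 1, 0, 1, 0, 1, 1]


def log_likelihood(beta, xs, ys):
    """Bernoulli log-likelihood of a logit model with intercept beta[0], slope beta[1]."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = beta[0] + beta[1] * x
        # y*log(p) + (1-y)*log(1-p) simplifies to y*eta - log(1 + exp(eta)).
        total += y * eta - math.log1p(math.exp(eta))
    return total


def fit_logit(xs, ys, iterations=25):
    """Maximize the log-likelihood by Newton-Raphson; returns (intercept, slope)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # score (gradient of the log-likelihood)
        h00 = h01 = h11 = 0.0    # information matrix (negative Hessian)
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p
            g1 += (y - p) * x
            w = p * (1.0 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + information^{-1} * score (2x2 inverse by hand).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def average_marginal_effect(beta, xs):
    """Average of dP(y=1)/dx = slope * p * (1 - p) over the observed x values."""
    effects = []
    for x in xs:
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        effects.append(beta[1] * p * (1.0 - p))
    return sum(effects) / len(effects)


estimate = fit_logit(XS, YS)
print("intercept, slope:", estimate)
print("log-likelihood at estimate:", log_likelihood(estimate, XS, YS))
print("average marginal effect of x:", average_marginal_effect(estimate, XS))
```

The last function ties to the emphasis on interpretation: the slope alone gives direction, while the average marginal effect expresses the magnitude as a change in predicted probability.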
Representative Background Reading
- Messer, Anne, Joost Berkhout, and David Lowery. 2011. “The Density of the EU Interest System: A Test of the ESA Model.” British Journal of Political Science 41(1): 161–190.
- Cameron, A. Colin and Pravin K. Trivedi. 2010. Microeconometrics Using Stata. College Station, TX: Stata Press.