3C Pooled Time-Series Cross-Sectional Analysis
Robert Walker, Willamette University, Oregon.
5 - 16 August (two-week course / 35 hrs)
Course Content
This course is designed for students who already have training in basic statistics and knowledge of linear regression analysis. It deals with the problems that arise when the time and space dimensions are combined in statistical data analysis. In particular, we will work with aggregated time-series cross-sectional data, e.g. countries observed over time. This data structure makes it possible to test highly general theories with a wide scope, but it complicates the analysis because time-series aspects (dynamics) and cross-sectional aspects (spatial correlation, unit heterogeneity) have to be considered at the same time. The course gives an overview of the problems arising from this complex data structure and provides techniques to control and account for specific complications.
We will start by discussing the characteristics and types of pooled data and the assumptions underlying basic statistical models for panel data. The course then shows how to deal with specification problems such as complex error structures, different kinds of heterogeneity (e.g. unit and slope), dynamic specification issues (lag structures), missing data, spatial heterogeneity and dependency, and time-invariant and rarely changing variables in panel data analysis with correlated unit-specific effects, among others. Furthermore, we will look at different data-generating processes and appropriate estimation procedures for, e.g., binary choice and limited dependent variable models.
The course combines a more theoretical introduction with practical analysis of diverse data sets using Stata. Students are encouraged to bring their own data sets and to present their research projects and empirical analyses during the course.
Course Objectives
The course requires knowledge of inferential statistics, (some) calculus and considerable linear algebra (matrices), and is designed to further develop the understanding of statistical problems arising from the complex structure of pooled data. It mostly deals with questions of specification and model choice and is therefore a very practical course, which should enable students to link their empirical models more closely to their theoretical arguments and to make model choices that are adequate for the data structure at hand. The course materials are designed to help participants solve their own estimation problems and increase the reliability and efficiency of their statistical results. The course is targeted at social and political scientists as well as economists who have average statistical skills and a strong interest in applied empirical research and data analysis. The focus lies on practical problems of macro panel data analysis.
Course Prerequisites
The course requires average or better skills and knowledge in inferential statistics, including a basic understanding of maximum likelihood and generalized linear estimation methods. Participants should also have an understanding of matrix algebra and, to a lesser extent, calculus, though the main focus of the course is applied. In addition, participants are required to have a basic familiarity with Stata. The course is designed to build on a good working knowledge of cross-sectional multiple regression models and basic multivariate time-series models. This includes knowledge of the underlying assumptions of basic linear models and of how to deal with violations of those assumptions (heteroskedasticity, autocorrelation). Participants should be able to interpret regression coefficients, standard errors and significance tests.
Reading
A course reader will be assembled in place of a central textbook.
Baltagi, Badi H. 2008. Econometric Analysis of Panel Data. Wiley and Sons Ltd.
Hsiao, Cheng. 2003. Analysis of Panel Data. Cambridge University Press.
