Panel data analysis--fixed or hierarchical?


#1

Hi there,

I have a very general question about how best to analyze some panel data I have, and I was hoping somebody here could give me some insights or point me to some resources I could use to understand this better.

The situation is the following. I have data in which I observe individuals over several years, and for each year I have several measures of the same construct. I have often seen data like this analyzed by simply taking the mean across the different measures per individual and year, and then running a fixed effects panel regression on those means (i.e., on the differences between each individual-year mean and that individual's mean across years). However, I think there must be better ways to fully account for the richness of the data.
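To make the "traditional" approach concrete, here is a minimal sketch of the within (fixed effects) estimator applied to individual-year means. Everything in it is an assumption for illustration: the simulated panel, the single regressor `x`, the sample sizes, and the helper name `within_estimate` are all hypothetical.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical simulated panel: 50 individuals, 6 years, 4 measures per year.
N_IND, N_YEARS, N_MEASURES = 50, 6, 4
TRUE_BETA = 0.5

rows = []  # one row per individual-year: (individual, x, mean of the measures)
for i in range(N_IND):
    alpha_i = random.gauss(0, 2)  # individual effect
    for t in range(N_YEARS):
        x = alpha_i + random.gauss(0, 1)  # regressor correlated with alpha_i
        latent = alpha_i + TRUE_BETA * x + random.gauss(0, 0.3)
        measures = [latent + random.gauss(0, 0.5) for _ in range(N_MEASURES)]
        rows.append((i, x, mean(measures)))


def within_estimate(rows):
    """Slope from regressing individually demeaned y on demeaned x."""
    by_ind = {}
    for i, x, y in rows:
        by_ind.setdefault(i, []).append((x, y))
    sxy = sxx = 0.0
    for obs in by_ind.values():
        x_bar = mean(x for x, _ in obs)
        y_bar = mean(y for _, y in obs)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx


beta_hat = within_estimate(rows)
print(round(beta_hat, 3))
```

Note that averaging the measures first throws away the information about how noisy each individual-year mean is, which is the part I would like to keep.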

One possibility may be to build a measurement error model and to treat the value for each individual and year as a latent variable. What I am unsure about is how to estimate such a model. Staying close to the traditional approach, I could estimate a mean and standard deviation for each individual and year, take the mean across years by individual, and then use the differences from that mean in the regression to obtain fixed effects. This ought to incorporate measurement uncertainty, but would otherwise remain on familiar ground.
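A rough sketch of that two-step idea, under assumptions: the tiny `panel` dictionary and the helper `summarize_cell` are made up for illustration. Each individual-year cell is summarized by its mean and the standard error of that mean, and the mean is then expressed as a deviation from the individual's across-year mean.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_cell(measures):
    """Return (mean, standard error of the mean) for one individual-year."""
    n = len(measures)
    return mean(measures), stdev(measures) / sqrt(n)


# Hypothetical data: panel[individual][year] -> list of repeated measures.
panel = {
    "a": {2019: [1.0, 2.0, 3.0], 2020: [2.0, 4.0, 6.0]},
    "b": {2019: [5.0, 5.5, 6.0], 2020: [4.0, 5.0, 6.0]},
}

summaries = {}
for ind, years in panel.items():
    cells = {yr: summarize_cell(m) for yr, m in years.items()}
    ind_mean = mean(m for m, _ in cells.values())
    # Deviation from the individual's across-year mean, carried with the cell SE.
    summaries[ind] = {yr: (m - ind_mean, se) for yr, (m, se) in cells.items()}

for ind, cells in summaries.items():
    print(ind, cells)
```

One caveat with this sketch: the standard error carried along is that of the cell mean only. The individual's across-year mean is itself estimated, so the uncertainty of the deviation is not exactly the cell standard error.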

However, would anything speak against estimating a hierarchical model in which I estimate the mean per individual (possibly hierarchically nested in an overall distribution), together with the deviations by individual across the years? I am not sure if this is done, or what the exact implications of such an approach would be (presumably shrinkage towards the mean value of the individual?). Has something like this been done, and are there any sources that discuss the pros and cons of different approaches? Any pointers would be much appreciated!
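To illustrate the kind of shrinkage I have in mind, here is a minimal sketch using the standard normal-normal partial pooling formula with variances treated as known. That is a simplifying assumption: a full hierarchical model would estimate the variances too. The function name `shrink_toward` and all numbers are hypothetical.

```python
def shrink_toward(cell_mean, n_measures, sigma2, center, tau2):
    """Posterior mean of a year-level value under a normal-normal model.

    cell_mean:  observed mean of the measures in one individual-year
    n_measures: number of measures behind that mean
    sigma2:     measurement variance of a single measure (assumed known)
    center:     the individual's overall mean (assumed known here)
    tau2:       variance of true year-level values around the center
    """
    w = tau2 / (tau2 + sigma2 / n_measures)  # weight on the observed mean
    return w * cell_mean + (1.0 - w) * center


# Same observed mean, different amounts of data behind it.
few = shrink_toward(6.0, 4, sigma2=4.0, center=4.0, tau2=1.0)
many = shrink_toward(6.0, 16, sigma2=4.0, center=4.0, tau2=1.0)
print(few, many)
```

With 4 measures the estimate lands halfway between the individual's mean and the observed year mean (5.0); with 16 measures it moves closer to the observed value (about 5.6). So noisier individual-years get pulled more strongly toward the individual's mean.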

Cheers,

Ferdinand


#2

I think this isn’t being answered because you’re asking for a hierarchical modeling textbook. You could do a lot worse than starting with Gelman and Hill’s book. They’re rewriting the first half of it to use Stan, and it should be out soon-ish (months, not weeks). But the old version answers a lot of these questions.

What you usually want to do if you have year-over-year panels is to build some kind of time series of parameters for each unit. That can look like a measurement error model, in that there’s a latent true level varying over the years, and then observations that link to the true levels of interest.
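As a rough illustration of that structure (not a full Bayesian fit), here is a sketch of a local-level model for a single unit: the latent level follows a random walk from year to year, and each year's repeated measures are noisy readings of it. It assumes the two variances `q` and `r` are known, which you would normally estimate in the model instead; the function name `filter_levels` and the data are hypothetical.

```python
from statistics import mean


def filter_levels(year_measures, q, r, m0=0.0, p0=1e6):
    """Scalar Kalman filter for a local-level model.

    year_measures: list (one entry per year) of lists of repeated measures.
    q: variance of the year-to-year change in the latent level (assumed known).
    r: measurement variance of a single measure (assumed known).
    m0, p0: prior mean and variance of the level before the first year.
    Returns a list of (filtered mean, filtered variance), one pair per year.
    """
    m, p = m0, p0
    out = []
    for measures in year_measures:
        p_pred = p + q                   # predict: level follows a random walk
        obs_var = r / len(measures)      # variance of the mean of the measures
        gain = p_pred / (p_pred + obs_var)
        m = m + gain * (mean(measures) - m)  # update toward this year's mean
        p = (1.0 - gain) * p_pred
        out.append((m, p))
    return out


# Hypothetical unit observed for three years, three measures per year.
one_unit = [[2.1, 1.8, 2.4], [2.9, 3.3, 2.6], [3.1, 3.0, 3.4]]
for year, (m, p) in enumerate(filter_levels(one_unit, q=0.25, r=0.09), start=1):
    print(year, round(m, 3), round(p, 4))
```

In a hierarchical version, the parameters of these per-unit time series (starting levels, innovation scales) would themselves get a population-level distribution, which is where the pooling across individuals comes from.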