Section 12.5 of Data Analysis Using Regression and Multilevel/Hierarchical Models (ARM) by Gelman & Hill discusses five ways to write the same model.
The final example is a large regression with correlated errors. It defines the covariance matrix of the errors as:
For any unit i: var(error_i) = var_y + var_alpha
For any units i,k within the same group j: cov(error_i, error_k) = var_alpha
For any units i,k in different groups: cov(error_i, error_k) = 0
What if these are instead adjusted to:
For any unit i: var(error_i) = var_y + var_alpha + var_base
For any units i,k within the same group j: cov(error_i, error_k) = var_alpha + var_base
For any units i,k in different groups: cov(error_i, error_k) = var_base
where var_base is some base level of covariance shared by all units, regardless of group.
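To make the adjusted structure concrete, here is a minimal NumPy sketch of the explicit covariance matrix it implies. The function name `build_cov`, the group assignment, and the variance values are hypothetical and chosen only for illustration; this is not code from ARM or from the Stan examples.

```python
import numpy as np

def build_cov(group, var_y, var_alpha, var_base):
    """Error covariance under the adjusted structure.

    group[i] is the (integer) group index j of unit i.
    """
    group = np.asarray(group)
    n = group.size
    # 1.0 where units i and k share a group (including i == k), else 0.0
    same_group = (group[:, None] == group[None, :]).astype(float)
    return (
        var_y * np.eye(n)               # unit-level variance, diagonal only
        + var_alpha * same_group        # shared within a group
        + var_base * np.ones((n, n))    # shared by every pair of units
    )

# Hypothetical example: 5 units in 2 groups, made-up variances.
group = np.array([0, 0, 1, 1, 1])
Sigma = build_cov(group, var_y=1.0, var_alpha=0.5, var_base=0.2)
print(Sigma)
```

Setting `var_base=0` recovers the original ARM structure, so the adjustment amounts to adding a constant to every entry of the matrix.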
I would like to do this in Stan without having to write the explicit covariance matrix. I made some naive attempts to adapt a model similar to the radon_no_pool.stan example model to account for it, but didn’t have much luck (divergences, maximum tree depth warnings, low Bayesian fraction of missing information (BFMI) warnings... the works). I suspect it’s because I was basically introducing a latent variable that couldn’t be identified properly.
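One way to see which latent variable is involved is to write each error as a sum of independent pieces: a single term shared by all units, a group term, and a unit term. The NumPy sketch below checks that this decomposition reproduces the adjusted covariance structure. It is only an illustration under assumed names and values (`eta`, the group assignment, and the variances are hypothetical), not the model I actually fitted.

```python
import numpy as np

# Hypothetical setup: 5 units in 2 groups, made-up variances.
group = np.array([0, 0, 1, 1, 1])
n = group.size
J = group.max() + 1
var_y, var_alpha, var_base = 1.0, 0.5, 0.2

# Latent form: error_i = eta + alpha_{j[i]} + eps_i, where
#   eta     ~ N(0, var_base)   one scalar shared by all units
#   alpha_j ~ N(0, var_alpha)  one per group
#   eps_i   ~ N(0, var_y)      one per unit
# Stack the independent pieces as z = (eta, alpha_1..alpha_J, eps_1..eps_n),
# so that error = A @ z.
A = np.hstack([
    np.ones((n, 1)),    # every unit loads on eta
    np.eye(J)[group],   # unit i loads on its own group's alpha
    np.eye(n),          # unit i loads on its own eps
])
D = np.diag(np.concatenate([
    [var_base],
    np.full(J, var_alpha),
    np.full(n, var_y),
]))
Sigma_latent = A @ D @ A.T   # covariance of a linear map of independent terms

# Explicit covariance matrix for comparison.
same_group = (group[:, None] == group[None, :]).astype(float)
Sigma_explicit = (
    var_y * np.eye(n) + var_alpha * same_group + var_base * np.ones((n, n))
)

print(np.allclose(Sigma_latent, Sigma_explicit))
```

In this form `eta` adds the same value to every unit. A single cross-section contains only one realization of it, so it shifts all observations together, much as a regression intercept would. That may be related to the identification problem I suspect.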
So I suppose the big question is: am I wasting my time thinking about this?
I’m fitting a cross-sectional model for simplicity, but the full dataset is a panel. I’m curious whether the correlation structure that is evident in the full panel can be incorporated into the cross-sectional model.