SUR and Hierarchical Model

edjee · February 28, 2018, 5:51pm

Hi,
I’m trying to create a varying-slope, varying-intercept hierarchical model with twenty different groups (events in an event study where each event has 199 observations) but relax the assumption that each event is independent. Relaxing this assumption involves, I think (?), stacking the vectors and specifying a covariance matrix across events similar to a SUR model.

I’d appreciate any comments or thoughts on whether the specification below is correct as I’m sceptical of my own stan ability and each 1000 transitions take ~1000 seconds to complete suggesting I’ve violated the Folk theorem. Attached is the same code as below.

Thanks for anything you have to offer,
Ed
SUR_Hierarchical.stan (1.2 KB)

data {
  int<lower=1> N; // Number of observations 
  int<lower=0> L; // Number of events and also number of SUR equations
  vector[L] Y[N]; // Y variable, replicated L = 20 times
  real X[N]; // predictor variable
  int<lower=1, upper=L> id[N]; // id describing which event X and Y correspond to
 
}
parameters{
  vector[L] a[L]; // Intercept term
  vector[L] b[L]; // coefficient on X
  
  vector[L] mu_a; // hierarchical priors for the intercept
  vector<lower=0>[L] sigma_a;
  
  vector[L] mu_b; // hierarchical priors for the predictor
  vector<lower=0>[L] sigma_b;
  
  cholesky_factor_corr[L] L_Omega; // Covariance matrix for the events
  vector<lower=0>[L] L_sigma;
  
}
model {
  vector[L] y_hat[N];
  matrix[L, L] L_Sigma;
  
  mu_a ~ normal(0,5);
  mu_b ~ normal(0,5);
  
  sigma_a ~ normal(0,5);
  sigma_b ~ normal(0,5);
  
  for (l in 1:L){
  a[l] ~ normal(mu_a, sigma_a);
  b[l] ~ normal(mu_b, sigma_b);
  }
   
  
  for (i in 1:N){
    y_hat[i] = a[id[i]] + b[id[i]] * X[i];
  }
  
  L_Sigma = diag_pre_multiply(L_sigma, L_Omega);
  L_Omega ~ lkj_corr_cholesky(4);
  L_sigma ~ cauchy(0, 2.5);
  
  Y ~ multi_normal_cholesky(y_hat, L_Sigma);
}

[edit: escaped code]

Max_Mantei · February 28, 2018, 10:00pm

Hi! I’ll just share a few ideas, maybe they are helpful…

Did you try a non-centered parameterization for a and b?

Also, you might want to think about explicitly modeling the correlation between a and b, since slopes and intercepts should be correlated.

With regards to the sigma for the events: Would it be appropiate in your case to just model it somethink like y ~ normal(y_hat, sigma[index_L]) and then just put a hierarchical prior on sigma?

edjee · March 2, 2018, 4:01pm

Hi Max!
I’ll definitely look into non-centred parameterisation and explicitly modelling a and b - I’m not sure a hierarchical prior on sigma would fix my problem but I’ll give it some more thought.
Thanks a ton,
Ed

Bob_Carpenter · March 4, 2018, 11:39pm

As @Max_Mantei suggests, the hierarchical prior is almost always better—the only case where it’s not is where the posterior’s very well identified.

Modeling the correlations can help, but it’s expensive. @Max_Mantei’s suggestion to just use a normal avoids modeling the correlations.

You can just declare-define L_Sigma up top. The Cauchy prior on L_sigma is not very constraining. This can cause problems if there’s not much data. Tightening it to normal(0, 5) or something if appropriate would be faster.

edjee · March 5, 2018, 5:34pm

Thanks @Bob_Carpenter and @Max_Mantei,

I’ve decided to go with the hierarchical prior on sigma - very grateful for both your comments.

Topic		Replies	Views
Varying intercepts and slopes model Modeling	4	1530	October 31, 2018
Prior for regression coefficients: multivariate normal vs multiple, independent univariate normal Modeling prior-choice , hierarchical-model	6	762	April 8, 2020
My first (hierarchical) model, and it's not working Modeling fitting-issues	10	1078	July 30, 2018
Hierarchical Multivariate Normal model Modeling fitting-issues , performance	6	2145	June 12, 2019
Help with specification of hierarchical priors? Modeling	9	1375	December 4, 2018

SUR and Hierarchical Model

Related topics