I’m building a hierarchical regression model with both varying intercept and varying slopes.
Stan documentation has an excellent section on “Multivariate prior for hierarchical model”. However, instead of using multivariate normal as prior, I want to use multiple, independent univariate normal so that my Stan file is simpler. Specifically,
transformed parameters {
beta0 = mu_beta0 + sigma_beta0 * beta0_raw;
beta1 = mu_beta1 + sigma_beta1 * beta1_raw;
}
model {
beta0_raw ~ normal(0, 1);
beta1_raw ~ normal(0, 1);
y ~ normal(beta0 + beta1 * x, sigma);
}
Note how beta0
and beta1
have independent, univariate normal prior (in the form of non-centered parameterization).
The later is much easier to code, especially when we also want to do non-centered parameterization.
Obviously the potential problem here is that I’m assuming that the intercept and the slopes are independent. How bad is that assumption? How to diagnose that this assumption in the prior is causing me problems?
Full model below, in case the problem I’m having is due to something else in the model other than the independent prior. It’s a hierarchical regression model with measurement error in both x and y.
/**
* Hierarchical regression with measurement error in x and y
*/
data {
int<lower=0> N; // Number of observations
int<lower=0> J; // Number of groups (region-vertical)
vector[N] x;
vector<lower=0>[N] se_x;
vector[N] y;
vector<lower=0>[N] se_y;
int<lower=1,upper=J> group[N]; // Group membership of each observation
}
parameters {
// regression parameters
vector[J] beta0_raw; // non-centered parameterization
vector[J] beta1_raw; // non-centered parameterization
real<lower=0> sigma;
// hyper-prior for the regression coefficients
real mu_beta0;
real mu_beta1;
real<lower=0> sigma_beta0;
real<lower=0> sigma_beta1;
// latent values
vector[N] x_lat_raw; // non-centered parameterization
real mu_x_lat;
real<lower=0> sigma_x_lat;
vector[N] y_lat;
}
transformed parameters {
// non-centered parameterization for regression coefficient
vector[J] beta0 = mu_beta0 + sigma_beta0 * beta0_raw;
vector[J] beta1 = mu_beta1 + sigma_beta1 * beta1_raw;
// non-centered parameterization for x_lat
vector[N] x_lat = exp(mu_x_lat + sigma_x_lat * x_lat_raw);
// vectorizing the mu_yhat calculation, which is a linear combination of x
// (note how the corresponding beta is picked based on the group membership)
vector[N] mu_y_lat;
for (i in 1:N) {
mu_y_lat[i] = beta0[group[i]] + beta1[group[i]] * x_lat[i];
}
}
model {
// Hierarchical prior on x_lat
x_lat_raw ~ normal(0., 1.);
mu_x_lat ~ normal(0., 5.);
sigma_x_lat ~ normal(0., 5.); // practically a half normal given the lower=0 constraint
// Regression
x ~ normal(x_lat, se_x);
y ~ normal(y_lat, se_y);
y_lat ~ normal(mu_y_lat, sigma);
// Hierarchical Prior on regression coefficients
beta0_raw ~ normal(0, 1);
beta1_raw ~ normal(0, 1);
mu_beta0 ~ normal(0, 1);
mu_beta1 ~ normal(0, 1);
sigma_beta0 ~ normal(0, 5);
sigma_beta1 ~ normal(0, 5);
sigma ~ normal(0., 5);
}