Issues with hierarchical linear regression stan model

Please share your Stan program and accompanying data if possible.


I’m a student relatively new to stan and would appreciate some help on my model. I’m trying to fit a model for the following regression line:
y_{ij} \sim N(\mu_j + \beta_jx_{ij}, \sigma)
\mu_{ij} \sim N(\nu_{\mu}, \tau_{\mu})
\beta_{j} \sim N(\nu_{\beta}, \tau_{\beta})
All the \nu s are distributed as N(0,1), and \tau and \sigma as gamma(1,1)
I’ve tried fitting this model but it doesn’t converge and I get an R-hat value much greater than 1

data {
  int<lower=0> N;
  int<lower=0> J;
  vector[N] y;
  vector[N] x;
  int group[N];
}
parameters {
  real<lower=0> sigma;
  real<lower=0> tau_mu;
  real<lower=0> tau_beta;
  real nu_mu;
  real nu_beta;
  vector[J] mu;
  vector[J] beta;
}
model {
  nu_mu ~ normal(0, 1);
  nu_beta ~ normal(0, 1);
  tau_mu ~ gamma(1,1);
  tau_beta ~ gamma(1,1);
  sigma ~ gamma(1,1);
  for (j in 1:J) {
    mu[j] ~ normal(nu_mu, tau_mu);
  }
  for (j in 1:J) {
    beta[j] ~ normal(nu_beta, tau_beta);
  }
  for (i in 1:N){
    y[i] ~ normal(mu[group[i]]+ beta[group[i]]*x[i], sigma);
  } 
} 

Hi Caroline,

Hierarchical models specified like this (centered parameterisation) can be difficult for the sampler, more info on that in this chapter of the User’s Guide: Stan User’s Guide

As covered in that chapter, hierarchical models will often (but not always) sample better when a non-centered parameterisation is used. This is when the parameter is specified as a function of a standard normal distribution. To specify this simply with Stan you just need to use the offset and multiplier options when declaring your parameters:

  vector<offset=nu_mu, multiplier=tau_mu>[J] mu;
  vector<offset=nu_beta, multiplier=tau_beta>[J] beta;

Additionally, rather than using normal(0,1) priors you can use the std_normal prior, which is a bit more efficient.

Also, Stan’s distributions are vectorised, so you can declare priors for entire vectors without loops, like:

  mu ~ normal(nu_mu, tau_mu);

Putting this all together, your model ends up looking like:

data {
  int<lower=0> N;
  int<lower=0> J;
  vector[N] y;
  vector[N] x;
  int group[N];
}
parameters {
  real<lower=0> sigma;
  real<lower=0> tau_mu;
  real<lower=0> tau_beta;
  real nu_mu;
  real nu_beta;
  vector<offset=nu_mu, multiplier=tau_mu>[J] mu;
  vector<offset=nu_beta, multiplier=tau_beta>[J] beta;
}
model {
  nu_mu ~ std_normal();
  nu_beta ~ std_normal();
  tau_mu ~ gamma(1,1);
  tau_beta ~ gamma(1,1);
  sigma ~ gamma(1,1);

  mu ~ normal(nu_mu, tau_mu);
  beta ~ normal(nu_beta, tau_beta);

  y ~ normal(mu[group] + beta[group] .* x, sigma);
} 
1 Like

Thank you! That clears things up a bit.