The simplest varying intercepts/slopes model there is

I’m trying to find the absolute simplest/shortest Stan model (varying slopes and intercepts) to use as an example. This is the example from Ch. 1.13 in the User’s Guide, which I’ve tried to simplify even further:

data {
  int N;                   // num individuals
  int K;                   // num ind predictors
  int J;                   // num groups
  int L;                   // num group predictors
  array[N] int jj;         // group for individual
  matrix[N, K] x;          // individual predictors
  array[J] row_vector[L] u;  // group predictors
  vector[N] y;             // outcomes
}
parameters {
  corr_matrix[K] Omega;    // prior correlation
  vector<lower=0>[K] tau;  // prior scale
  matrix[L, K] gamma;      // group coeffs
  array[J] vector[K] beta;   // indiv coeffs by group
  real<lower=0> sigma;     // prediction error scale
}
model {
  array[J] row_vector[K] u_gamma;
  for (j in 1:J)
    u_gamma[j] = u[j] * gamma;
  beta ~ multi_normal(u_gamma, quad_form_diag(Omega, tau));
  for (n in 1:N)
    y[n] ~ normal(x[n] * beta[jj[n]], sigma);
}
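
In statistical notation, the two sampling statements are

beta[j] ~ MultiNormal(u[j] * gamma, Sigma)    for j in 1:J
y[n] ~ Normal(x[n] * beta[jj[n]], sigma)      for n in 1:N

where the covariance Sigma is built from the correlation matrix Omega and the vector of scales tau.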

I’ve removed the priors and most of the <lower> and <upper> checks.

Can we simplify it even further (but keep the correlation matrix)? Efficiency, validity, etc. are not the issue here; we want to use as few statements as possible.

Any input is much appreciated!

If your only goal is simplicity in terms of expressing the model, I would use a covariance matrix instead of the correlation matrix and scale vector. I can’t really think of another way to simplify this model.

Obviously - I forgot about that - thanks Maurits!

@MauritsM, but then we need a transformed parameters block? Having to introduce that here kind of defeats my aim… :)

No (if I understand correctly). My approach would use a single matrix D that replaces your Omega and tau, as well as the quad_form_diag call. Those three are only ever used to build one covariance matrix out of the individual components.
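
For context, quad_form_diag(Omega, tau) is defined as

diag_matrix(tau) * Omega * diag_matrix(tau)

i.e. the correlation matrix scaled by the standard deviations in tau. Declaring D as a cov_matrix parameter samples that covariance matrix directly, so no transformed parameters block is needed.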

Could you edit my example above and show?

This is what I had in mind

data {
  int N;                   // num individuals
  int K;                   // num ind predictors
  int J;                   // num groups
  int L;                   // num group predictors
  array[N] int jj;         // group for individual
  matrix[N, K] x;          // individual predictors
  array[J] row_vector[L] u;  // group predictors
  vector[N] y;             // outcomes
}
parameters {
  cov_matrix[K] D;    // prior covariance
  matrix[L, K] gamma;      // group coeffs
  array[J] vector[K] beta;   // indiv coeffs by group
  real<lower=0> sigma;     // prediction error scale
}
model {
  array[J] row_vector[K] u_gamma;
  for (j in 1:J)
    u_gamma[j] = u[j] * gamma;
  beta ~ multi_normal(u_gamma, D);
  for (n in 1:N)
    y[n] ~ normal(x[n] * beta[jj[n]], sigma);
}

Does this make sense? By the way, is this simplification for teaching purposes?

Oh oops, I missed the part about keeping the correlation matrix! Apologies, I don’t see any improvements in that case…

Thanks, Maurits, I think your example is clearer. Yeah, you could say it’s for teaching purposes, or at least for giving people quick access to the most common models written in Stan :) I’ll put you in the acknowledgments!

What code would you use for generating data for this model?

Thanks @torkar, although I don’t know if I deserve credit for such a small adaptation of your model :-)

I don’t really have a great example on hand that uses the full structure and all the parameters. I think that after removing all the priors it may be quite difficult to get this model to fit, even on simulated data.

When I have some more time I’ll try to simulate some fake data for this kind of model. Is R code ok for this?

The original model is from the User’s Guide. Sure, R code is great. I’ll see if I can have a look at it tomorrow too :)

Check out the make_data.r file here. I use different terminology, and use a hack to add a between-subjects variable/effect, but it’s the same model.
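
In the meantime, here is a minimal R sketch (not the make_data.r linked above) of one way to simulate data for the covariance-matrix version of the model. The dimensions and parameter values are arbitrary placeholders, and it assumes the MASS package for mvrnorm():

library(MASS)                                 # for mvrnorm()

set.seed(1)
N <- 500; K <- 2; J <- 20; L <- 2

jj <- sample(1:J, N, replace = TRUE)          # group for each individual
x  <- cbind(1, rnorm(N))                      # individual predictors (intercept + slope)
u  <- cbind(1, rnorm(J))                      # group predictors (intercept + slope)

gamma <- matrix(c(1, 0.5, -1, 0.3), L, K)     # group-level coefficients (placeholder values)
D     <- matrix(c(1, 0.3, 0.3, 0.5), K, K)    # covariance of group-level effects
sigma <- 0.5                                  # residual scale

beta <- t(sapply(1:J, function(j) mvrnorm(1, u[j, ] %*% gamma, D)))  # J x K coefficients
y    <- rnorm(N, rowSums(x * beta[jj, ]), sigma)                     # mean is x[n] * beta[jj[n]]

stan_data <- list(N = N, K = K, J = J, L = L, jj = jj, x = x, u = u, y = y)

The stan_data list is laid out to match the data block of the model above.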
