Very slow multilevel model




I am fitting a multilevel model to a collection of J curves with total N data points. I have a group level predictor grp_level_x for each of the J curves.
And I am fitting the following functional form:
y = (A*((x-B)/(100-B))^(-n))

The following stan code runs very slow and I am unable to run it long enough to see if this works. I had implemented an unpooled version before and that gave me reasonable results so now I was trying to get a multilevel model working. I am relatively new to stan so any help is appreciated. Thanks,

data {

int<lower=4> N;
  int<lower=0> J;
  vector[N] y;
  vector[N] x;
  int curveId[N];
  vector[J] grp_level_x;
transformed data{

parameters {
   vector<lower = 0.25,upper=5>[J] n;
   vector<lower=0>[J] A;
   vector[J] B;
   real<lower=0> eps;
   // B model
   real alpha_B;
   real beta_B;
   real<lower=0> eps_B;
   // n model
   real alpha_n;
   real beta_n;
   real<lower=0> eps_n;
   // A model
   real alpha_A;
   real beta_A;
   real<lower=0> eps_A;
transformed parameters {

model {
  vector[N] y_hat;
  for (i in 1:N){
    y_hat[i] = (A[curveId[i]]*((x[i] - B[curveId[i]])/(100. - B[curveId[i]]))^(-n[curveId[i]]));
  for (j in 1:J){
      B[j] ~ lognormal(alpha_B + beta_B*grp_level_x[j],eps_B);
      n[j] ~ lognormal(alpha_n + beta_n*grp_level_x[j],eps_n)T[0.25,10];
      A[j] ~ lognormal(alpha_A + beta_A*grp_level_x[j],eps_A);
  y ~ normal(y_hat,eps);


First off, add priors for eps_*, alpha_*, and beta_*. If you don’t know anything else, just do normal(0.0, approrpriate_guess_of_scale).

If this doesn’t work, and since your simple models work, what I’d do is:

  1. Just do J independent fits in one model and see if it works. Something like:
for (j in 1:J){
      B[j] ~ normal(0.0, 2.0);
      n[j] ~ normal(0.0, 2.0);
      A[j] ~ normal(0.0, 2.0);

Or whatever your prior was in the original fits. I think lognormals are kinda heavy-tailed, so if you just need a zero avoiding prior, maybe switch to a gamma?

  1. Make it hierarchical, but don’t do group level predictors
for (j in 1:J){
      B[j] ~ lognormal(alpha_B, eps_B);
      n[j] ~ lognormal(alpha_n, eps_n)T[0.25,10];
      A[j] ~ lognormal(alpha_A,eps_A);
  1. Add in the group level predictors

Maybe they need centered or something. Hopefully something in there works!

edit: fixed formatting


Thanks Ben. I had been following a similar route but as soon as I jump to hierarchical (even #2 above) the code gets stuck. If I run 4 chains, a couple of them run very fast but the remaining stay stuck for multiple hours.
One other question I have is regarding my likelihood function

y_ij ~ A_j ((x - b_j)/(100 - b_j)) ^(-n_j)

It’s not really defined for x’s smaller than b_j. For each of the J curves the minimum value of x is different.How do I specify that and is this something which might be causing stan to get stuck?



Priors on the hyperparameters didn’t help at all?

It sounds like you want to constrain all the values of B to be greater than the minimum x in each curve fit?

Vector constraints aren’t coded in Stan yet, but you can do this stuff manually. Check the “Transformations of Constrained Variables” section in the manual.

Something like this, maybe:

data {
  real max_x_in_group[J];  // Pass in the maximum x values in each group (B can't go less than this)

parameters {
  real B_unconstrained[J];

// Transform from unconstrained to constrained space
transformed parameters {
  real B[J];
  for(i in 1:J) {
    B[i] = exp(B_unconstrained[i]) + max_x_in_group[i];

model {
  // Add in Jacobian adjustment
  for(i in 1:J) {
    target += B_unconstrained[i];


No, the priors on hyperparameters didn’t help. Thanks for the transformation trick.


That’s a bummer. I had high hopes


The scale of declaration doesn’t match the truncation (5 and 10).

For speed, you should vectorize everything other than the truncation one.

Have you tried simulated data? Often problems can arise if the model’s not well specified.