Varying intercepts and slopes model

Hello
I suspect this is a basic error I am making but can’t see where I am going wrong. I am trying to run a hierarchical model with a covariance matrix based on Richard McElreath’s code.However I keep getting this error message:
SAMPLING FOR MODEL ‘e3ad553ffed76cee7e2a7a541857f7ba’ NOW (CHAIN 1).
Unrecoverable error evaluating the log probability at the initial value.
Exception: []: accessing element out of range. index 436 out of range; expecting index to be between 1 and 256; index position = 1a_group (in ‘model28341223245f_e3ad553ffed76cee7e2a7a541857f7ba’ at line 42)

[1] “Error in sampler$call_sampler(args_list[[i]]) : "
[2] " Exception: []: accessing element out of range. index 436 out of range; expecting index to be between 1 and 256; index position = 1a_group (in ‘model28341223245f_e3ad553ffed76cee7e2a7a541857f7ba’ at line 42)”
[1] “error occurred during calling the sampler; sampling not done”. The dataset is simply looking at the effect of age on weight in children with food allergies. My code:allergy_data.csv (25.8 KB)
My code:

model_string<-
  "data {
  int<lower=1> N;         // Number of observations
  int<lower=1> N_group;   // Number of groups
  real y[N];              // outcome
  int group[N];           // group label
  real age[N];       // predictor
}

parameters {
  vector[N_group] b_group;
  vector[N_group] a_group;
  real a;
  real b;
  vector<lower=0>[2] sigma_group;
  real<lower=0> sigma;
  corr_matrix[2] Omega;
}

transformed parameters {
  vector[2] vec_ab[N_group];
  vector[2] Mu_ab;
  for (j in 1:N_group) {
    vec_ab[j, 1] = a_group[j];
    vec_ab[j, 2] = b_group[j];
  }
  for (j in 1:2) {
    Mu_ab[1] = a;
    Mu_ab[2] = b;
  }
}

model {
  Omega ~ lkj_corr(2);                       // Prior for correlation matrix
  sigma ~ cauchy(0, 2.5);                    // Prior std dev within groups
  sigma_group ~ cauchy(0, 2.5);              // Prior std dev among intercepts and slopes
  b ~ normal(0, 10);                         // Prior for average slope
  a ~ normal(mean(y), 10);                   // Prior for average intercept
  vec_ab ~ multi_normal(Mu_ab, quad_form_diag(Omega, sigma_group));  # Population of varying effects
  {
    vector[N] mu;
    for (i in 1:N) {                         // Linear model
      mu[i] = a_group[group[i]] + b_group[group[i]] * age[i];
    }
    y ~ normal(mu, sigma);                   // Likelihood
  }
}
"
data_list<-list(N=length(allergy$weight),y=allergy$weight,age=allergy$age,N_group=length(unique(allergy$idnum)),group=allergy$idnum)

I think it may have something to do with the way I am trying to pass the data to Stan,but I can’t see the problem. It’s pointing to where it wants to sample mu for the first time.

Thanks for any advice

Regards
Chris

1 Like

As you already wrote, the problem is in how the data is prepared for Stan, not with the Stan model. It looks as if at least one group index in the variable group is larger than the size of agroup (N_group). One way this can happen is when one does not make sure that the group indices go from 1 to N_group.

Sorry Guido, I’m not following your reply(but thank you for doing so).
Regards
Chris

Hi Guido
I seemed to get this running by generating a variables from 1 to 256,1 for each child but repeated for each observation,as in:


 idnum time allergy   weight        age group_no
1   436    1       1 12.14429 -2.9534250        1
2   436    4       1 17.48723  0.1753426        1
3   436    6       1 15.48387  2.5095890        1
4   463    1       1 10.96306 -2.9945210        2
5   463    4       1 17.08026 -0.3643835        2
6   463    6       1 19.93641  2.6849310        2

I used this simple code to do this:

allergy_2<-as.data.frame(transform(allergy,group_no=as.numeric(factor(idnum)))) 

Thanks for your help.
Regards
Chris

Seeing this only now.
Indeed group_no = as.numerix(factor(my_var))
Insures that the variable group_no increase in increments of 1, which is what you need if you want to use it as an index variable.