Error calling sampler when fitting hierarchical model

I’m fairly new to Stan and Bayesian modelling, so apologies if the resolution is obvious.

I’m trying to estimate the following hierarchical model in Stan using the RStan interface, but I’m receiving this error when I try to run the sampler:

[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
error occurred during calling the sampler; sampling not done.

I’ve no clue how to even start resolving the issue, so I’d really appreciate a solution, or a pointer in the right direction.

data {
  int<lower=1> N; //Number of data points
  int behaviour[N];  //Outcome - QRP Own Behaviour
  //Clustering Factor
  int respondent[N]; //Vector of respondent ID
  int K; //Number of respondents
  //Measurement Level Predictors
  real prevalence[N]; //Pure measurement level variance for prevalence
  //Person Level Predictors
  real prevalence_b[N]; //Cluster means for prevalence
}
parameters {
  real alpha_raw[K]; //Intercepts 
  real beta_p_r[K]; //Respondent slopes for prevalence
  real beta_p_b; //Slope for person level prevalence 
  real<lower=0> sigma_resp[K]; //respondent SD
  real alpha_top; //alpha - pooled
  real<lower=0> alpha_sigma;
  real beta_p_top; //slope coefficient for prevalence - pooled
  real<lower=0> beta_p_sigma; //SD for prevalence - pooled
  real lambda; //
}
transformed parameters {
  real alpha[K];
  for (i in 1:K) {
    alpha[i] = alpha_sigma * alpha_raw[i] + alpha_top 
      + beta_p_b * prevalence[i];
  }
}
model{
  for(i in 1:N) {
    int a_respondent;
    a_respondent = respondent[i];
    behaviour[i] ~ lognormal(alpha[a_respondent] + beta_p_r[a_respondent]*prevalence[i], sigma_resp[a_respondent]);
  }
  //priors
  alpha_raw ~ normal(0, 1);
  beta_p_r ~ normal(beta_p_top, beta_p_sigma);
  sigma_resp ~ student_t(0,10,1);
  lambda ~ normal(0,10);
  //hyper-priors
  alpha_top ~ normal(0,2);
  beta_p_top ~ normal(0,2);
  alpha_sigma ~ normal(0,2);
  beta_p_sigma ~ normal(0,2);
  beta_p_b ~ normal(0,2);
}
generated quantities{
  real alpha_overall;
  real beta_p_overall;
  real logLikelihood[N];
  vector[N] y_rep;
  
  alpha_overall = lognormal_rng(alpha_top, alpha_sigma);
  beta_p_overall = lognormal_rng(beta_p_top, beta_p_sigma);
  for (i in 1:N) {
    int a_respondent;
    a_respondent = respondent[i];
    y_rep[i] = lognormal_rng(alpha[a_respondent] + beta_p_r[a_respondent]*prevalence[i], sigma_resp[a_respondent]);
  }
  for (i in 1:N) {
    int a_respondent;
    a_respondent = respondent[i];
    logLikelihood[i] = lognormal_lpdf(behaviour[i] | alpha[a_respondent] + beta_p_r[a_respondent]*prevalence[i], sigma_resp[a_respondent]);
  }
}

Does your outcome (behaviour) include any zero values? That would cause issues with the lognormal, which has no support at or below zero.
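
For example, a quick check in R (a sketch, assuming your data frame is called DATA, as in the R code later in the thread):

sum(DATA$behaviour <= 0, na.rm = TRUE)  # should be 0 for a lognormal outcome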

Also, in your transformed parameters:

    alpha[i] = alpha_sigma * alpha_raw[i] + alpha_top 
      + beta_p_b * prevalence[i];

Should that be beta_p_b * prevalence_b[i], since this is the between-level?


Hi @andrjohns

You’re absolutely right about the person-level variable ‘prevalence_b’; I’ve now adjusted this.

There are zeros in behaviour, but I’ve adjusted the model to use a normal likelihood, and it still returns the same error message.

Can you post your updated model?

Yes, here it is:

data {
  int<lower=1> N; //Number of data points
  int behaviour[N];  //Outcome - QRP Own Behaviour
  //Clustering Factor
  int respondent[N]; //Vector of respondent ID
  int K; //Number of respondents
  //Measurement Level Predictors
  real prevalence[N]; //Pure measurement level variance for prevalence
  //Person Level Predictors
  real prevalence_b[N]; //Cluster means for prevalence
}
parameters {
  real alpha_raw[K]; //Intercepts 
  real beta_p_r[K]; //Respondent slopes for prevalence
  real beta_p_b; //Slope for person level prevalence 
  real<lower=0> sigma_resp[K]; //respondent SD
  real alpha_top; //alpha - pooled
  real<lower=0> alpha_sigma;
  real beta_p_top; //slope coefficient for prevalence - pooled
  real<lower=0> beta_p_sigma; //SD for prevalence - pooled
  real lambda; //
}
transformed parameters {
  real alpha[K];
  for (i in 1:K) {
    alpha[i] = alpha_sigma * alpha_raw[i] + alpha_top 
      + beta_p_b * prevalence_b[i]; //incorporating person-level predictors
  }
}
model{
  //likelihood
  for(i in 1:N) {
    int a_respondent;
    a_respondent = respondent[i];
    behaviour[i] ~ normal(alpha[a_respondent] + beta_p_r[a_respondent]*prevalence[i], sigma_resp[a_respondent]);
  }
  //priors
  alpha_raw ~ normal(0, 1);
  beta_p_r ~ normal(beta_p_top, beta_p_sigma);
  sigma_resp ~ student_t(0,10,1);
  lambda ~ normal(0,10);
  //hyper-priors
  alpha_top ~ normal(0,2);
  beta_p_top ~ normal(0,2);
  alpha_sigma ~ normal(0,2);
  beta_p_sigma ~ normal(0,2);
  beta_p_b ~ normal(0,2);
}
generated quantities{
  real alpha_overall;
  real beta_p_overall;
  real logLikelihood[N];
  vector[N] y_rep;
  
  alpha_overall = normal_rng(alpha_top, alpha_sigma);
  beta_p_overall = normal_rng(beta_p_top, beta_p_sigma);
  for (i in 1:N) {
    int a_respondent;
    a_respondent = respondent[i];
    y_rep[i] = normal_rng(alpha[a_respondent] + beta_p_r[a_respondent]*prevalence[i], sigma_resp[a_respondent]);
  }
  for (i in 1:N) {
    int a_respondent;
    a_respondent = respondent[i];
    logLikelihood[i] = normal_lpdf(behaviour[i] | alpha[a_respondent] + beta_p_r[a_respondent]*prevalence[i], sigma_resp[a_respondent]);
  }
}

I’ll have a look in more depth, but something that jumps out is that you’ve defined lambda as a parameter but it’s not used anywhere. Does that change anything?

Apologies, that’s a remnant of when I was using an exp_mod_normal distribution (which worked with less complex models, e.g. models without the level-2 variable ‘prevalence_b’). Find the updated model below.

I really appreciate your help, thank you!

data {
  int<lower=1> N; //Number of data points
  int behaviour[N];  //Outcome - QRP Own Behaviour
  //Clustering Factor
  int respondent[N]; //Vector of respondent ID
  int K; //Number of respondents
  //Measurement Level Predictors
  real prevalence[N]; //Pure measurement level variance for prevalence
  //Person Level Predictors
  real prevalence_b[N]; //Cluster means for prevalence
}
parameters {
  real alpha_raw[K]; //Intercepts 
  real beta_p_r[K]; //Respondent slopes for prevalence
  real beta_p_b; //Slope for person level prevalence 
  real<lower=0> sigma_resp[K]; //respondent SD
  real alpha_top; //alpha - pooled
  real<lower=0> alpha_sigma;
  real beta_p_top; //slope coefficient for prevalence - pooled
  real<lower=0> beta_p_sigma; //SD for prevalence - pooled
}
transformed parameters {
  real alpha[K];
  for (i in 1:K) {
    alpha[i] = alpha_sigma * alpha_raw[i] + alpha_top 
      + beta_p_b * prevalence_b[i]; //incorporating person-level predictors
  }
}
model{
  //likelihood
  for(i in 1:N) {
    int a_respondent;
    a_respondent = respondent[i];
    behaviour[i] ~ normal(alpha[a_respondent] + beta_p_r[a_respondent]*prevalence[i], sigma_resp[a_respondent]);
  }
  //priors
  alpha_raw ~ normal(0, 1);
  beta_p_r ~ normal(beta_p_top, beta_p_sigma);
  sigma_resp ~ student_t(0,10,1);
  //hyper-priors
  alpha_top ~ normal(0,2);
  beta_p_top ~ normal(0,2);
  alpha_sigma ~ normal(0,2);
  beta_p_sigma ~ normal(0,2);
  beta_p_b ~ normal(0,2);
}
generated quantities{
  real alpha_overall;
  real beta_p_overall;
  real logLikelihood[N];
  vector[N] y_rep;
  
  alpha_overall = normal_rng(alpha_top, alpha_sigma);
  beta_p_overall = normal_rng(beta_p_top, beta_p_sigma);
  for (i in 1:N) {
    int a_respondent;
    a_respondent = respondent[i];
    y_rep[i] = normal_rng(alpha[a_respondent] + beta_p_r[a_respondent]*prevalence[i], sigma_resp[a_respondent]);
  }
  for (i in 1:N) {
    int a_respondent;
    a_respondent = respondent[i];
    logLikelihood[i] = normal_lpdf(behaviour[i] | alpha[a_respondent] + beta_p_r[a_respondent]*prevalence[i], sigma_resp[a_respondent]);
  }
}

So in your construction of the intercept alpha you’re looping over K with the between-level prevalence prevalence_b[i]:

transformed parameters {
  real alpha[K];
  for (i in 1:K) {
    alpha[i] = alpha_sigma * alpha_raw[i] + alpha_top 
      + beta_p_b * prevalence_b[i]; //incorporating person-level predictors
  }
}

But in the data block, it’s declared as length N:

  real prevalence_b[N]; //Cluster means for prevalence

Can you double-check the data you’re passing in for prevalence_b? I’m guessing it should be length K.

If I declare prevalence_b as length K, I receive the following error message:

"Error in mod$fit_ptr() :
Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=prevalence_b; position=0; dims declared=(4143); dims found=(37246) (in ‘modelc8355303230_Varying_Slope_Model_NoCov’ at line 20)

failed to create the sampler; sampling not done"

I want to allow the mean prevalence scores for each person (prevalence_b) to influence the outcome, behaviour. prevalence_b is of length N, so perhaps that’s my issue: do I need to adjust the transformed parameters block to reflect this?

Or perhaps I should include the person-level predictor directly in the model block?

Based on what you said, I think my issue may pertain to how I’m passing that data to Stan.

Here is my current R code:

behaviour <- DATA$behaviour
behaviour <- as.vector(behaviour)
respondent <- as.factor(DATA$respondent)
prevalence <- DATA$prevalence
prevalence_b <- DATA$prevalence_b

data_varying_slope <- list(N = length(behaviour),
                           behaviour = behaviour,
                           prevalence = prevalence,
                           prevalence_b = prevalence_b,
                           K = nlevels(respondent),
                           respondent = as.integer(respondent))

If so, I’m not sure where I’m going wrong. prevalence_b is of length N, but the same value is repeated for every row within each person (the clustering factor).

Yep, that’s going to be an issue (I don’t think it’s necessarily the main issue, but it might be contributing).

prevalence_b should be length K, where the elements are the means of each of the K groups, rather than those means repeated for the individuals within each group.
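
For example, a minimal sketch of one way to build that length-K vector in R (assuming prevalence_b and respondent are the full-length vectors from your R code above; prevalence_b_k is just an illustrative name):

# One value per respondent, ordered by factor level (matching as.integer(respondent));
# since prevalence_b is constant within a respondent, the group mean recovers that constant
prevalence_b_k <- as.numeric(tapply(prevalence_b, respondent, mean))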


I’ve estimated a similar hierarchical model without any upper-level covariates (i.e., no prevalence_b) and it works fine. I think my problem primarily pertains to the mismatch in dimensions, as you’ve highlighted, @andrjohns. I’m specifying prevalence_b as length N, when in fact it should be length K: a single value for each group (see the table below).

Respondent ID (K)  Prevalence  Behaviour  Prevalence_B
1                  3           2          2
1                  1           4          2
1                  4           1          2
2                  2           2          6
2                  5           4          6
2                  6           5          6
3                  5           2          3
3                  2           1          3
3                  1           1          3

However, when I change it to length K in the data block, I get the error message above, because the vector I’m passing still has length N: the same value is repeated in the data frame for every row within the group.
Do you know how I would communicate this in R / RStan? I’m at a bit of a loose end with it.

This is what I’ve currently got:

data_varying_slope_interaction <- list(N = length(behaviour),
                                       behaviour = as.integer(behaviour),
                                       prevalence = prevalence,
                                       K = nlevels(respondent),
                                       respondent = as.integer(respondent),
                                       trust_sci = trust_sci,
                                       field_nattech = as.integer(field_nattech),
                                       field_artshum = as.integer(field_artshum),
                                       field_medhealth = as.integer(field_medhealth),
                                       career_stage = career_stage)

I’m not sure how to adjust this to fix the dimensions issue.

You just need to pass prevalence_b as a vector with K entries, one per unique respondent.

As an example, let’s say you had 5 unique individuals and their means:

> base_means = data.frame(Respondent = 1:5, Prevalence_b = rnorm(5))
> base_means
  Respondent Prevalence_b
1          1   -0.4601391
2          2    0.1889967
3          3   -1.0508598
4          4   -1.2054561
5          5   -1.0516462

It’s these 5 means (or K means in your case) that you need to pass to Stan.

Following the structure of your dataset, if the individuals had been assessed repeatedly so that their mean scores were repeated:

> full_data = base_means[sample(1:5,10,replace=T),]
> full_data
    Respondent Prevalence_b
4            4   -1.2054561
4.1          4   -1.2054561
3            3   -1.0508598
3.1          3   -1.0508598
5            5   -1.0516462
5.1          5   -1.0516462
2            2    0.1889967
1            1   -0.4601391
5.2          5   -1.0516462
4.2          4   -1.2054561

You just need to extract the unique means, in the same order as the respondents:

> unique(full_data$Prevalence_b[order(full_data$Respondent)])
[1] -0.4601391  0.1889967 -1.0508598 -1.2054561 -1.0516462

Which you can see are the same values in the same order as the original.

So for your dataset you would run:

data_varying_slope_interaction <- list(
  ...
  prevalence_b = unique(prevalence_b[order(respondent)]),
  ...
)

That’s great, thank you! The only problem is that the dimensions are now lower than they should be. With the code you provided, unique() keeps only a single instance of each value from the whole dataset, so if two respondents (groups) have the same prevalence_b score, it only takes one of them. It doesn’t allow two respondents to have the same score.

How do I specify that I want one score for each person?

It seems a little odd for two groups to have identical means, but putting that aside, an alternative is to select the scores from rows with distinct respondent IDs:

list(
  ...
  prevalence_b = prevalence_b[!duplicated(respondent)],
  ...
)

The data is truncated (I should probably specify this in the model, actually), and there are approx. 4500 respondents (groups) with only 9 repeated measurements within each person, so there are a lot of groups where the means are identical.

Thanks for your help with that; the dimensions issue has now been fixed. Sadly, I’m still getting the following error message:

[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
error occurred during calling the sampler; sampling not done

Can you post more of the output? Everything from the stan call to the error, ideally.
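
For example, something like the following (a sketch rather than your exact call; the file name here is hypothetical):

fit <- stan(file = "varying_slope.stan",  # hypothetical file name
            data = data_varying_slope,
            chains = 1,
            verbose = TRUE)  # print intermediate output, including initialization messages

Running a single chain with verbose = TRUE should capture everything up to the initialization failure.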