Reason for Stan error: error occurred during calling the sampler; sampling not done

Hi there! I am trying to estimate the Stan model below in RStan:

data {
  int<lower=0> N; // Number of patents
  int<lower=0> K; // Number of firms
  int<lower=0> P; // Number of covariates
  int<lower=1,upper=K> firm[N]; // Firm index for each patent
  matrix[K, P] W; // Firm-specific covariates
  vector[N] c; // Number of citations
}

parameters {
  vector[K] mu_k;
  vector<lower=0>[K] sigma_yk;
  vector[K] t_k;
  vector[N] y;
  real mu;
  real mu_t;
  vector[P] beta_mu;
  vector<lower=0>[2] theta;
  cholesky_factor_corr[2] Omega;
  real<lower=0> sigma_c;
}

model {
  vector[N] p_ik;
  
  // Priors
  mu ~ normal(0, 1000);
  mu_t ~ normal(0, 1000);
  beta_mu ~ normal(0, 1000);
  theta ~ cauchy(0, 10);
  Omega ~ lkj_corr_cholesky(3);
  sigma_yk ~ uniform(0, 1000000);
  sigma_c ~ uniform(0, 1000000);
  
  // Model equations
  for (k in 1:K) {
    vector[2] mu_tk = [mu + W[k] * beta_mu, mu_t]';
    matrix[2, 2] V = quad_form_diag(Omega, theta);
    [mu_k[k], t_k[k]]' ~ multi_normal(mu_tk, V);
  }
  
  for (i in 1:N) {
    int k = firm[i];
    y[i] ~ normal(mu_k[k], sigma_yk[k]);
    p_ik[i] = y[i] >= t_k[k] ? 1.0 : 0.0;
    target += normal_lpdf(c[i] | y[i], sigma_c) * p_ik[i];
  }
}

I am getting the following error:

[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
error occurred during calling the sampler; sampling not done

Any ideas as to why this may be happening? I am happy to provide more information as needed, but I’m guessing there is something wrong with how I’ve specified the model here. Given the vagueness of the error though, it’s been hard to figure out exactly what is wrong. Thank you in advance!

Model misspecification. See this SO answer: r - Overcome the error: Initialization failed in rstan::sampling() - Stack Overflow.
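
One candidate for the misspecification, offered as a guess rather than a confirmed diagnosis: Omega is declared as cholesky_factor_corr, but quad_form_diag expects a full correlation matrix. Passing a Cholesky factor gives a lower-triangular V that is generally not symmetric, and multi_normal rejects a non-symmetric covariance matrix, which could fail at every initial value. A minimal sketch of just the firm-level part of the model, using multi_normal_cholesky instead (the name L_V is introduced here for illustration and is not in the original model):

```stan
data {
  int<lower=0> K;   // Number of firms
  int<lower=0> P;   // Number of covariates
  matrix[K, P] W;   // Firm-specific covariates
}

parameters {
  vector[K] mu_k;
  vector[K] t_k;
  real mu;
  real mu_t;
  vector[P] beta_mu;
  vector<lower=0>[2] theta;
  cholesky_factor_corr[2] Omega;
}

model {
  // Cholesky factor of the covariance matrix: diag(theta) * Omega.
  // L_V is a hypothetical name; it does not depend on k, so it is built once.
  matrix[2, 2] L_V = diag_pre_multiply(theta, Omega);

  // Priors copied unchanged from the original model
  mu ~ normal(0, 1000);
  mu_t ~ normal(0, 1000);
  beta_mu ~ normal(0, 1000);
  theta ~ cauchy(0, 10);
  Omega ~ lkj_corr_cholesky(3);

  for (k in 1:K) {
    vector[2] mu_tk = [mu + W[k] * beta_mu, mu_t]';
    // Takes the Cholesky factor directly, so no symmetric V is needed
    [mu_k[k], t_k[k]]' ~ multi_normal_cholesky(mu_tk, L_V);
  }
}
```

If the full covariance matrix is needed elsewhere, it can be recovered as L_V * L_V'.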

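This is (at least partly) due to your uniform priors. If you specify a uniform prior for a parameter, then the parameter declaration needs to have matching bounds. Otherwise the sampler can move to values outside the prior’s support, where the log density is negative infinity.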

So for your model you need to update your declarations to:

  vector<lower=0, upper=1000000>[K] sigma_yk;
  real<lower=0, upper=1000000> sigma_c;

That said, both very wide uniform priors and normal priors with such large SDs can often (but not always) hurt both the model’s sampling and the quality of the estimates. If you still have initialisation (or other fitting) issues, I’d recommend a much smaller normal SD, chosen relative to the expected scale of the parameter and the data.
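
As a rough illustration of what tighter priors could look like, here is a sketch of the prior part of the model only. It assumes the covariates and outcome have been rescaled to roughly unit scale. The SDs of 5 and 2 are placeholders chosen for the example, not recommendations for this dataset.

```stan
data {
  int<lower=0> K;   // Number of firms
  int<lower=0> P;   // Number of covariates
}

parameters {
  real mu;
  real mu_t;
  vector[P] beta_mu;
  vector<lower=0>[K] sigma_yk;   // lower=0 matches the half-normal support
  real<lower=0> sigma_c;
}

model {
  // Placeholder scales, assuming roughly unit-scale data
  mu ~ normal(0, 5);
  mu_t ~ normal(0, 5);
  beta_mu ~ normal(0, 5);

  // With lower=0 declared, normal(0, 2) acts as a half-normal prior,
  // so the declared bounds and the prior support agree
  sigma_yk ~ normal(0, 2);
  sigma_c ~ normal(0, 2);
}
```

Because a half-normal has support on all positive values, the lower=0 declaration is enough here and no upper bound is needed.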

Thanks!

Thank you!